Recently, many authors have shown that the cost of solving various kinds of combinatorial search problems follows an easy-hard-easy pattern as a function of how tightly constrained the problems are. For example, this pattern appears for graph coloring as a function of the average graph connectivity [4, 12], for propositional satisfiability (SAT) as a function of the ratio of the number of clauses to the number of variables [4, 16, 5, 8], and for constraint satisfaction problems (CSPs) as a function of the number of nogoods [21] and constraint tightness [18, 17].
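To make the SAT example concrete, the sketch below (not from this paper; the instance size, ratio grid, and cost measure are illustrative choices) generates random 3-SAT formulas at several clause-to-variable ratios and measures search cost as the number of nodes explored by a basic DPLL backtracking solver. Mean cost is typically low at small and large ratios and peaks in between.

    # Illustrative sketch of the easy-hard-easy pattern for random 3-SAT.
    # Search cost is measured as the number of DPLL nodes explored; the
    # instance size and ratio grid are arbitrary choices for illustration.
    import random

    def random_3sat(n_vars, n_clauses, rng):
        """Each clause: 3 distinct variables, each negated with probability 1/2."""
        return [tuple(v if rng.random() < 0.5 else -v
                      for v in rng.sample(range(1, n_vars + 1), 3))
                for _ in range(n_clauses)]

    def dpll(clauses, nodes):
        """Basic DPLL with unit propagation; returns True iff satisfiable."""
        nodes[0] += 1
        while True:                       # unit propagation
            units = [c[0] for c in clauses if len(c) == 1]
            if not units:
                break
            lit, new = units[0], []
            for c in clauses:
                if lit in c:
                    continue              # clause already satisfied
                c = tuple(l for l in c if l != -lit)
                if not c:
                    return False          # empty clause: dead end
                new.append(c)
            clauses = new
        if not clauses:
            return True                   # all clauses satisfied
        lit = clauses[0][0]               # branch on first remaining literal
        return any(dpll(clauses + [(choice,)], nodes) for choice in (lit, -lit))

    rng = random.Random(0)
    n = 20
    for ratio in (2.0, 3.0, 4.3, 6.0, 8.0):
        costs = []
        for _ in range(50):
            nodes = [0]
            dpll(random_3sat(n, int(ratio * n), rng), nodes)
            costs.append(nodes[0])
        print(f"m/n = {ratio:3.1f}   mean DPLL nodes = {sum(costs) / len(costs):8.1f}")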
This regularity raises the possibility of determining, prior to search, the likely difficulty of problems. Unfortunately, this is not yet possible because of the high variance in search cost among problems with the same parameter values. This difficulty is compounded by the fact that a single problem can be viewed as belonging to a variety of problem classes, each with a somewhat different transition point. Thus one important direction for improvement is to investigate whether simple additional parameters can reduce this variance and allow predictions with higher confidence.
One approach to this question builds on an explanation of the easy-hard-easy pattern as a competition, as problems become more tightly constrained, between a decrease in the number of solutions and an increase in the pruning of unproductive search paths. In particular, this explanation predicts that, for a given number of constraints, problems with many solutions tend to be easier on average than those with fewer. Thus, at least one aspect of the high variance in search cost appears to be due to the variance in the number of solutions among problems with a fixed degree of constraint. This observation has motivated the introduction of additional parameters describing problem structure based on a more precise specification of the number of solutions [11].
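As a concrete illustration of the first half of this competition (a standard first-moment calculation, not taken from this paper): for random k-SAT with n variables and m clauses, a fixed truth assignment satisfies a random clause with probability 1 - 2^{-k}, so the expected number of solutions is

    \[
      \mathrm{E}[S] \;=\; 2^{n}\bigl(1 - 2^{-k}\bigr)^{m}
      \;=\; 2^{\,n\left(1 - \frac{m}{n}\log_{2}\frac{1}{1 - 2^{-k}}\right)} .
    \]

This falls exponentially as the ratio m/n grows, crossing below one near m/n ≈ 5.19 for k = 3; the empirically observed transition for random 3-SAT lies somewhat lower, near m/n ≈ 4.3, since the first-moment calculation only gives an upper bound on the threshold.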
In this paper we investigate the generality of this explanation by examining problems for which the number of solutions is restricted, including cases where the number is specified exactly to be either zero or one. If the peak in search cost in fact arises generally from a competition between changes in the number of solutions and pruning, cases with a fixed number of solutions should not show a peak. However, we find that for some sophisticated search algorithms a peak continues to appear even in these cases, while for others it does not. This calls into question the generality of the explanation based on the number of solutions, and also suggests that a search for additional problem structure parameters aimed solely at reducing the variance in the number of solutions is unlikely to be sufficient to accurately predict search cost. Nevertheless, some structural aspect of the problems is likely to be involved. Specifically, we present data showing that the size of a problem's smallest minimal unsolvable subproblem correlates well with search cost.
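To clarify this parameter, the sketch below (an illustrative construction, not the authors' procedure) extracts one minimal unsolvable subproblem from a small unsatisfiable CNF formula by deletion: each clause is dropped in turn, and the drop is kept whenever the remainder stays unsolvable. The result is minimal in the sense that removing any further clause makes it solvable; finding the smallest such subproblem would require comparing all minimal subproblems, which this sketch does not attempt. Satisfiability is checked by brute force, so it is only feasible for tiny instances.

    # Sketch: deletion-based extraction of one minimal unsolvable
    # subproblem (minimal unsatisfiable subset) of a CNF formula.
    from itertools import product

    def satisfiable(clauses, n_vars):
        """Brute-force check: does any assignment satisfy every clause?"""
        for bits in product((False, True), repeat=n_vars):
            if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
                return True
        return False

    def minimal_unsat_core(clauses, n_vars):
        """Deletion-based shrinking to a minimal unsatisfiable subset."""
        assert not satisfiable(clauses, n_vars)
        core = list(clauses)
        i = 0
        while i < len(core):
            trial = core[:i] + core[i + 1:]
            if not satisfiable(trial, n_vars):
                core = trial       # clause i was not needed for unsolvability
            else:
                i += 1             # clause i is essential; keep it
        return core

    # Example: an unsatisfiable formula whose minimal core is smaller than
    # the whole problem (the last clause is irrelevant to the contradiction).
    cnf = [(1,), (-1, 2), (-2,), (2, 3)]
    print(minimal_unsat_core(cnf, n_vars=3))   # -> [(1,), (-1, 2), (-2,)]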
In the next section we describe some classes of search problems. We then review the pattern of search behavior and the current theoretical explanation for it. In the following section we uncover some limitations of this explanation by examining problems whose number of solutions is constrained, showing that the easy-hard-easy pattern is a more general phenomenon than current explanations suggest. We then propose an alternative explanation related to problem structure, and present data for unsolvable problems showing a positive relationship between this structure parameter, the minimum size of a minimal unsolvable subproblem, and search cost. The same parameter may also explain differences in search cost among solvable problems with equal numbers of solutions. Finally, we discuss some implications of these observations and suggest ways to obtain a better understanding and greater predictability of hard search problems.