Generating problems with a prespecified solution as described above is commonly used to study search problems. However, for each variable there are more allowable clauses where the variable is assigned its bad value than its good value. This makes highly constrained instances particularly easy since the good value for each variable can often be determined from its assigned value that appears most often in the clauses [24].
This bias in clause selection can be removed by a slight change in the generation method [53]. Specifically, instead of only avoiding those clauses that conflict with the prespecified solution, i.e., specify zero bad values, we also avoid any clauses that have an even number of bad values with respect to the prespecified solution. This selection method means both values for each variable appear equally often among the clauses. These balanced problems can have at most
clauses. Furthermore, an assignment with j bad values can have at most
conflicts, where the sum is over odd values of i. Using these values in Eq. (8) instead of and gives the maximum likelihood estimate for j in this ``balanced clause'' ensemble conditioned on the number of conflicts in the assignment.