Some users may prefer that the agent provide more accurate advice, even if this requires making recommendations more sparingly. To determine whether advice accuracy can be increased by reducing coverage, we experimented with placing a threshold on the confidence of the advice. For each of the learning methods considered here, the learner's output is a real-valued number that can be used to estimate its confidence in recommending a link, so a confidence threshold is easy to introduce in each case.
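The thresholding scheme can be sketched as follows. This is an illustrative sketch, not the system's actual code; the function and variable names (`recommend`, `scores`) are assumptions, and `scores` stands in for whatever real-valued outputs a given learner produces.

```python
# Hypothetical sketch: gating advice on learner confidence.
# `scores` maps each candidate hyperlink to the learner's real-valued
# output; names here are illustrative, not from the original system.

def recommend(scores, threshold):
    """Return the top-ranked link if its score clears the
    confidence threshold, else abstain (return None)."""
    if not scores:
        return None
    best_link = max(scores, key=scores.get)
    if scores[best_link] >= threshold:
        return best_link
    return None  # abstain: confidence too low

print(recommend({"a.html": 0.9, "b.html": 0.4}, threshold=0.5))  # a.html
print(recommend({"a.html": 0.3}, threshold=0.5))                 # None
```

Raising `threshold` makes the agent abstain more often, trading coverage for accuracy.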
Figure 7: Increasing accuracy by reducing coverage. The vertical axis indicates the fraction of test pages for which the learner's top recommendation was taken by the user. The horizontal axis indicates the fraction of test cases covered by advice as the confidence threshold is varied from high confidence (left) to low (right).
Figure 7 shows how advice accuracy varies with coverage as the confidence threshold is varied. For high values of the confidence threshold, the agent provides advice less often but usually achieves higher accuracy. Here, accuracy is measured by the fraction of test cases for which the learner's top-ranked hyperlink is the link selected by the user. Thus, the rightmost points in the plots of Figure 7 correspond exactly to the leftmost plots in Figure 6 (i.e., 100% coverage).
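The curves in such a plot can be traced by sweeping the threshold and measuring both quantities at each setting. The sketch below is an assumed reconstruction of that procedure, not the authors' evaluation code; `cases` is a hypothetical list of (top-recommendation confidence, whether the user took it) pairs.

```python
# Illustrative sketch: trace the accuracy/coverage trade-off by
# sweeping the confidence threshold. At each threshold, coverage is
# the fraction of test cases for which advice is given, and accuracy
# is measured only over those covered cases.

def accuracy_coverage_curve(cases, thresholds):
    curve = []
    for t in thresholds:
        covered = [correct for score, correct in cases if score >= t]
        coverage = len(covered) / len(cases)
        accuracy = sum(covered) / len(covered) if covered else 0.0
        curve.append((t, coverage, accuracy))
    return curve

# toy data: confident correct advice, less confident mixed advice
cases = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
for t, cov, acc in accuracy_coverage_curve(cases, [0.0, 0.5, 0.85]):
    print(f"threshold={t:.2f} coverage={cov:.2f} accuracy={acc:.2f}")
```

On this toy data, raising the threshold from 0.0 to 0.85 drops coverage from 100% to 20% while accuracy rises from 0.60 to 1.00, mirroring the qualitative shape of the curves in Figure 7.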
Notice that the accuracy of Winnow's top-ranked recommendation increases from 30% to 53% as its coverage is reduced to a more selective 10% of the cases. Interestingly, while Wordstat's advice is relatively accurate in general, its accuracy degrades drastically at higher thresholds. Two factors appear to account for this phenomenon: features that occur very infrequently in the training set yield poor probability estimates, and the inter-feature independence assumption is by no means justified by the training data.