Some users may prefer that the agent provide more accurate advice, even if this requires making recommendations more sparingly. To determine whether advice accuracy can be increased by reducing coverage, we experimented with placing a threshold on the confidence of the advice. For each of the learning methods considered here, the learner's output is a real-valued number that can be used to estimate its confidence in recommending a link, so a confidence threshold is easy to introduce in each case.
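The thresholding scheme can be sketched as follows. This is an illustrative sketch, not the system's actual code; the function and variable names (`recommend`, `scores`) are assumptions, and `scores` stands in for whatever real-valued outputs a given learner produces.

```python
# Hypothetical sketch: gating advice on learner confidence.
# `scores` maps each candidate hyperlink to the learner's real-valued
# output; names here are illustrative, not from the original system.

def recommend(scores, threshold):
    """Return the top-ranked link if its score clears the
    confidence threshold, else abstain (return None)."""
    if not scores:
        return None
    best_link = max(scores, key=scores.get)
    if scores[best_link] >= threshold:
        return best_link
    return None  # abstain: confidence too low

print(recommend({"a.html": 0.9, "b.html": 0.4}, threshold=0.5))  # a.html
print(recommend({"a.html": 0.3}, threshold=0.5))                 # None
```

Raising `threshold` makes the agent abstain more often, trading coverage for accuracy.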
Figure 7: Increasing accuracy by reducing coverage. The vertical axis indicates the fraction of test pages for which the learner's top recommendation was taken by the user. The horizontal axis indicates the fraction of test cases covered by advice as the confidence threshold is varied from high confidence (left) to low (right).
Figure 7 shows how advice accuracy varies with coverage as the confidence threshold is varied. For high values of the confidence threshold, the agent provides advice less often but usually achieves higher accuracy. Here, accuracy is measured by the fraction of test cases for which the learner's top-ranked hyperlink is the link selected by the user. Thus, the rightmost points in the plots of Figure 7 correspond exactly to the leftmost plots in Figure 6 (i.e., 100% coverage).
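The curves in such a plot can be traced by sweeping the threshold and measuring both quantities at each setting. The sketch below is an assumed reconstruction of that procedure, not the authors' evaluation code; `cases` is a hypothetical list of (top-recommendation confidence, whether the user took it) pairs.

```python
# Illustrative sketch: trace the accuracy/coverage trade-off by
# sweeping the confidence threshold. At each threshold, coverage is
# the fraction of test cases for which advice is given, and accuracy
# is measured only over those covered cases.

def accuracy_coverage_curve(cases, thresholds):
    curve = []
    for t in thresholds:
        covered = [correct for score, correct in cases if score >= t]
        coverage = len(covered) / len(cases)
        accuracy = sum(covered) / len(covered) if covered else 0.0
        curve.append((t, coverage, accuracy))
    return curve

# toy data: confident correct advice, less confident mixed advice
cases = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
for t, cov, acc in accuracy_coverage_curve(cases, [0.0, 0.5, 0.85]):
    print(f"threshold={t:.2f} coverage={cov:.2f} accuracy={acc:.2f}")
```

On this toy data, raising the threshold from 0.0 to 0.85 drops coverage from 100% to 20% while accuracy rises from 0.60 to 1.00, mirroring the qualitative shape of the curves in Figure 7.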
Notice that the accuracy of Winnow's top-ranked recommendation increases from 30% to 53% as its coverage is reduced to a more selective 10% of the cases. Interestingly, while Wordstat's advice is relatively accurate in general, its accuracy degrades drastically at higher thresholds. Two factors appear to account for this phenomenon: features that occur very infrequently in the training set yield poor probability estimates, and the inter-feature independence assumption is by no means justified by the training data.