Software assistance is already needed to deal with the growing flood of information available on the WWW. The design of WebWatcher is based on the assumption that knowledge about how to search the web can be learned by interactively assisting and watching searches performed by humans. If successful, different copies of WebWatcher could easily be attached to any web page for which a specialized search assistant would be useful. Over time, each copy could acquire expertise specialized to the types of users, information needs, and information sources commonly encountered through its page.
In the preliminary learning experiments reported here, WebWatcher was able to learn search control knowledge that approximately predicts the hyperlink selected by users, conditional on the current page, link, and goal. These experiments also showed that the accuracy of the agent's advice can be increased by allowing it to give advice only when it has high confidence. While these experimental results are positive, they are based on a small number of training sessions, searching for a particular type of information, from a specific web page. We do not yet know whether the results reported here are representative of what can be expected for other search goals, users, and web localities.
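The trade-off between accuracy and how often advice is given can be illustrated with a minimal sketch. It assumes the learned knowledge is exposed as a per-link confidence score in [0, 1]; the function names, the score format, and the example data are hypothetical and are not taken from the WebWatcher implementation or its experiments.

```python
def advise(link_scores, threshold):
    """Return the highest-scoring link, or None when confidence is too low.

    link_scores: dict mapping a hyperlink to a confidence score (assumed format).
    """
    if not link_scores:
        return None
    best = max(link_scores, key=link_scores.get)
    return best if link_scores[best] >= threshold else None


def accuracy_and_coverage(examples, threshold):
    """Measure advice quality at a given confidence threshold.

    examples: list of (link_scores, link_the_user_actually_chose) pairs.
    Returns (accuracy on the cases where advice was given, fraction of
    cases where advice was given). Accuracy is None if no advice was given.
    """
    advised = 0
    correct = 0
    for link_scores, chosen in examples:
        suggestion = advise(link_scores, threshold)
        if suggestion is None:
            continue  # the agent stays silent on low-confidence pages
        advised += 1
        if suggestion == chosen:
            correct += 1
    accuracy = correct / advised if advised else None
    coverage = advised / len(examples) if examples else 0.0
    return accuracy, coverage


# Made-up examples, for illustration only.
examples = [
    ({"a": 0.9, "b": 0.1}, "a"),
    ({"a": 0.55, "b": 0.45}, "b"),
    ({"a": 0.2, "b": 0.8}, "b"),
    ({"a": 0.6, "b": 0.4}, "a"),
]
always = accuracy_and_coverage(examples, threshold=0.0)
confident_only = accuracy_and_coverage(examples, threshold=0.7)
```

On this toy data, raising the threshold removes the low-confidence cases, so accuracy on the remaining advice rises while the fraction of pages receiving advice falls.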
Based on our initial exploration, we are optimistic that a learning apprentice for the World Wide Web is feasible. Although learned knowledge may provide only imperfect advice, even a modest reduction in the number of hyperlinks considered at each page leads to an exponential improvement in the overall search, because the reduction compounds at every page along the search path. Moreover, we believe learning can be made more effective by taking advantage of the abundant data available from many users on the web, and by considering methods beyond those reported here.
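The exponential effect can be made concrete with a small sketch. It assumes an idealized search in which every page offers the same number of hyperlinks and the target lies a fixed number of clicks away; the function names and the numbers used are illustrative assumptions, not measurements.

```python
def paths_to_depth(branching, depth):
    """Number of distinct click sequences of a given length when each
    page offers `branching` hyperlinks (uniform branching is assumed)."""
    return branching ** depth


def speedup(full_branching, reduced_branching, depth):
    """Factor by which the search space shrinks when advice narrows the
    links considered per page from full_branching to reduced_branching."""
    return paths_to_depth(full_branching, depth) / paths_to_depth(reduced_branching, depth)


# Halving the links considered per page (10 -> 5):
one_click = speedup(10, 5, 1)     # factor at depth 1
three_clicks = speedup(10, 5, 3)  # factor at depth 3
```

Under these assumptions the saving is (b / b')^d for full branching factor b, reduced branching factor b', and depth d, so it grows exponentially with the length of the search path.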
For additional information, see the WebWatcher project page:
http://www.cs.cmu.edu:8001/afs/cs.cmu.edu/project/theo-6/web-agent/www/project-home.html