Discussion

In summary, our experiment showed that the ADAPTIVE PLACE ADVISOR improved the efficiency of conversations with subjects as it gained experience with them over time, and that this improvement was due to the system's update of user models rather than to subjects learning how to interact with the system. This conclusion follows from the significant differences between the user modeling and control groups, for both number of interactions and time per conversation. This significance holds even in the face of large error bars and a small sample size, which in turn implies that the differences are large and that the system could make a substantial difference to users.

The results for effectiveness were more ambiguous, with trends in the right direction but no significant differences between the modeling and control groups. Subjects in both conditions generally liked the system, but again we found no significant differences along this dimension. A larger study may be needed to determine whether such differences occur.

Further user studies are warranted to investigate the source of the differences between the two groups. One plausible explanation is that items were presented sooner, on average, in the user modeling group than in the control group. We measured this value (i.e., the average number of interactions before the first item presentation) in the current study and found that it decreased for the user modeling group (from 4.7 to 3.9) and increased for the control group (from 4.5 to 5.8). This is a reasonably large difference, but the difference in slope between the two regression lines is not statistically significant (p=0.165). A larger study may be needed to obtain a significant difference.

In general, however, there is an interaction between the user model and the order of questions asked, which in turn influences the number of items matching at each point in the conversation. This in turn determines how soon items are presented in a conversation. Therefore, if items are presented more often in the user modeling group, then the largest influence on the user model is due to item accepts and rejects.
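For readers who want to reproduce a slope comparison of the kind mentioned above, the following is a minimal sketch, not the analysis used in the study. The text does not say which test produced p=0.165, so this example swaps in a subject-level permutation test as one simple option. All function names are hypothetical, and the sample numbers are made up for illustration; they are not the study's data.

```python
import random
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def slope_difference(points_a, points_b):
    """Slope of group A minus slope of group B; each group is a list of (x, y)."""
    xa, ya = zip(*points_a)
    xb, yb = zip(*points_b)
    return ols_slope(xa, ya) - ols_slope(xb, yb)


def pooled(subjects):
    """Flatten a list of per-subject series into one list of (x, y) points."""
    return [point for series in subjects for point in series]


def permutation_p_value(subjects_a, subjects_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in regression slopes.

    Whole subjects are shuffled between groups, so each subject's series
    stays intact. This is an assumed procedure, not the paper's.
    """
    rng = random.Random(seed)
    observed = abs(slope_difference(pooled(subjects_a), pooled(subjects_b)))
    everyone = list(subjects_a) + list(subjects_b)
    n_a = len(subjects_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(everyone)
        diff = abs(slope_difference(pooled(everyone[:n_a]), pooled(everyone[n_a:])))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up data: x = conversation number,
    # y = interactions before the first item presentation.
    modeling = [[(1, 5), (2, 4), (3, 4)], [(1, 4), (2, 4), (3, 3)]]
    control = [[(1, 4), (2, 5), (3, 6)], [(1, 5), (2, 5), (3, 6)]]
    print("slope difference:", slope_difference(pooled(modeling), pooled(control)))
    print("permutation p-value:", permutation_p_value(modeling, control))
```

Shuffling whole subjects rather than individual conversations is a design choice that respects the repeated-measures structure. With only a handful of subjects per group the number of distinct relabelings is small, which is one way to see why a larger study may be needed to reach significance.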

**Figure 5:** Hit rate for modeling and control groups.

Cindi Thompson
2004-03-29