Voice Input vs. Keyboard Accelerators:
A User Study

Randy Pausch and James H. Leatherby
Computer Science Report No. TR-91-22
October 7, 1991
This work was supported in part by the National Science Foundation, the Science Applications International Corporation, the Virginia Engineering Foundation, the Virginia Center for Innovative Technology, and the United Cerebral Palsy Foundation.

Introduction

If voice input is to be widely used, user-interface designers need to know under what conditions voice input increases the productivity of common applications. Most previous work in the area of evaluating voice input has been in "versus studies," where voice input is raced against other input modes or devices [Benbasat, Biermann, Bolt, Hauptmann, Leggett, Martin, Nye, Poock]. While this is appropriate for applications where the user's hands and/or eyes are busy, these studies provide little information about the effectiveness of combining voice input with other input modalities. Our research goal is to measure how effective voice input can be when used in conjunction with other input devices and modalities.

In a previous user study [Pausch] we demonstrated that voice used in parallel with mouse input decreased task completion time for users of the popular Macintosh application MacDraw (Claris, version 1.9.6) by 21.23 percent. Our conjecture is that we reduced task completion time by reducing the amount of mouse motion required to access menu items.

Another way to reduce menu access time is to provide keyboard accelerators, sometimes called "hot keys" or "menu accelerator keys." In our previous study, we prohibited the use of keyboard accelerator keys. This paper presents a follow-up study where we test the hypothesis that voice input is faster than using keyboard accelerators. We performed a user study which showed that a "voice plus mouse" interface to MacDraw reduced task completion time by 21.23 percent, while keyboard accelerators reduced task completion time by 14.51 percent if the accelerators were memorized and by 9.92 percent if the accelerators were not memorized.

Description of the Study

Randy Pausch and James H. Leatherby
Computer Science Department
University of Virginia
Thornton Hall
Charlottesville, VA 22903-2442
(804) 982-2211
pausch@virginia.edu

Keyboard accelerators are presumably most productive for users who have already memorized the mapping of keystrokes to application commands. We measured both cases: a "novice" group of users who had not memorized the keyboard commands, and an "advanced" group who had. The novices used a printed sheet listing the available keyboard accelerators. To invoke a command, they searched the printed sheet for the proper keyboard accelerator and then typed it on the keyboard. The advanced group memorized the seventeen keyboard accelerators before starting the experiment, which we confirmed by quizzing them. The "advanced" subjects were informed that if they forgot an accelerator, they could ask the experimenter for the key binding, although this did not happen during the study.

We used an experimental group of sixteen subjects, whom we randomly divided into the novice and advanced groups. All subjects were graduate or undergraduate students at the University of Virginia; all were familiar with mouse usage and none were expert MacDraw users. No subject who took part in our previous experiment participated in this study.

The subjects participated in two drawing sessions. In each session the subject first created a practice drawing and was then timed while creating four drawings. We used the same set of eight drawings from our previous study, so that we could compare the results. The drawings were chosen randomly from recent issues of Communications of the Association for Computing Machinery, Science, and the Journal of the American Institute of Chemical Engineers. We randomly selected drawings, instead of devising drawings specifically for the study, in order to avoid biasing the task. For each drawing, the subject started with a blank MacDraw screen and a printed copy of the artwork. The subject was allowed to study the artwork as long as desired before beginning the timed task.

The keyboard accelerators were constructed using Macro-Maker (Apple Computer Inc., version 1.0.2) for the Macintosh operating system (Apple Computer Inc., version 6.0.3). Following the standard Macintosh user-interface convention [Apple], all keyboard accelerators were invoked by holding down the "clover" key as a shift key, and then pressing a single keyboard key. The keyboard accelerators used in the study are shown in Table 1:

Some of the command names were modified from the earlier study in order to make them more mnemonic. In most cases, the letter used to activate the command is either the first letter of the command name or some letter that distinguishes the command from the others. The commands Cut, Paste, Select All, and Undo violate this convention, but were chosen to match the standard accelerators used by most Macintosh applications. Other commands that had accelerators provided by MacDraw were changed, if possible, to make them more mnemonic.

Results

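Table 2 shows combined results from the earlier study and the current study. For six of the eight drawings, "voice plus mouse" input was faster than "mouse plus accelerator key" input. We define "speedup with input method X" as the percentage reduction in task completion time relative to mouse-only input:

speedup with X = 100 * (time with mouse only - time with X) / (time with mouse only)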

The average speedup per picture was 15.24 percent when the "advanced" group was compared to mouse input, and 12.88 percent when the "novice" group was compared to mouse input. This calculation ignores the fact that the individual pictures varied widely in complexity; by counting each picture's speedup equally in the average, we bias the result towards the simpler pictures. For example, a picture whose drawing time decreased from 20 seconds to 10 seconds would have a 50 percent reduction, and a picture whose drawing time decreased from 1000 seconds to 900 seconds would have a 10 percent reduction. Computing a 30 percent average reduction for these two drawings is technically correct, but a better measure of time reduction is obtained by dividing the sums of the raw times. In this example, dividing 910 by 1020 yields 0.892 (89.2 percent), or a 10.8 percent overall reduction in task time. When we perform this calculation, we find an overall time reduction of 14.51 percent when the "advanced" group is compared to mouse input only and 9.92 percent when the "novice" group is compared to mouse input.
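
The difference between the two averaging methods can be checked with a short sketch. The Python below is illustrative only: the function names are our own, and the two (mouse-only, method X) time pairs are the hypothetical drawings from the example above, not data from the study.

```python
def percent_reduction(baseline, treated):
    """Percentage by which `treated` time falls below `baseline` time."""
    return 100.0 * (baseline - treated) / baseline


def mean_per_picture_reduction(pairs):
    """Average the per-picture reductions, weighting every picture equally."""
    return sum(percent_reduction(b, t) for b, t in pairs) / len(pairs)


def aggregate_reduction(pairs):
    """Reduction computed from summed raw times, so long pictures weigh more."""
    total_baseline = sum(b for b, _ in pairs)
    total_treated = sum(t for _, t in pairs)
    return percent_reduction(total_baseline, total_treated)


# Hypothetical drawings: (seconds with mouse only, seconds with method X)
example = [(20, 10), (1000, 900)]

print(round(mean_per_picture_reduction(example), 1))  # 30.0
print(round(aggregate_reduction(example), 1))         # 10.8
```

The per-picture mean lets the 20-second drawing count as much as the 1000-second one, whereas the aggregate figure reflects total time spent drawing.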

There were two drawings for which voice input did not yield a speed increase when compared to one of the two accelerator-key groups. In both cases, there was a relatively large amount of text in the drawings, so the typing speed of the individuals became the dominant issue.

Discussion

We believe that keyboard accelerators take longer than voice input because they require a cognitive context switch. With keyboard accelerators, the user must perform a mapping from the command name to a key binding, whether or not he or she has memorized that binding. With voice, speaking the name of the desired command does not cause the user to perform a context switch.

For the keyboard accelerators in this study, the user needed to hold the "clover" key down while pressing another key. For most keys, this was accomplished with one hand while the user kept his or her other hand on the mouse. For some accelerators, the user needed to use both hands, which required homing to the keyboard and then back to the mouse. While shifting can be avoided by using dedicated function keys, shifted keys are the standard mechanism for the Macintosh, so we used them. We also used a very small number of accelerator keys in this study. We suspect that as the number of accelerator keys grows and the keystrokes become less obvious, the savings that voice input provides will grow.

A final observation is that although we had expected memorization of the keyboard accelerators to be a large issue, it was not. The novice and advanced groups performed similarly. Although the command set contains nineteen distinct commands, only a small number of these were used frequently during the study. For the most part the novices learned these keys during the course of a single drawing.

Conclusions

Our previous study showed that when MacDraw was augmented with voice input, task completion time was reduced by 21.23 percent. In that study, the control group used keyboard and mouse, but was prohibited from using accelerator keys. This study questioned whether the speedup achieved with voice (presumably by reducing mouse travel time to menus) could also have been achieved with accelerator keys. We found that the speedup obtained via "voice plus mouse" (21.23 percent) was greater than that of "accelerator keys plus mouse," which was 14.51 percent for advanced users and 9.92 percent for novices. On the basis of this evidence, we conclude that voice input provides a significant reduction in task completion time for a graphical editor when compared to all the traditional alternatives.

References

Apple Computer Inc., Apple Inside Macintosh, Volume I, Addison Wesley Publishing Company, Inc., 1985, page 343.
Benbasat, I., Dexter, A. S. and Masulis, P. S., An Experimental Study of the Human/Computer Interface, Communications of the Association for Computing Machinery 24, 11 (November 1981), pages 752 - 762.
Biermann, A., Rodman, R., Rubin, D., and Heidlage, J., Natural Language with Discrete Speech as a Mode for Human-to-Machine Communication, Communications of the Association for Computing Machinery 28, 6 (June 1985), pages 628 - 636.
Bolt, R., Put-That-There: Voice & Gesture at the Graphics Interface, Computer Graphics 14, 3 (1980), pages 262 - 270.
Hauptmann, A., Speech and Gestures for Graphic Image Manipulation, Human Factors in Computer Systems (SIGCHI), 1989, pages 241 - 245.
Leggett, J. and Williams, G., An Empirical Investigation of Voice as an Input Modality for Computer Programming, International Journal of Man-Machine Studies 21 (1984), pages 493 - 520.
Martin, G. F., The Utility of Speech Input in User-Computer Interfaces, International Journal of Man-Machine Studies, Academic Press Limited, 1989, pages 355 - 375.
Nye, J. M., Human Factors Analysis of Speech Recognition Systems, Speech Technology 1, pages 50 - 57.
Pausch, R. and Leatherby, J. H., A Study Comparing Mouse-Only Input vs. Mouse-Plus-Voice Input for a Graphical Editor, Proceedings of the AVIOS '90 Voice I/O Systems Applications Conference, September 1990, pages 227 - 231.
Poock, G. K., Voice Recognition Boosts Command Terminal Throughput, Speech Technology 1, pages 36 - 39.