Human Computer Interaction Breakout Session
Breakout Chairman: Len Bass (ljb@sei.cmu.edu)
The HCI breakout group identified four interlocking areas of discussion:
- input devices
- output devices
- tasks for which wearables are used
- ergonomic issues
These four areas are interlocking because the task determines the setting for
the activities supported by the wearables and the setting determines the
ergonomic requirements. Also, the task determines which input and output
devices are possible. There is also coupling between the input and output
devices in the sense that certain input devices (such as a pen) dictate
particular output devices (such as a tablet).
We begin this report with an overall summary of our discussion, followed by a
detailed discussion of input and output devices. We end with several
recommendations that, if accepted, would improve the state of the wearable
computer from the HCI perspective.
1. Overall Summary
From an HCI perspective, the highest priority is to make the wearable systems
less obtrusive. That is, they need to be smaller and lighter, and both
the input and output devices need to conform to people's normal working
patterns. The need for smaller and lighter devices is well recognized
and is an overall goal for the hardware group. The need for less obtrusive
input and output devices is also recognized by the manufacturers of these
devices but there are difficult technical problems to be solved in these
cases.
The types of input and output devices that are most appropriate are driven by
the requirements of the particular task to be achieved by a wearable computer.
For example, if the computer is to be used to record information whose
possible content is not known in advance, then an input device that supports
arbitrary input is necessary such as a keyboard surrogate or a speech
recognition system. On the other hand, if the possible information content is
known (such as entering items on a checklist) then a more limited input device
such as several buttons or a dial could be acceptable.
Similarly for output, if the task requires displaying arbitrary textual
information then a limited resolution display or text to speech system would
be most appropriate. If the task requires arbitrary graphics then a higher
resolution visual output device is necessary.
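The rule of thumb in the two paragraphs above can be written down as a small
lookup. The following Python sketch is illustrative only: the function name,
its two boolean parameters and the device labels are assumptions chosen for
this example, not a classification agreed on by the group.

```python
def suggest_devices(open_ended_input: bool, needs_graphics: bool) -> dict:
    """Return candidate device classes for a task (hypothetical helper)."""
    if open_ended_input:
        # Content not known in advance: arbitrary input is required.
        inputs = ["keyboard surrogate", "speech recognition"]
    else:
        # Content known in advance (e.g. a checklist): limited input suffices.
        inputs = ["buttons", "dial"]
    if needs_graphics:
        # Arbitrary graphics call for a higher resolution visual device.
        outputs = ["higher resolution visual display"]
    else:
        # Arbitrary text can be shown or spoken.
        outputs = ["limited resolution display", "text to speech"]
    return {"input": inputs, "output": outputs}


# Example: a checklist task with text-only output.
checklist_task = suggest_devices(open_ended_input=False, needs_graphics=False)
```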
All of this discussion is conditioned on the participants' assumptions about
the tasks for which a wearable computer is being used. Some participants see
wearable computers replacing desktop computers and, hence, they would need to
be suitable for any application. Other participants see wearable computers
being very task specific and using different devices for different tasks
(the hammer and screwdriver analogy was used). In this case, the input and
output devices can be tailored and can be made much more specific for the
task.
Because the correct combination of I/O devices for particular tasks is not
yet known, an important consideration for the HCI group is the ability to
configure a wearable computer to accept a variety of different input and
output devices. This is a hardware, software and human interaction issue. It
is a hardware issue because compatible connectors are needed. It is a software
and human interaction issue because different devices may not have the same
functionality or support the same types of interaction.
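One way to picture the two sides of this issue is a device description that
records both the connector and the interactions a device supports. The Python
sketch below is an assumption made for illustration: the InputDevice class,
the can_substitute function, and the connector and capability labels are all
hypothetical and do not describe any existing product or standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InputDevice:
    """Hypothetical description of an attachable input device."""
    name: str
    connector: str           # hardware issue: the plug must match the port
    capabilities: frozenset  # software/HCI issue: interactions it supports


def can_substitute(new: InputDevice, old: InputDevice, port: str) -> bool:
    """True if `new` fits the port and supports every interaction `old` did."""
    return new.connector == port and old.capabilities <= new.capabilities


# All names and capability labels below are made up for illustration.
chord_kbd = InputDevice("chording keyboard", "serial", frozenset({"text"}))
dial = InputDevice("dial", "serial", frozenset({"select"}))
touchpad = InputDevice("touchpad", "serial", frozenset({"select", "point"}))

# The touchpad covers the dial's interactions; the dial cannot supply text.
print(can_substitute(touchpad, dial, "serial"))
print(can_substitute(dial, chord_kbd, "serial"))
```

A matching connector alone is not enough in this sketch: the replacement
must also cover the interactions the application relied on.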
We now recast the discussion we had with respect to particular input and
output devices.
2. Input Devices
The following types of input devices were mentioned as possible for some tasks
where a wearable computer would be desirable:
- speech recognizer
- keyboard alternatives including chording keyboards and special purpose
keyboards
- mouse alternatives including trackballs, joypads, joysticks
- tab alternatives including buttons, dials
- eye trackers
- head trackers
- pen
- gesturing
- bar code reader
- other exotic devices such as skin sensors
Of these, we discussed speech and chording keyboards in some detail. That was
because these are general purpose input devices that are generally available.
Some of the other input devices (mouse and tab alternatives, for example) are
generally available but are limited purpose, and their use would be conditional
on having tasks for which they are suitable. Other input devices (eye
and head trackers, for example) are not yet generally available. For the
remainder of this section on input devices we enumerate the
positives and negatives for speech and for chording keyboards.
2.1 Speech
Speech, if perfected, would be an intuitively appealing input modality. Its
positives are that it allows totally hands free input and that commands can be
couched in a manner that is easy to learn. The user is in control of the
interaction and has a high branching factor, that is, many possible directions
in which to take the interaction at any point.
Since speech is such an appealing input modality, we will focus on its
negatives. We divide these into two categories: those intrinsic to speech and
those that are a function of our current speech recognition technology.
Conceptual problems
There are three conceptual problems with speech:
- determining what utterances are intended to direct the computer and
which are intended for a colleague or co-worker,
- prompting users who need assistance to recall the appropriate
responses in any particular situation, and
- specifying a position in a two dimensional space.
Of these, the first is potentially the most troublesome. The two techniques for
determining the focus of attention of a speaking user are "press to talk" and
"bracket" words. Press to talk means having a special button to push that
indicates speech to the computer is about to begin. This simplifies the task
of the speech recognizer but it negates the hands free advantage of speech.
The user has to use a hand to push the talk button and now speech is on a par
with other input devices. Bracketing is the use of special words such as
"computer" to indicate the following utterance is directed at the computer.
The use of bracket words is somewhat unnatural to users and, in any case, takes
some user training. That is, one solution removes the hands free advantage of
speech and the other diminishes the easy to learn advantage.
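The two techniques can be sketched as a filter applied to words that have
already been recognized. This Python fragment is a minimal illustration under
that assumption; the function name and the talk_button_down parameter are
hypothetical, and "computer" is simply the example bracket word used above.

```python
from typing import Optional

BRACKET_WORD = "computer"  # the example bracket word used in the text


def command_from_utterance(words: list, talk_button_down: bool = False) -> Optional[list]:
    """Return the words meant for the computer, or None if the utterance
    appears to be addressed to someone else."""
    if talk_button_down:
        # Press to talk: everything spoken while the button is held is a command.
        return words
    if words and words[0].lower() == BRACKET_WORD:
        # Bracketing: strip the bracket word and keep the rest.
        return words[1:]
    # No button and no bracket word: assume the user is talking to a colleague.
    return None
```

For example, the utterance "computer next item" yields the command
"next item", while "hand me the wrench" is ignored unless the talk button is
held down.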
The fact that speech is not good at specifying a position in a two dimensional
space such as a map can be compensated for by having multiple specialized
input devices: a gesturing or pointing input device for position specification
and speech for the remainder of the input. However, changing modalities to
accomplish a particular input task may not be that easy for users to learn.
Problems of current technology
The other problems associated with speech recognition are not intrinsic and
can be expected to be removed as the technology improves. These are:
- quality of recognition/feedback requirements
- speed of recognition
- grammatical incorrectness
- speakers with various speech impediments
- ambient noise
- configuration restrictions (microphone/language model)
We now discuss these problems.
Quality of recognition
The quality of recognition is a function of the size of the vocabulary, the
acoustic characteristics of the environment and the microphone, and the
quality of the recognition algorithms. A recognition rate of 90% still means
that one word in ten is incorrectly recognized. Current speech systems
operating in ideal circumstances have a recognition rate of over 95% (fewer
than one error in twenty words).
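To see what these per-word rates mean for whole commands, the arithmetic can
be sketched in a few lines of Python. The sketch assumes that errors on
different words are independent, which is a simplification made here and not
a claim from the discussion; the function names are hypothetical.

```python
def error_free_probability(word_accuracy: float, n_words: int) -> float:
    """Chance that all n_words are recognized, assuming independent errors."""
    return word_accuracy ** n_words


def expected_errors(word_accuracy: float, n_words: int) -> float:
    """Average number of misrecognized words in an utterance of n_words."""
    return (1.0 - word_accuracy) * n_words


for accuracy in (0.90, 0.95):
    p = error_free_probability(accuracy, 5)
    print(f"{accuracy:.0%} per word -> {p:.0%} of five-word commands error free")
```

Under that assumption, a 90% word rate leaves only about 59% of five-word
commands free of errors, and a 95% rate about 77%.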
Furthermore, some words acoustically are very close to each other and some
sounds are difficult to recognize. Words that begin with soft sounds such as
m, for example, are more difficult to recognize than words beginning with hard
sounds. This leads to vocabulary tuning (choosing a vocabulary of words that
are easily recognizable and acoustically distinct from each other).
Because of the error possibilities, some feedback mechanism must be given to
the user, for example presenting the utterances textually on a screen or
repeating them through a headset. Thus, the use of speech dictates some type
of output device that might not otherwise be required. Furthermore, once the
user recognizes an error, there must be some mechanism for correcting the
error. If this mechanism is the use of an alternative input device, why bother
with speech? If this mechanism is to repeat the utterance, the user may get
very frustrated if the system does not recognize the utterance soon.
Speed of recognition
The speed of recognition is a function of the speed of the main processor on
the wearable computer. A software only solution seems to require a 125 MHz
processor. Wearable computers are approaching this speed. The alternative is
to have some of the recognition done in a specialized processor packaged in a
PCMCIA card. In this case, there are additional possibilities for electronic
interference. Such cards are still in the beta stage and there is, as yet, no
large body of experience with them.
Grammatical incorrectness
Speech recognition systems are based on providing a grammar for the utterances
to be recognized. These systems are not very tolerant of incorrect grammar and
"uhs" and interjections. The users must be trained to speak in a fairly
constrained fashion. Again, this tends to negate the low training nominally
required of a speech system.
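The effect of a strict grammar can be illustrated with a toy example. The
Python sketch below checks already-recognized text against a two-word command
grammar; the grammar and both word lists are invented for illustration and
are not taken from any particular recognizer.

```python
# A toy command grammar: VERB followed by OBJECT. Both word lists are invented.
VERBS = {"show", "record", "skip"}
OBJECTS = {"checklist", "map", "item"}


def accepts(utterance: str) -> bool:
    """True only if the utterance matches the grammar exactly."""
    words = utterance.lower().split()
    return len(words) == 2 and words[0] in VERBS and words[1] in OBJECTS
```

Here "show map" is accepted, but "uh show map" and "show me the map" are both
rejected, which is why users must learn to speak in the constrained form.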
Speakers with various speech impediments
Speakers who have speech impediments such as prolonged stuttering, or
difficulty in speaking due to physical impairment, are a population that speech
recognition systems would seemingly have a great deal of difficulty
recognizing. None of the attendees at the workshop knew of any research in
this area.
Ambient noise
Loud ambient noise, whether persistent or intermittent, may degrade speech
recognition systems. Filtering techniques exist to screen out such noise, and
noise suppression microphones exist, but the results of speech recognition
systems are worse in such environments. The question is whether the degradation is
large or small.
Configuration restrictions (microphone/language model)
Speech recognition systems are tuned for a particular set of microphones and
for particular language models. Gender differences, for example, cause
different language models to be used. These restrictions do not prevent the
use of speech recognition but make it more expensive or more restrictive than
it might otherwise be.
2.2 Keyboard alternatives
A keyboard is attractive as an input device because it allows a full range of
textual input. A normal keyboard is unattractive in a wearable context because
of its size and cumbersomeness of use. The keyboard has to be worn somewhere
and then it has to be positioned for input. This conflict has given rise to
alternative keyboard devices. The Twiddler is a one handed "chorded" keyboard
that has been commercially available for quite some time. A chording keyboard
is one where combinations of keys are punched to indicate particular letters.
We identified the following considerations associated with chording keyboards.
Positives
On the positive side a chording keyboard uses only one hand for input and
requires no surface to mount it on (it can be held in the hand). It also has
reasonable speed (50 words per minute is achievable), is inexpensive, requires
little power and bandwidth, and is compatible with existing software.
Negatives
On the negative side the one handed requirement for input means that it could
not be used for applications where the user must have both hands totally free
at all times. There is a learning curve for the device and it is only suitable
for textual input. There is no pointing capability inherent in the device.
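The chording idea described in this section can be sketched as a lookup from
key combinations to letters. The chord table in this Python fragment is
hypothetical and is NOT the Twiddler's actual layout; the key numbers and the
decode function are assumptions made for the example.

```python
# Hypothetical chord table for a four-key device; a chord is the set of keys
# held down together. This is NOT the Twiddler's actual layout.
CHORDS = {
    frozenset({1}): "e",
    frozenset({2}): "t",
    frozenset({1, 2}): "a",
    frozenset({1, 3}): "o",
    frozenset({2, 3, 4}): "n",
}


def decode(chords) -> str:
    """Translate a sequence of chords into text, '?' for unknown chords."""
    return "".join(CHORDS.get(frozenset(c), "?") for c in chords)
```

With n keys there are 2^n - 1 possible non-empty chords, which is why a
device with only a handful of keys can cover a full character set.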
2.3 Pointing devices
Both of the input devices we have discussed thus far, speech and a chording
keyboard, have no ability to do pointing. As we alluded to in the speech
discussion, the ability to point to a position on a screen is important for
all direct manipulation interfaces and, more importantly for wearable use, for
all applications where there is a figure of interest or a map on the screen.
These devices can be a joystick, joypad, or touchpad, together with one
or more selection buttons.
Positives
Pointing devices are intuitive, allow random access and positional input and
are compatible with desktop interfaces. They are widely available and could
provide a virtual keyboard by having a representation of a keyboard on the
screen and pointing to the various keys desired.
Negatives
The interfaces that currently utilize pointing devices are resource intensive.
They are inexact for precise coordinate specification and they are slow when
used to provide a virtual keyboard.
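The virtual keyboard mentioned above amounts to mapping a pointer position to
the key drawn at that position. The Python sketch below assumes a simple grid
of fixed-size keys; the layout, the key dimensions and the key_at function
are assumptions for illustration only.

```python
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]  # on-screen layout (assumed)
KEY_W, KEY_H = 20, 20                          # key size in pixels (assumed)


def key_at(x: int, y: int):
    """Return the key under the pointer at pixel (x, y), or None if no key is there."""
    if x < 0 or y < 0:
        return None
    row, col = y // KEY_H, x // KEY_W
    if row < len(ROWS) and col < len(ROWS[row]):
        return ROWS[row][col]
    return None
```

Every character entered this way costs one pointing movement plus one
selection, which is consistent with the observation that pointing devices are
slow when used as a virtual keyboard.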
2.4 Other input devices
The other input devices enumerated above were not subject to a detailed
discussion and so we don't provide any discussion of them here. The sense of
the group was that these devices are less available, less mature, or less
useful than the ones that were discussed in some detail.
3. Output devices
The most appropriate output device to be used with wearable computers again
depends on the task to be performed but a wide variety of possible output
devices were mentioned by the attendees at the breakout. These were: head
mounted displays (HMDs), flat panels, text to speech, tactile output, non
speech auditory output, paper and olfactory output (scent). Of these we
discussed in more detail HMDs and flat panels.
3.1 Head Mounted Displays
Head mounted displays provide a visual output that can be used without
involving the hands. Thus, HMDs can be used in those tasks that require two
hands. Furthermore, in the future it will be possible to align the HMD output
with that of the real world and provide computer augmentation of reality.
Even without the augmented reality aspects, HMDs are visual output
devices of reasonable resolution (currently VGA), they are always accessible
and they can be totally private. HMDs can be "see through" or occluded. Those
that are see through can be read by someone other than the wearer, although
with some difficulty, but those that are occluded cannot. Thus an HMD allows
for private output.
Some of the problems with HMDs are ergonomic and some have to do with their
capabilities. The ergonomic problems are headed by the resistance of some
users to wearing such an ungainly device. This problem may disappear over time
or may be alleviated if the user is performing an important task. In this
case, the wearing of an HMD may be seen as an emblem of importance. In any
case, there may be social resistance to the use of these devices.
Other ergonomic problems are the weight, comfort, glare and safety aspects of
the devices. Some devices are not effective in bright sunlight.
Cost is another problem. Currently VGA quality HMDs may cost $3000 or more.
The technical problems are lack of resolution (for some applications VGA may
not be sufficient resolution) and the fact that current displays are
monochrome. These technical problems will likely disappear in the next year or
two (although their disappearance may increase the cost of the devices).
3.2 Flat Panel Displays
Flat panel displays have the advantages of being relatively cheap, plentiful,
available in full color, sharable if there are multiple people who wish to
look at a display and useful for input.
On the other hand, they require the hands to hold them for output and a storage
place when they are not being held. Glare is a problem, as are weight,
resolution and size.
4. Ergonomic considerations
The ergonomic considerations for using wearable computers partially have to do
with the task for which they are to be used and partially to do with the fact
they are wearable. The considerations that arise simply because the computers
are wearable are size, weight, comfort, and cables. That is, they should be
small (although there is a minimum useful size for input devices), lightweight,
usable in heat and cold, and have a minimum number of cables. Cables are bad
because they get snagged on obstructions in the environment. The computers
should also be easy to get on and off.
Some of the task dependent issues have to do with the amount of mobility
required and the position of the user. The computers should be usable while
standing, moving, and lying down. The position of the computer on the body
should be variable depending on the position of the user.
Safety is an issue when using HMDs, as is sharing of output.
If the wearable computers are embedded in clothing then the size of the
clothing becomes important if the computers are to be used by multiple people.
If the computer is belt mounted then it can be used (sequentially) by multiple
people but it is relatively fixed as to where it can be worn.
5. Summary and recommendations
As can be seen from the above discussion, there are multiple possibilities for
both input and output devices and the capabilities and costs of these
possibilities are continually changing. The correct devices for any particular
application must be determined on the basis of the specifics of that
application. Unfortunately, there is not a lot of experience with different
applications and that experience is not widely shared. This is the basis for
our recommendations. We would like to see experimentation with the use of
wearable computers in different applications, we would like vendors to
facilitate that experimentation by making input devices, and likewise output
devices, interchangeable, and we would like to have a forum for
those interested in wearable computing to report the results of these
experiments.
5.1 Recommendation 1
There is a need for more information about the different types of applications
using wearable computers. Currently, several different organizations, most
notably Boeing, CMU and MIT are experimenting with wearables in different
application contexts. Boeing's experiments involve manufacturing and
maintenance, CMU's involve maintenance and MIT's involve wearable assisted
living. The first recommendation, then, is to encourage organizations to begin
experimenting with the use of wearable computers in different
applications. This is likely to begin happening naturally and so the need is
not only for the experimentation to occur but for the results of those
experiments to be reported. This is our second recommendation.
5.2 Recommendation 2
Have a forum for the presentation of wearable computer related results. Not
only experiments and experiences with the use of wearable computers in
different applications but also new techniques, new hardware and new software
can be presented in this forum. In any case, the forum should be public and
neutral. Some possibilities are to have a periodic wearable conference, to
have wearable elements identified in other conferences, to utilize electronic
means of communication or all of these plus others. There is a need, however,
for such a forum to enable us all to leverage each other's results.
5.3 Recommendation 3
Manufacturers should have standards for input and output devices usable with
wearable computers that allow the interchange of comparable devices. Given the
large amount of experimentation that we see as necessary to determine the most
appropriate devices to use in particular tasks, we need to be able to replace
input or output devices without great difficulty. This suggests standard
connectors.