Posted: April 25th, 2025

HCI and UI – Discussion

> Explain the differences between various kinds of direct manipulation with respect to translational distances.

> Then, consider when and under what circumstances you would choose to use voice activated personal assistants, such as Siri, Cortana, and Google Talk, versus choosing to avoid it.

> Describe what roles voice/speech activated interactions/interfaces play in UI/UX, identifying at least 3 benefits and 3 limitations.

Need 2-3 pages with peer-reviewed citations. No introduction or conclusion needed.

CHAPTER

Direct Manipulation and
Immersive Environments

Leibniz sought to make the form of a symbol reflect its content.
‘In signs,’ he wrote, ‘one sees an advantage for discovery that
is greatest when they express the exact nature of a thing
briefly and, as it were, picture it; then, indeed, the labor of
thought is wonderfully diminished.’

CHAPTER OUTLINE
7.1 Introduction

7.2 What Is Direct Manipulation?

7.3 Some Examples of Direct Manipulation

7.4 2-D and 3-D Interfaces

7.5 Teleoperation and Presence

7.6 Augmented and Virtual Reality

Frederick Kreiling

“Leibniz,” Scientific American, May 1968

7.1 Introduction

Certain interactive systems generate a glowing enthusiasm among users that is
in marked contrast with the more common reaction of reluctant acceptance or
troubled confusion. The enthusiastic users report the following positive feelings:

• Mastery of the interface

• Competence in performing tasks

• Ease in learning originally and in assimilating advanced features

• Confidence in the capacity to retain mastery over time

• Enjoyment in using the interface

• Eagerness to show off the interface to novices

• Desire to explore more powerful aspects

These feelings convey an image of a truly pleased user. The central ideas in
such satisfying interfaces, now widely referred to as direct-manipulation interfaces
(Shneiderman, 1983), are visibility of the objects and actions of interest; rapid,
reversible, incremental actions; and replacement of typed commands by a pointing
action on the object of interest. Direct-manipulation ideas are at the heart of
many contemporary and advanced non-desktop interfaces. Game designers
continue to lead the way in creating visually compelling 3-D scenes with characters
(sometimes designed and user-created) controlled by novel pointing
devices. At the same time, interest in remote-operated (teleoperated) devices
has blossomed, enabling operators to look through distant microscopes or fly
drones. As the technology platforms mature, direct manipulation increasingly
influences designers of mobile devices and webpages. It also inspires designers
of information-visualization systems that present thousands of objects on the
screen with dynamic user controls (Chapter 16).

Newer concepts that extend direct manipulation include virtual reality,
augmented reality, and other tangible and touchable user interfaces. Augmented
reality keeps users in the normal surroundings but adds a transparent overlay
with information such as the names of buildings or visualizations of hidden
objects. Tangible and touchable user interfaces give users physical objects to
manipulate so as to operate the interface; for example, putting several plastic
blocks near each other to create an office floor plan. Virtual reality puts users
in an immersive environment in which the normal surroundings are blocked
out by a head-mounted display that presents an artificial world; hand gestures
allow users to point, select, grasp, and navigate. All of these concepts are being
applied not only in individual interactions but also in wider artificial worlds,
creating collaborative efforts and other types of social-media interactions.

This chapter defines the principles, attributes, and problems of direct
manipulation, including a way to categorize direct manipulation (Section 7.2). Some
examples of direct-manipulation use are provided in Section 7.3. Section 7.4
discusses 2-D and 3-D interfaces. Teleoperation and presence are covered in
Section 7.5. Lastly, augmented and virtual reality are discussed in Section 7.6. Although
the tenets of direct manipulation still hold true, regardless of the sophistication of
the technology, the technology in this chapter is advancing rapidly. The
references for this chapter include a combination of books and articles. The articles are
taken from recent conference proceedings showing some of the innovations
and interesting projects being developed in research labs of industries and
academia. Many pundits and popular press sources (Kushner, 2014; Kofman, 2015;
Metz, 2015; Mims, 2015; Stein, 2015) feel the time for virtual and augmented
reality is now. Researchers are looking into the theoretical challenges and
opportunities in virtual worlds (de Castell et al., 2012) and continuing to improve upon
the gaming experience (Kulshreshth and LaViola, 2015).

See also:

Chapter 10, Devices

Chapter 16, Data Visualization

7.2 What Is Direct Manipulation?

Direct manipulation as a concept has been around since before computers. The
metaphor of direct manipulation works well in computing environments and
was introduced in the early days of Xerox PARC and then widely disseminated
by Shneiderman (1983). Direct-manipulation designs can provide the capability
for differing populations and easily stretch across international boundaries.
Section 7.2.1 explains the three principles of direct manipulation and advantages of
using direct manipulation. Section 7.2.2 provides a way of discussing direct
manipulation using a translational concept of strength. Section 7.2.3 discusses
some problems with direct manipulation. Section 7.2.4 discusses the continuing
evolution of direct manipulation.

A favorite example of direct manipulation is driving an automobile. The
scene is directly visible through the front window, and performance of actions
such as braking and steering has become common knowledge in our culture. To
turn left, for example, the driver simply rotates the steering wheel to the left.
The response is immediate and the scene changes, providing feedback to refine
the turn. Now imagine how difficult it would be trying to accurately turn a car
by typing a command or selecting “turn left 30 degrees” from a menu. The
graceful interaction in many applications is due to the increasingly elegant
application of direct manipulation. Although there is lively discussion on the
impact of driverless cars and their uses, research still continues. Driverless cars
may soon respond to commands like “take me to Baltimore airport,” but they
are a long way from matching the skills of drivers at the wheel while navigating
snow-covered roads or police hand signals at accident sites.

Before designing for current devices, it makes sense to reflect on where early
design has been. In the early days of office automation, there was no such thing
as a direct-manipulation word processor or a presentation system like
PowerPoint. Word processors were command-line-driven programs where the user
typically saw a single line at a time. Keyboard commands were used along with
inserting special commands to provide instructions for viewing and printing the
documents, often as a separate operation. Similarly, with presentation programs,
specialized commands were used to set the font style, color, and size. Obviously,
these were very limited compared to the numerous font families available today.
Most users today are used to a WYSIWYG (What You See Is What You Get)
environment enhanced by direct-manipulation widgets.

7.2.1 The three principles and attributes of direct manipulation

The attraction of direct manipulation is apparent in the enthusiasm of the users.
The designers of the examples provided in Section 7.3 had an innovative
inspiration and an intuitive grasp of what users would want. Each example has
problematic features, but they demonstrate the potent advantages of direct
manipulation, which can be summarized by three principles:

1. Continuous representations of the objects and actions of interest with
meaningful visual metaphors

2. Physical actions or presses of labeled interface objects (i.e., buttons) instead
of complex syntax

3. Rapid, incremental, reversible actions whose effects on the objects of interest
are visible immediately

Simple metaphors or analogies with a minimal set of concepts (for example,
pencils and paintbrushes in a drawing tool) are a good starting point. Mixing
metaphors from two sources may add complexity that contributes to confusion.
Also, the emotional tone of the metaphor should be inviting rather than
distasteful or inappropriate. Since the users are not guaranteed to share the designer’s
understanding of the metaphor, analogy, or conceptual model used, ample
testing is required.

Using these three principles, it is possible to design systems that have these
beneficial attributes:

• Novices can learn basic functionality quickly, usually through a
demonstration by a more experienced user.

• Experts can work rapidly to carry out a wide range of tasks, even defining
new functions and features.

• Knowledgeable intermittent users can retain operational concepts.

• Error messages are rarely needed.

• Users can immediately see whether their actions are furthering their goals,
and if the actions are counterproductive, they can simply change the
direction of their activity.

• Users experience less anxiety because the interface is comprehensible and
because actions can be reversed easily.

• Users gain a sense of confidence and mastery because they are the initiators
of action, they feel in control, and they can predict the interface’s responses.

In contrast to textual descriptors, dealing with visual representations of
objects may be more “natural” and in line with innate human capabilities:
Action and visual skills emerged well before language in human evolution.
Psychologists have long known that people grasp spatial relationships and actions
more quickly when they are given visual rather than linguistic representations.
Furthermore, intuition and discovery are often promoted by suitable visual
representations of formal mathematical systems.

7.2.2 Translational distances with direct manipulation

The effectiveness and reality of the direct-manipulation interface are based on
the validity and strength of the metaphor chosen to represent the actions and
objects. Using familiar metaphors creates easier learning conditions for users
and lessens the number of mistakes and incorrect actions. Adequate testing is
needed to validate the metaphor. Special attention needs to be paid to user
characteristics such as age, reading level, educational background, prior
experiences, and any physical disabilities.

One way of trying to understand and categorize the direct-manipulation
metaphor is by looking at the translational distance between users and the
representation of the metaphor, which will be referred to as strength. Strength can be
perceived along a continuum from weak to immersive (see Box 7.1). This can
be further described as the level of indirectness between the user’s physical
actions and the actions in the virtual space.

BOX 7.1
Examples of translational distances (strength).

• Weak: early video game controllers (Fig. 7.5)

• Medium: touchscreens, multi-touch (Fig. 7.1)

• Strong: data glove, gesturing, manipulating tangible objects (Fig. 7.2)

• Immersive: virtual reality, e.g., Oculus Rift (Fig. 7.14)

Weak direct manipulation is what can be described as basic direct
manipulation. There is a mouse, trackpad, joystick, or similar device translating the user’s
physical action into action in the virtual space using some mapping function.
The translational distance is large because interaction is completely indirect.
For example, the user moves the mouse on a 2-D desk within a small
circumscribed region and the cursor moves on the screen (again 2-D). Because this
mapping function is not always fully understood and processed correctly by the
user, sometimes the user will actually run the mouse off the surface of the desk.
Weak direct manipulation was used with early game controllers that provided
buttons and joysticks, where the action of the controllers needed to be learned
explicitly by the players.
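
As a rough illustration of the mapping function mentioned above, the following Python sketch translates a mouse displacement on the desk into a cursor displacement on the screen. The gain value, the screen size, and the names `Cursor` and `move_cursor` are assumptions made for this example; they are not taken from the chapter or from any particular system.

```python
from dataclasses import dataclass


@dataclass
class Cursor:
    x: float
    y: float


def move_cursor(cursor, dx_mm, dy_mm, gain=8.0, screen_w=1920, screen_h=1080):
    """Map a mouse displacement (millimetres on the desk) to a cursor
    displacement (pixels on the screen).

    gain is a hypothetical pixels-per-millimetre factor.
    """
    new_x = cursor.x + dx_mm * gain
    new_y = cursor.y + dy_mm * gain
    # Clamp so the cursor stays on the screen even if the hand keeps moving.
    new_x = min(max(new_x, 0), screen_w - 1)
    new_y = min(max(new_y, 0), screen_h - 1)
    return Cursor(new_x, new_y)


cursor = Cursor(960, 540)
cursor = move_cursor(cursor, dx_mm=50, dy_mm=-10)
print(cursor)  # Cursor(x=1360.0, y=460.0)
```

In this sketch a 50 mm hand movement becomes a 400-pixel cursor movement, so the hand and the cursor travel different distances in different spaces. That indirection is one way to picture why the translational distance is considered large.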

FIGURE 7.1
Three users working concurrently on a large tabletop touch device. They can use
their hands/fingers to manipulate the objects on the device. Note the different hand
gestures being used. (www.reflectivethinking.com)

Medium direct manipulation is the next step moving along the continuum.
The translational distance is reduced. Instead of communicating with the virtual
space with the device, the user reaches out and touches, moves, and grabs the
entities in the on-screen representation. Examples of this include touchscreens
(mobile, kiosk, and desktop). This is still limited by the glass of the screen; the
world is beyond the glass. This direct-manipulation strength supports pointing
and tapping, but other activities that would include the third dimension, like
reaching into the device, cannot be accommodated by the simple metaphor.
Instead, creating these other actions requires stepping outside the metaphor
with a new artifact such as double-tap and assigning a corresponding action to
it. This again requires learning on the user’s part. Multi-touch (Fig. 7.1) allows
new actions to be assigned to various combinations of finger touches. The
two-finger actions like zoom in/out are intuitive, but others must be learned and
take longer to discover. This accounts for why a young child can easily learn to
tap, change screens, and touch on a tablet (the intuitive actions) but doesn’t have
the skills to rearrange the icons on the screen (the learned actions).
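
The two-finger zoom gesture mentioned above can be sketched in a few lines of Python. This is only an illustrative assumption about how such a gesture might be interpreted: the zoom factor is taken as the ratio of the current finger spread to the starting spread. The function names and the touch-point format are hypothetical.

```python
import math


def distance(p, q):
    """Euclidean distance between two touch points given as (x, y) tuples."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def pinch_scale(start_touches, current_touches):
    """Return the zoom factor implied by a two-finger pinch.

    A value above 1 means the fingers moved apart (zoom in); a value
    below 1 means they moved together (zoom out).
    """
    start = distance(*start_touches)
    if start == 0:
        return 1.0  # Fingers started on the same point; leave zoom unchanged.
    return distance(*current_touches) / start


scale = pinch_scale(((100, 100), (200, 100)), ((50, 100), (250, 100)))
print(scale)  # 2.0
```

The mapping from finger spread to zoom is direct, which may be part of why this gesture feels intuitive, whereas an arbitrary assignment such as double-tap has to be learned.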

Strong direct manipulation involves actions such as gesture recognition with
various body parts. It may be the user’s hand, foot, head, or full body (whatever
controls the action) that is “virtually” placed inside the physical space (Fig. 7.2).
The users can see their hand in the 3-D space and can grasp, throw, drop,
manipulate, and so forth. The users themselves still remain on the outside
looking in. This works well when the spaces are small and simple, but when the
spaces get bigger, the users need to move themselves outside the initial metaphor
and enter another mode, such as move mode, and then traverse to the new
region. See Chapter 10 for more on devices.

FIGURE 7.2
A tangible user interface for molecular biology, developed in Art Olson’s Laboratory
at the Scripps Research Institute, utilizes autofabricated molecular models tracked
with the Augmented Reality Toolkit from the University of Washington Human
Interface Technology Lab. The video camera on the laptop captures the molecule’s
position and orientation, enabling the molecular modeling software to display
information such as the attractive/repulsive forces surrounding the molecule.

The notion of tangible and immersive user interfaces, in which users grasp
physical objects to manipulate a graphical display that represents the object, is
becoming quite popular. Tangible devices use haptic interaction skills to
manipulate objects and convert the physical form to a digital form (Ishii, 2008).

The last dimension is immersive direct manipulation. Here is where direct
manipulation is combined with virtual reality (see Section 7.6). The users put on
glasses or some other device and they are inside the space. The users can see
themselves and can walk/fly through the space by walking, leaning in, and so
forth; the scenery changes with the moves.

7.2.3 Problems with direct manipulation
Graphical user interfaces were a setback for vision-impaired users, who
appreciated the simplicity of linear command languages. However, screen readers for
interfaces, speech-enabled devices, page readers for browsers, and audio
designs for mobile devices enable vision-impaired users to understand some of
the spatial relationships necessary to achieve their goals.

Direct-manipulation designs may consume valuable screen space and thus
force valuable information off-screen, requiring scrolling or multiple actions.
This is an issue in the mobile world, where screen space is very limited.

Another issue is that users must learn the meanings of visual representations and
graphic icons. Titles that appear on icons (flyover help) when the cursor is over
them offer only a partial solution. The visual representation may sometimes be
misleading. Users may grasp the analogical representation rapidly but then may draw
incorrect conclusions about permissible actions, overestimating or underestimating
the functions of the computer-based analogy. Ample testing must be carried out to
refine the displayed objects and actions and to minimize negative side effects.

For experienced typists, taking a hand off the keyboard to move a mouse or
point with a finger may take more time than typing the relevant command. This
problem is especially likely to occur if the users are familiar with a compact
notation, such as for arithmetic expressions, that is easy to enter from a
keyboard but may be more difficult to select with a mouse. While direct
manipulation is often defined as replacing typing of commands with pointing with
devices, sometimes the keyboard is the most effective direct-manipulation
device. Rapid keyboard interaction can be extremely attractive for expert users,
but the visual feedback must be equally rapid and comprehensible.

Small mobile devices have limited screen sizes. A finger pointing at a device
may partially block the display, rendering a good portion of the device not
visible. Also, if the icons are small because of the limited screen size, they may
be hard to select or, because of limited resolution and viewing capabilities
(especially for older adults), not clearly distinguishable, resulting in their
meanings becoming lost or confused.

Some direct-manipulation principles can be surprisingly difficult to realize in
software. Rapid and incremental actions have two strong implications: a fast
perception/action loop (less than 100 ms) and reversibility (the undo action). A
standard database query may take a few seconds to perform, so implementing a
direct-manipulation interface on top of a database may require special
programming techniques. The undo action may be even harder to implement, as it
requires that each user action be recorded and that reverse actions be defined. It
changes the style of programming because a nonreversible action is
implemented by a simple function call whereas a reversible action requires recording
the inverse action.
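
To make the contrast concrete, here is a minimal Python sketch of a reversible action, assuming a toy canvas where objects are moved by an offset. The `Canvas` class and its methods are hypothetical and only illustrate the idea of recording an inverse for each action; real applications need inverses for many more kinds of actions.

```python
class Canvas:
    """A toy model: object names mapped to (x, y) positions."""

    def __init__(self):
        self.positions = {}
        self._undo_stack = []  # inverse actions, most recent last

    def move(self, name, dx, dy):
        x, y = self.positions.get(name, (0, 0))
        self.positions[name] = (x + dx, y + dy)
        # Record the inverse so the action can be reversed later.
        self._undo_stack.append((name, -dx, -dy))

    def undo(self):
        if not self._undo_stack:
            return False  # nothing to undo
        name, dx, dy = self._undo_stack.pop()
        x, y = self.positions[name]
        self.positions[name] = (x + dx, y + dy)
        return True


canvas = Canvas()
canvas.move("icon", 10, 0)
canvas.move("icon", 0, 5)
canvas.undo()
print(canvas.positions["icon"])  # (10, 0)
```

A nonreversible version of `move` would need only the two lines that update the position. The extra bookkeeping in the undo stack is the change in programming style that the paragraph above describes.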

7.2.4 The continuing evolution of direct manipulation

A successful direct-manipulation interface must present an appropriate
representation or model of reality. With some applications, the jump to visual
language may be difficult, but after using visual direct-manipulation interfaces,
most users and designers can hardly imagine why anyone would want to use a
complex syntactic notation to describe an essentially visual process. It is hard to
conceive of learning the commands for the vast number of features in modern
word processors, drawing programs, or spreadsheets, but the visual cues, icons,
menus, and dialog boxes make it possible for even intermittent users to succeed.
See Box 7.2 for a summary of the advantages and disadvantages of direct
manipulation.

BOX 7.2
Advantages and disadvantages of direct manipulation.

Direct Manipulation

Advantages

• Visually presents task concepts

• Allows easy learning

• Allows easy retention

• Allows errors to be avoided

• Encourages exploration

• Affords high subjective satisfaction

Disadvantages

• May be hard to program

• Accessibility requires special attention

Users are trying to better understand all the data and other visual content that
are now available. One way to manage this information is through the use of a
dashboard (Few, 2013). Being able to see a large volume of information (big data) at
one time and to directly manipulate it and observe the impact visually is a
powerful concept. Businesses and companies are bombarded by volumes of data every
day. The ability to organize this user-generated data into a useful graphical
format can help them manage resources and spot trends (Chapter 16). Dashboards
provide ways for users to manipulate data using the various widgets provided.
Companies such as Tableau Software, SAP Lumira, and IBM Cognos provide this
capability as do smaller user-oriented companies like dashboardsbyexample.

Weiser’s (1991) influential vision of ubiquitous computing described a world
where computational devices were everywhere: in your hands, on your body, in
your car, built into your home, and pervasively distributed in your environment.
The 1993 special issue of Communications of the ACM (Wellner et al., 1993) showed
provocative prototypes that refined Weiser’s vision. It offered multiple visions of
beyond-the-desktop designs that used freehand gestures and small mobile devices
whose displays changed depending on where users stood and how they pointed
the devices. Almost 25 years later, Weiser’s full vision has not yet been realized,
but the social-media aspect of ubiquitous computing has blossomed.

Touchable displays from the small to the large (as large as wall size [Figs. 10.20
and 10.21] or even mall size) are becoming available as well. Interaction is all
accomplished without users entering a long string of commands; instead, users
physically manipulate the items of interest with their hands. An application of
this is often seen on news programs, where the commentator can move the
objects of interest on the screen and drill down to more detailed levels. Another
application is virtual maps, which can be manipulated and zoomed by using
hand motions as a multi-touch interface (Han, 2005). On a touchable display,
interactions with both hands seem quite natural (although with small displays,
issues of occlusion can be problematic).

There will certainly be many future variations of and extensions to direct
manipulation, but the basic goals will remain similar: comprehensible interfaces that
enable rapid learning, predictable and controllable actions, and appropriate
feedback to confirm progress. Direct manipulation has the power to attract users
because it is rapid, and even enjoyable. If actions are simple, reversibility is ensured,
retention is easy, anxiety recedes, users feel in control, and satisfaction flows in.

7.3 Some Examples of Direct Manipulation

No single interface has every admirable attribute or design feature; such an
interface might not be possible. Each of the examples discussed here, however,
has sufficient features to win the enthusiastic support of many users.

7.3.1 Geographical systems including GPS (global positioning systems)

For centuries, travelers have relied on maps and globes to better understand the
Earth and geographical systems. As graphic- and image-capture capabilities
increased (both real-world and human-generated), it was a natural progression
to create systems to represent both a current location (“where we are”) and a
target location (“where we want to go”). Of course, as prices dropped, these
types of systems became available as commercial GPS systems for cars, for
walking, and even for the mobile phone. Being able to directly see the alternatives on
the devices, as well as how to move from the current location to the target location,
including manipulating the routes, is another application of direct manipulation.

Google Maps™, MapQuest, Google Street View, Garmin, National
Geographic, and Google Earth™ combine geographic information from aerial
photographs, satellite imagery, and other sources to create a vast database of
graphical information that can easily be viewed and displayed. In some areas,
the detail can go down to an individual house on a street or even inside a
building (Fig. 7.3). With the well-populated databases of geographic points of interest,
these systems provide an easy-to-use facility to point and select the nearest gas
station or specific type of restaurant. Some systems provide real-time traffic to
facilitate alternative routing in traffic-laden situations.

FIGURE 7.3
This is a screenshot from Google Street View of the inside of the University Center
at Nova Southeastern University in Florida. On the bottom is a scrollable image of
other views on campus. In the bottom left corner is a more conventional static map
showing the physical street location of the campus. Users can move the “person” to
a different location on campus, and the views will change accordingly.

7.3.2 Video games
For many people, the most exciting, well-engineered, and commercially
successful application of the direct-manipulation concepts lies in the world of video
games. The early but simple and popular game Pong® (created in 1972) required
the user to rotate a knob that moved a white rectangle on the screen. A white
spot acted as a ping-pong ball that ricocheted off the wall and had to be hit back
by the movable white rectangle. Users developed speed and accuracy in placing
the “paddle” to keep the increasingly speedy ball from getting past, while the
computer speaker emitted a ponging sound when the ball bounced. Watching
someone else play for 30 seconds is all the training that a person needs to become
a competent novice, but many hours of practice are required to become a skilled
expert. The interface objects were a single paddle, a ball, a single player, and
some rudimentary sound. Games have come a long way with various controls,
including full body, multiple objects of interest (both good and evil), full stereo
sounds, detailed graphical environments, changing backgrounds, and the
possibility of multiple players sitting physically next to one another or virtually
across the globe.

Some cataloguers state that we are in the eighth generation of video games.
Parkin (2014) provides an illustrated history of five decades of video games.
Last generation’s Nintendo Wii, Sony PlayStation 3, and Microsoft Xbox 360™
have given way to this generation’s Nintendo Wii U, Sony PlayStation 4, and
Microsoft Xbox One in a very short time, and continued advances are expected.
These gaming platforms have brought powerful 3-D graphics hardware to the
home and have created a remarkable international market. Gaming experiences
are being enhanced by combining 3-D user-interface technologies, such as
stereoscopic 3-D, head tracking, and finger-count gestures (Kulshreshth and
LaViola, 2015). For a detailed survey of visual, mixed, and augmented reality gaming,
refer to Thomas (2012).

Wildly successful games include violent first-person shooters, fast-paced
racing games, and more sedate golfing games. Small handheld game devices still
exist, but now users are playing games on their phones and other mobile devices.
Multi-player games on the internet have also caught on with many users, providing
the additional opportunity for social encounters and competitions. Gaming
magazines and conferences attest to the widespread interest. In Rochester, New
York, part of the Museum of Play houses the International Center for the History of
Electronic Games (http://www.museumofplay.org/icheg).

There is a wide range of game genres, and the borders between genres are
becoming blurred. Some games are single-player games; others have multiple players.
For a list of gaming genre acronyms, see Box 7.3. Players can be in the same
physical space or a different physical space but shared virtual space. Players
themselves can be virtual. For a more complete taxonomy of gaming systems,
see Pagulayan et al. (2012). In research on player performance and
experience, the games with multiple players seem to hold more interest in the
social connection with others, teamwork, and collaboration. The single-player
games seem to focus more on the game narrative and the characters, and players
show more interest in the degree of immersion (Johnson et al., 2015).

BOX 7.3
Gaming genre acronyms.

The computer world is filled with gaming genre acronyms. Some of the
more widely used acronyms include:

• AA: action adventure games

• ARPG: action role-playing games

• FPS: first-person shooter

• MMORPG: massively multi-player online role-playing games

• MOBA: multiplayer online battle arena

• RPG: role-playing games

• RTS: real-time strategy

Game environments provide intriguing, successful applications of 3-D
representations. These include first-person action games in which users patrol city
streets or race down castle corridors while shooting at opponents as well as
role-playing fantasy games with beautifully illustrated island havens or mountain
strongholds. Many games are socially enriched by allowing users to choose
avatars to represent themselves. Users can choose avatars that resemble themselves,
but often they choose bizarre characters or fantasy images with desirable
characteristics such as unusual strength or beauty (Boellstorff, 2008).

Some web-based game environments may involve millions of users and
thousands of user-constructed “worlds,” such as schools, shopping malls, or
urban neighborhoods. Game devotees may spend dozens of hours per week
immersed in their virtual worlds, chatting with collaborators or negotiating
with opponents. World of Warcraft (developed and published by Blizzard
Entertainment) has been the mainstay and most popular of the MMORPG games with
more than 5.6 million subscribers as of 2015 (Fig. 7.4). New games are constantly
hitting the market, and the competition is fierce. A relatively new entry to the
market (2012), Guild Wars 2 (developed by ArenaNet and published by NCsoft)
already has sold more than 5 million copies. This game is slightly different from
other MMORPG games because the game is responsive to individual player
actions, which is more common in single-player role-playing games.

FIGURE 7.4
A woman playing World of Warcraft. She is using both her keyboard and mouse.
She also can hear the sounds of the game via her headset.

The Nintendo Wii, introduced in 2006, changed the demographics of the
gaming world. Instead of young children (typically boys), older adults were using the
Wii to play games like tennis and bowling. It also became an early
fitness/wellness platform. With the introduction of the Kinect by Microsoft for Xbox in 2010
and then for Windows in 2012, more worlds opened up, and with a software
development kit (SDK), developers can create their own worlds. These interfaces
have been referred to as natural user interfaces because the entire body can be
used, but the possible actions still remain limited and need to be learned. The
early Wii controller was modified with the addition of a wrist strap, since gamers
were so immersed in play they sometimes accidentally hurled the controller at
the screen. There is no syntax to remember, and therefore there are no
syntax-error messages. Error messages in general are rare because the results of actions
are obvious and can be reversed easily: If users move themselves too far to the
left, they merely use the natural inverse action of moving back to the right. These
principles, which have been shown to increase user satisfaction, could be applied
to other environments. Examples of various game controllers are shown in
Fig. 7.5. Customized controllers exist for games such as Guitar Hero (Fig. 1.8),
flight control (Fig. 10.9), and Leap Motion (Fig. 10.16).

FIGURE 7.5
Various game controllers. Some are very specific and include a steering wheel or
joystick; others use a series of buttons and direction arrows. The Wii Controller with
the wrist strap is shown in the upper right corner. Although these game controllers
do provide direct-manipulation actions, users still have to learn the meaning of the
various buttons.

Most games continuously display a numeric score so that users can measure
their progress and compete with their previous performance, with friends, or
with the highest scorers. Typically, the 10 highest scorers get to store their
initials in the game for public display. This strategy provides one form of positive
reinforcement that encourages mastery. Studies with elementary-school children
have shown that continuous display of scores is extremely valuable.
Machine-generated feedback, such as “Very good” or “You’re doing great!”,
is not as effective, since the same score carries different meanings for different
people. Most users prefer to make their own subjective judgments and perceive
the machine-generated messages as an annoyance and a deception. Providing
this combination of behavioral data and attitudinal data adds to the immersion
quality of the game (Pagulayan et al., 2012).

Although the marketing focus and consumer popularity have concentrated
on action-type games, there are other game environments, and gaming (or
gamification) has become a popular metaphor used in training and evaluation.
Simulation and educational games abound. Games have been developed for young
children (pre-readers) where the intuitiveness of the icons and real-world-type
interfaces (buttons, sliders, finger pointing, etc.) control the game. Females seem
more interested in role-playing games and games with narratives. A whole new
generation of female gamers now exists. Games are also used for wellness
benefits (Calvo and Peters, 2014; Jones et al., 2014). Researchers are trying to better
understand how users think and get into their flow state (Csikszentmihalyi, 1990;
Ossola, 2015). Gaming can be used to learn and enhance physical skills, modify
behaviors, and increase wellness. Although there are some negative implications
of gaming, McGonigal (2011) offers some rules for the positive impact of
gaming: limit yourself to no more than 21 hours a week; play games face to face
with friends and family; and play cooperative games or games that have a
creator mode.

Studying game design is fun (Lecky-Thompson, 2008), but there are limits
to the applicability of the lessons. Game players are engaged in competition
with the system or with other players, whereas applications-systems users
prefer a strong internal locus of control, which gives them the sense of being in
charge. Likewise, whereas game players seek entertainment and focus on the
challenge, application users focus on their tasks and may resent too many
playful distractions. The random events that occur in most games are meant to
challenge the users; in non-game designs, however, predictable system
behavior is preferred. Throughout this book, we discuss the user experience
(UX); the gaming world now designs for the player experience (PX). Research
is continuing with the development of a growing set of metrics to measure PX
(Johnson et al., 2015). Additional research is ongoing in the quantification and
evaluation of playfulness providing meaningful and memorable experiences
(Lucero et al., 2014).

Courses and majors (or minors) in video game design exist. Some are in
computer science departments, but others show the more interdisciplinary
nature of the subject and can be found in media design, visual communication,
and art departments. The important take-away is to use clear affordances,
good instructions, and informative feedback; limit the complexity; and be
aware of human variability (Fisher et al., 2014). All these are basic tenets
of HCI design as described in Section 3.3.4 (The Eight Golden Rules of
Interface Design).

7.3.3 Computer-aided design and fabrication
Most computer-aided design (CAD) systems for automobiles, electronic
circuitry, aircraft, or mechanical engineering use principles of direct manipulation.
Building and home architects now have at their disposal powerful tools,
provided by companies such as Autodesk, that provide components to handle
structural engineering, floor plans, interiors, landscaping, plumbing, electrical
installation, and much more. With such applications, the designer may see a
circuit schematic on the screen and, with mouse clicks, be able to move compo­
nents into or out of the proposed circuit. When the design is complete, the com­
puter can provide information about current, voltage drops, and fabrication
costs and warnings about inconsistencies or manufacturing problems. Similarly,
newspaper-layout artists or automobile-body designers can easily try multiple
designs in minutes and can record promising approaches until they find even
better ones. The pleasure of using these systems stems from the capacity to
manipulate the object of interest directly and to generate multiple alternatives
rapidly.

There are large manufacturing companies using AutoCAD® and similar
systems, but there are also other specialized design programs for kitchen and
bathroom layouts, landscaping plans, and other homeowner-type situations.
These programs allow users to control the angle of the sun during the various
seasons to see the impact of the landscaping and shadows on various portions of
the house. They allow users to view a kitchen layout and calculate square footage
estimates for floors and countertops and even print out materials lists
directly from the software. Some of the players in the field of interior-design
software for residential and commercial markets include Floored, Inc. (Fig. 7.6),
2020 Spaces, and Home Designer Software. Their products are designed to work
across multiple environments, desktop to web; they provide various views
(top-down, architectural, front-view) to generate a more realistic overview of
the design for the client.

Related applications are for computer-aided manufacturing (CAM) and process
control. Honeywell’s Experion® Process Knowledge System Orion provides
the manager of an oil refinery, paper mill, or power-utility plant with a colored
schematic view of the plant. The schematic may be displayed on multiple dis­
plays or on a large wall-sized map, with red lines indicating any sensor values
that are out of the normal range. With a single click, the operator can get a more
detailed view of the troubling component; with a second click, the operator can
examine individual sensors or can reset valves and circuits. A basic strategy for
this design is to eliminate the need for complex commands that the operator
might need to recall only during a once-a-year emergency. The visual overview
provided by the schematic facilitates problem solving by analogy because the
linkage between the screen representations and the plant’s temperatures or
pressures is so close. The latest version of this software provides capabilities for
virtualization and cloud support and includes customized dashboards to show
status.

FIGURE 7.6
An office space layout from a company called Floored, Inc. This 3-D virtual CAD
representation helps designers lay out office space. Items can be moved around
between and within rooms; the design will be re-created to reflect any changes
(http://www.floored.com).

Another emerging use of direct manipulation involves home automation.
Since so much of home control involves floor plans, direct-manipulation actions
naturally take place on a display of the floor plan with selectable icons for each
status indicator (such as a burglar alarm, heat sensor, or smoke detector) and for
each activator (such as controls for opening and closing curtains or shades, for air
conditioning and heating, or for audio and video speakers or screens). For exam­
ple, users can route a recorded TV program being watched in the living room to
the bedroom and kitchen by merely dragging the on-screen icon into those
rooms, and they can adjust the volume by moving a marker on a linear scale. The
action is usually immediate and visible and can be easily reversed as well.

With the advent of these types of systems, not only are graphical, sophisticated
3-D displays generated, but with 3-D printing technology, actual workable
models can be generated. These models provide a more realistic view for
clients and customers. These models can include an overall outside view or even
be broken down to show component parts if necessary. The cost saving of these
models versus building the actual structure or device can be enormous, coupled
with the ease of incremental or larger modifications or changes. 3-D printers
have been installed on the NASA space station, where actual parts can be
fabricated (Fig. 7.7).

FIGURE 7.7
Astronaut Barry “Butch” Wilmore onboard the International Space Station with the ratchet
wrench that was created with Made in Space’s 3-D printer. This device was
designed, qualified, tested, and printed in space in less than one week.

7.3.4 Direct-manipulation programming and configuration

Performing tasks by direct manipulation is not the only goal. It should be pos­
sible to do programming by direct manipulation as well, at least for certain
problems. How about moving a drill press or a surgical tool through a complex
series of motions that are then repeated exactly? Automobile seating positions
and mirror settings can be set as a group of preferences for a particular driver
and then adjusted as the driver settles in place. Likewise, some professional tele­
vision-camera supports allow the operator to program a sequence of pans or
zooms and then to replay it smoothly when required.

Programming of physical devices by direct manipulation seems quite
natural, and an adequate visual representation of information may make
direct-manipulation programming possible in other domains. Spreadsheet packages
such as Excel™ have rich programming languages and allow users to create
portions of programs by carrying out standard spreadsheet actions. The result
of the actions is stored in another part of the spreadsheet and can be edited,
printed, and stored in a textual form. Database programs such as Access™ allow
users to create buttons that when activated will set off a series of actions and
commands and even generate a report. Similarly, Adobe Photoshop records a
history of user actions and then allows users to create programs with action
sequences and repetition using direct manipulation.

It would be helpful if the computer could recognize repeated patterns reliably
and create useful macros automatically while the user was engaged in
performing a repetitive interface task. Most cellphones have buttons that can be
programmed to call home or call the doctor or another emergency number. This
allows the user to encounter a simpler interface and be shielded from the details
of the tasks.
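
As a minimal sketch of how automatic macro recognition might work, the following
Python function scans the tail of a log of interface actions for a short sequence
that has just been repeated several times in a row. The function name
suggest_macro, its thresholds, and the action labels are illustrative assumptions,
not a description of any existing product.

```python
def suggest_macro(actions, min_len=2, max_len=5, min_repeats=3):
    """Return the shortest action sequence that has just been repeated
    back-to-back at least `min_repeats` times, or None if there is none.

    `actions` is a list of action labels (hypothetical names), oldest first.
    """
    for length in range(min_len, max_len + 1):
        pattern = actions[-length:]
        if len(pattern) < length:
            continue  # the log is shorter than this candidate length
        repeats = 0
        end = len(actions)
        # Walk backward through the log in steps of `length`.
        while end - length >= 0 and actions[end - length:end] == pattern:
            repeats += 1
            end -= length
        if repeats >= min_repeats:
            return pattern
    return None


# Hypothetical log: the user bolds three items one after another.
log = ["open", "select", "bold", "next",
       "select", "bold", "next",
       "select", "bold", "next"]
print(suggest_macro(log))  # prints ['select', 'bold', 'next']
```

Exact matching of action labels is the simplest possible rule. A practical
recognizer would also have to generalize over parameters such as file names or
cell positions, which is part of why reliable recognition remains difficult.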

7.4 2-D and 3-D Interfaces

Some designers dream about building interfaces that approach the richness of
3-D reality. They believe that the closer the interfaces are to the real world, the
easier usage will be. This extreme interpretation of direct manipulation is a
dubious proposition, since user studies show that disorienting navigation, complex
user actions, and annoying occlusions can slow performance in the real
world as well as in 3-D interfaces (Cockburn and McKenzie, 2002). Many interfaces
(sometimes called 2-D interfaces) are designed to be simpler than the real
world by constraining movement, limiting interface actions, and ensuring visibility
of interface objects. However, the strong utility of “pure” 3-D interfaces
for medical, architectural, product design, and scientific visualization purposes
means that they remain an important challenge for interface designers. So the
power of 3-D interfaces lies in applying them in the appropriate domain or
context where the added dimension provides more understanding and improves
task outcomes.

An intriguing possibility is that “enhanced” interfaces may be better than 3-D
reality. Enhanced features might enable actions outside of real human capabilities,
such as faster-than-light teleportation, flying through objects, multiple simultaneous
views of objects, and x-ray vision. Playful game designers and creative applications
developers have already pushed the technology further than those who
seek merely to mimic reality.

For some computer-based tasks, such as medical imagery (Fig. 7.8), architectural
drawing, computer-assisted design, chemical-structure modeling (Fig. 7.2),
and scientific simulations, pure 3-D representations are clearly helpful and
have become major industries. However, even in these cases, the successes
are often due to design features that make the interface better than reality.
Users can magically change colors or shapes, duplicate objects, shrink/expand
objects, group/ungroup components, send them by various electronic means,
and attach floating labels. Users can go back in time and even undo recent actions.

FIGURE 7.8
By using a medical simulation inserted into a large-scale visualization (using CAVE
technology), physicians were able to find a solution that would not have been possible
with the actual surgery. (http://www.nsf.gov/news/news_summ.jsp?cntn_id=126209)

Among the many innovations, there have been questionable 3-D prototypes,
such as for air-traffic control (showing altitude by perspective drawing only
adds clutter when compared to an overview from directly above), digital libraries
(showing books on shelves may be nice for browsing, but it inhibits searching
and linking), and file directories (showing tree structures in three dimensions
sometimes leads to designs that increase occlusion and navigation problems).
Other questionable applications include ill-considered 3-D features for situa­
tions in which simple 2-D representations would do the job. For example, add­
ing a third dimension to bar charts may slow users and mislead them (Hicks et
al., 2003), but they are such an attraction for some users that they are included in
most business graphics packages (Cognos, SAS/GRAPH, SPSS/SigmaPlot).

A modest use of 3-D techniques is to add highlights to 2-D interfaces, such as
buttons that appear to be raised or depressed, windows that overlap and leave
shadows, or icons that resemble real-world objects. These may be enjoyable,
recognizable, and memorable because of improved use of spatial memory, but
they can also be visually distracting and confusing because of additional visual
complexity.

This enumeration of features for effective 3-D interfaces might serve as a
checklist for designers, researchers, and educators:

• Use occlusion, shadows, perspective, and other 3-D techniques carefully.

• Minimize the number of navigation steps required for users to accomplish
their tasks.

• Keep text readable (better rendering, good contrast with background, and no
more than 30-degree tilt).

• Avoid unnecessary visual clutter, distraction, contrast shifts, and reflections.

• Simplify user movement (keep movements planar, avoid surprises like going
through walls).

• Prevent errors (that is, create surgical tools that cut only where needed and
chemistry kits that produce only realistic molecules and safe compounds).

• Simplify object movement (facilitate docking, follow predictable paths, limit
rotation).

• Organize groups of items in aligned structures to allow rapid visual search.

• Enable users to construct visual groups to support spatial recall (placing
items in corners or tinted areas).

Breakthroughs based on clever ideas seem possible. Enriching interfaces with
stereo displays, haptic feedback, and 3-D sound may yet prove beneficial in
more than specialized applications. Bigger payoffs are more likely to come
sooner if these guidelines for inclusion of enhanced 3-D features are followed:

• Provide overviews so users can see the big picture (plan view display,
aggregated views).

• Allow teleportation (rapid context shifts by selecting destination in an
overview).

• Offer x-ray vision so users can see into or beyond objects.

• Provide history keeping (recording, undoing, replaying, editing).

• Permit rich user actions on objects (save, copy, annotate, share, send).

• Enable remote collaboration (synchronous, asynchronous).

• Give users control over explanatory text (pop-up, floating, or excentric labels
and screen tips) and let them view details on demand.

• Offer tools to select, mark, and measure.

• Implement dynamic queries to rapidly filter out unneeded items.

• Support semantic zooming and movement (simple action brings object front
and center and reveals more details).

• Enable landmarks to show themselves even at a distance.

• Allow multiple coordinated views (users can be in more than one place at a
time and see data in more than one arrangement at a time).

• Develop novel 3-D icons to represent concepts that are more recognizable and
memorable.
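
One guideline in the list above, dynamic queries, can be illustrated with a short
Python sketch. The idea is that each adjustment of a slider immediately recomputes
the set of visible items. The Home record, its fields, and the sample data are
hypothetical, and a real interface would redraw the display after every call
rather than print a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Home:
    """Hypothetical item shown in a 2-D or 3-D display."""
    address: str
    price: int
    bedrooms: int


def dynamic_query(items, ranges):
    """Keep only the items whose attributes fall inside every (low, high) range.

    `ranges` maps an attribute name to the current positions of a
    double-ended slider, for example {"price": (100_000, 300_000)}.
    """
    return [
        item for item in items
        if all(low <= getattr(item, attr) <= high
               for attr, (low, high) in ranges.items())
    ]


homes = [
    Home("12 Elm St", 250_000, 3),
    Home("4 Oak Ave", 410_000, 4),
    Home("9 Pine Rd", 180_000, 2),
]

# The user drags the price slider, then the bedrooms slider.
visible = dynamic_query(homes, {"price": (150_000, 300_000)})
visible = dynamic_query(visible, {"bedrooms": (3, 5)})
print([h.address for h in visible])  # prints ['12 Elm St']
```

Because the filter is a pure function of the items and the slider positions, moving
a slider back restores the earlier view, which keeps the action easily reversible.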

3-D environments are greatly appreciated by some users and are helpful for
some tasks (Laha et al., 2012). They have the potential for novel social, scientific,
and commercial applications if designers go beyond the goal of mimicking 3-D
reality. Enhanced 3-D interfaces could be the key to making some kinds of 3-D
teleconferencing, collaboration, teleoperation, and telepresence popular. Of
course, it will take good design of 3-D interfaces (pure, constrained, or enhanced)
and more research on finding the payoffs beyond the entertaining features that
appeal to first-time users. Success will come to designers who provide compel­
ling content, relevant features, appropriate entertainment, and novel social­
media structure support. By studying user performance and measuring
satisfaction, those designers will be able to polish their designs and refine guide­
lines for others to follow.

7.5 Teleoperation and Presence

Teleoperation has two parents: direct manipulation in personal computers and
process control, where human operators control physical processes in complex
environments. Typical tasks are operating power or chemical plants, controlling
manufacturing, surgery, flying airplanes or drones, or steering vehicles. If the
physical processes take place in a remote location, we talk about teleoperation or
remote control. To perform the control task remotely, the human operator may
interact with a computer, which may carry out some of the control tasks without
any interference by the human operator.

There are great opportunities for the remote control or teleoperation of
devices if acceptable user interfaces can be constructed. When designers can
provide adequate feedback in sufficient time to permit effective decision mak­
ing, attractive applications in manufacturing, medicine, military operations, and
computer-supported collaborative work are viable. Home-automation applica­
tions extend remote operation of various devices to security and access systems,
energy control, and operation of appliances. Scientific applications in space,
underwater, or in hostile environments enable new research projects to be
conducted economically and safely. The recent introduction of affordable drones
will be yet another facet of teleoperation.

In traditional direct-manipulation interfaces, the objects and actions of interest
are shown continuously; users generally point, click, or drag rather than
type, and feedback indicating change is immediate. However, when the devices
being operated are remote, these goals may not be realizable, and designers
must expend additional effort to help users to cope with slower responses,
incomplete feedback, increased likelihood of breakdowns, and more complex
error-recovery procedures. The problems are strongly connected to the
hardware, physical environment, network design, and task domain.

A typical remote application is telemedicine, or medical care delivered over
communication links (Sonnenwald et al., 2014). Telemedicine can be used more
broadly to allow physicians to examine patients remotely and surgeons to
carry out operations across continents. Telehealth is being widely used in the
Veteran’s Administration (Fig. 7.9).

Veterans can come into the local VA office, where visits with the
various medical personnel can be conducted via Telehealth technology. Cameras with
high-resolution images can allow the doctor to see the patient’s physical condition,
with the added benefit of seeing the affect of the patient. A trained medical person
can be in the office with the patient to help facilitate the examination. Other
medical applications include robotic surgery. Robotic surgery is an alternative to
conventional surgery that enables a smaller incision and more accurate and precise
surgical movements. The robotic platform expands the surgeon’s capabilities
and provides a highly magnified 3-D image (Fig. 7.10). In addition, the
surgeon has control over hand, wrist, and finger movement through robotic
instrument arms. The surgeon is comfortably seated across the operating room
at a console rather than being over the patient, and the system damps out some
involuntary movements that can be problematic.

FIGURE 7.9
Erica Taylor, Nurse Director for the Telehealth Program at Landstuhl Regional
Medical Center, demonstrates using the Telehealth cart otoscope to conduct a
real-time tympanic membrane exam. On the screen is Physician Assistant Steven
Cain, who from a remote location can see and evaluate the patient and provide an
appropriate plan of care. Photo by Phil Jones.

FIGURE 7.10
When doing robotic surgery, the surgeon sits at the computer console and controls
the robotic camera and surgical instruments remotely. Various devices on the
controller can be adjusted by the surgeon including adjustments/magnifiers to
clearly see the field of view.

The architecture of remote environments introduces several complicating
factors:

• Time delays. The network hardware and software cause delays in sending user
actions and receiving feedback: a transmission delay, or the time it takes for the
command to reach the microscope (in our example, transmitting the command
over the network), and an operation delay, or the time until the microscope
responds. These delays in the system prevent the operator from knowing the
current status of the system.

• Incomplete feedback. Devices originally designed for direct control may not
have adequate sensors or status indicators. For instance, the microscope can
transmit its current position, but it operates so slowly that it does not indicate
the exact current position.

• Unanticipated interferences. Since the operated devices are remote,
unanticipated interferences are more likely to occur than with physically
present direct-manipulation environments. For instance, if a local operator
accidentally moves the slide under the microscope, the positions indicated
might not be correct. A breakdown might also occur during the execution of
a remote operation without a good indication of this event being sent to the
remote site.

One solution to these problems is to make explicit the network delays and
breakdowns as part of the system. The user sees a model of the starting state of
the system, the action that has been initiated, and the current state of the
system as it carries out the action. It may be preferable for users to specify a
destination (rather than a motion) and wait until the action is completed before
readjusting the destination if necessary. Avenues for continuous feedback also
are important.
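
The idea of making delays explicit can be sketched in Python. The function below
models what an operator display could show after a “go to destination” command has
been sent: the starting state, the requested destination, and an estimate of the
current state. The function name, the delay and speed values, and the phase labels
are all assumptions chosen for illustration. The time the device spends moving
stands in for the operation delay, and the position returned is only a model-based
estimate, not a measurement from the remote device.

```python
def estimate_status(start, destination, elapsed,
                    transmission_delay=0.5, speed=2.0):
    """Estimate the remote device state `elapsed` seconds after a command.

    Hypothetical units: positions are arbitrary, transmission_delay is in
    seconds, and speed is position units per second.
    """
    if elapsed < transmission_delay:
        # The command is still travelling over the network.
        return {"phase": "command in transit", "start": start,
                "destination": destination, "estimated_position": start}

    distance = abs(destination - start)
    travelled = min(distance, speed * (elapsed - transmission_delay))
    direction = 1.0 if destination >= start else -1.0
    phase = "completed" if travelled >= distance else "moving"
    return {"phase": phase, "start": start, "destination": destination,
            "estimated_position": start + direction * travelled}


# The operator asks the device to move from 0.0 to 10.0.
for t in (0.25, 3.0, 6.0):
    status = estimate_status(0.0, 10.0, t)
    print(t, status["phase"], status["estimated_position"])
# prints:
# 0.25 command in transit 0.0
# 3.0 moving 5.0
# 6.0 completed 10.0
```

Showing the phase alongside the estimate lets users see that a command has been
accepted even before any motion is reported, and the operator specifies only a
destination and waits for the completed phase before readjusting.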

Teleoperation is also commonly used by the military and by civilian space
projects. Military applications for unmanned aircraft gained visibility during
the recent wars in Afghanistan and Iraq. Reconnaissance drones and teleoperated
missile-firing aircraft were widely used. Agile and flexible mobile robots
exist for many hazardous duty situations (Murphy, 2014). Military missions and
harsh environments, such as undersea and space exploration, are strong drivers
for improved designs.

Telepresence was initially defined by Marvin Minsky (1980), but today the
operative term is presence. The concept was that of not being remote but giving
the feeling of “being there.” Advances are being made with telepresence,
and today’s technologies and the internet-connected world have opened up
additional possibilities. The commercial market is seeing a set of technologies
called mobile remote presence (MRP) systems (Fig. 7.11). These are advancing
video conferencing systems and allowing remote workers to have a feeling
of presence. These devices facilitate formal communications as well as
more informal chats in hallways. Some of the companies creating these
devices include Suitable Technologies Beam, Mantarobot, Doublerobotics,
and VGO. The controlling of these devices is another application of direct
manipulation. Another application that extends the idea of video conferencing,
made popular by Skype and other technologies, is a shared work space
called ImmerseBoard, where the users are co-located but can work on the
same screen (Fig. 7.12).

FIGURE 7.11
Three people having a conversation in a work environment; two are participating
using MRP devices.

FIGURE 7.12
ImmerseBoard allows two users to be co-located and work on the same shared
screen (Higuchi et al., 2015).

Robotics is a subfield of telepresence. Robots are being used in medical
settings, office settings, education, and other specialized applications. New
usage norms are being established for these types of devices and interactions
(Lee and Takayama, 2011). The remote coworkers are often referred to as
pilots. They can wander the hallways or “just hang out.” Frameworks are
being created with various design dimensions to better understand presence
(Rae et al., 2015). It is important to understand the perspective of the users and
especially that of the remote user. Doing various tasks with remote users in
this type of setup can increase cognitive load. The remote person needs to
concentrate on the task at hand as well as operating and positioning the device
properly (Rae et al., 2014). Kristoffersson et al. (2013) provide an in-depth
review of mobile robotic presence. Future work needs to be done on how
mobility affects remote collaboration and on better understanding the design
of mobility features. For individuals with limited mobility, robotics can facilitate
more active participation. A full discussion of robotics and HCI is beyond
the scope of this book.

7.6 Augmented and Virtual Reality

Flight-simulator designers work hard to create the most realistic experience for
fighter and airline pilots. The cockpit displays and controls are taken from the
same production line that creates the real ones. Then the windows are replaced
by high-resolution computer displays, and sounds are choreographed to give
the impression of engine start or reverse thrust. Finally, the vibration and tilting
during climbing or turning are created by hydraulic jacks and intricate suspension
systems. This elaborate technology may cost $100 million, but even so, it is
a lot cheaper, safer, and more useful for training than the $400-million jet that it
simulates. (And for training actual pilots, the reasonable flight simulators that
millions of home computer game players have purchased won’t quite do the
trick!) Flying a plane is a complicated and specialized skill, but simulators are
available for more common (and some surprising) tasks under the alluring
name of virtual reality or the more descriptive virtual environments.

The gurus of virtuality are promoting immersive experiences. The
miniaturization of electronics has provided less bulky gear to do that exploring.
As computer systems continue to run faster, the obstacles that were in the way
of immersive experiences are disappearing and the technology is becoming
more affordable. Head-mounted displays are available from various
manufacturers: Oculus Rift, Razer OSVR, HTC Vive, Sensics, Sony Glasses, and
Polhemus. Bulky gloves are being replaced by more-lightweight materials (Fig.
7.13) and less-cumbersome connections (Fig. 7.14). Companies are advancing
this technology very quickly. Magic Leap has just applied for a patent for a contact
lens to facilitate augmented or virtual reality (Kokalitcheva, 2015).

FIGURE 7.13
Image-guided surgery can be done with the surgeon’s hand attached to multiple
sensors that can mimic the hand and finger positions and create accurate control. In
the past, gloves were often used to attach the sensors and did not offer the flexibility
and accuracy of the directly attached sensors. (http://polhemus.com/micro-sensors)

The direct-manipulation principles outlined in Section 7.2.1 may be helpful to
people who are designing and refining virtual and augmented reality
environments. When users can select actions rapidly by pointing or gesturing
and display feedback occurs immediately, users have a strong sense of causality.
Interface objects and actions should be simple so that users view and manipulate
task-domain objects.

FIGURE 7.14
Oculus Rift head gear. This is an example of a virtual reality head-mounted display.

FIGURE 7.15
[Diagram: the Reality-Virtuality (RV) Continuum, running from Real Environment
through Augmented Reality and Augmented Virtuality to Virtual Environment, with
Mixed Reality marked as a span along the continuum.]
This figure shows the reality-virtuality continuum initially sketched by Milgram and
Kishino in 1994. It still holds true today. Mixed reality is the reality that has some
aspects of augmented reality within a virtual environment.

Graphics researchers have been perfecting image displays to simulate light­
ing effects, textured surfaces, reflections, and shadows. Data structures and
algorithms for zooming in or panning across an object rapidly and smoothly are
now practical on common computers and even some mobile devices. The
immersive environment has some problems, including simulator sickness, nau­
sea, and discomfort from wearing head-mounted gear and other equipment.
Some of these problems are minimized by less-jumpy graphic transitions. Better
understanding of the usability challenges, such as how much reality should be
incorporated and when and how it can improve the user experience, is needed
(McGill et al., 2015).

As our systems become more sophisticated, the distinction between different
levels of virtuality blurs. It is best portrayed as originally conceived by
Milgram and Kishino (1994): a continuum (Fig. 7.15). The last two sections of
this chapter discuss augmented reality (Section 7.6.1) and then virtual reality
(Section 7.6.2).

7.6.1 Augmented reality

Augmented reality enables users to see the real world with an overlay of
additional information; for example, while users are looking at the walls of a
building, their semitransparent eyeglasses may show the location of electrical
wires and studwork. Medical applications, such as allowing surgeons or their
assistants to look at a patient while they see an overlay of a sonogram or other
pertinent information to help locate a tumor, also seem compelling (Fig. 7.16).
Augmented reality could show users how to repair equipment or guide visitors
through cities (Fig. 7.17). Augmented reality strategies also enable users to
manipulate real-world artifacts to see results on graphical models (Poupyrev et
al., 2002; Ishii, 2008) with applications such as manipulating protein molecules to
understand the attractive/repulsive force fields between them. Using augmented
reality systems to enhance social pretend play by young children (ages 4-6)
promotes reasoning about emotional states as well as communication and
divergent thinking (Bai et al., 2015).

FIGURE 7.16
Virtual reality might be used to help surgeons or their assistants during surgery,
by showing pertinent information superimposed on a view of the real world.
(http://augmentarium.umiacs.umd.edu)

An interior designer walking through a house with a client should be able to
pick up a window-stretching tool or pull on a handle to try out a larger window
or to use a room-painting tool to change the wall colors while leaving the
windows and furniture untouched. Companies like IKEA are providing augmented
reality tools so customers can visualize the products via their catalog in their
own homes and rooms (Fig. 7.18).

7.6.2 Virtual reality
The presence aspect of virtual reality breaks the physical limitations of space
and allows users to act as though they are somewhere else. Practical thinkers
immediately grasp the connection to remote direct manipulation, remote
control, and remote vision, but the fantasists see the potential to escape current
reality and to visit science-fiction worlds, cartoonlands, previous times in
history, galaxies with different laws of physics, or unexplored emotional
territories.

FIGURE 7.17
Using augmented reality overlays, the HERE City Lens app shows various points of
interest on a mobile phone. Icons represent the types of places (food, shopping, etc.)
and distances from the current location. In addition, links are provided to user reviews.

FIGURE 7.18
Customers can use their personal mobile devices to pull up objects from the IKEA
catalog and see how the various items would look in their own house.

There have been many medical successes using virtual environments. For
example, virtual worlds can be used to treat patients with a fear of heights by
giving them an immersive experience with control over their viewpoint and
movement. The safe immersive environment enables phobia sufferers to
accommodate themselves to frightening stimuli in preparation for similar
experiences in the real world. Another dramatic result is that immersive
environments provide distractions for patients so that some forms of pain are
controlled (Fig. 7.19). The immersive virtual reality environment has been used to
treat military personnel suffering from PTSD (Fig. 7.20). Virtual worlds can be
used for positive computing (Calvo and Peters, 2014) and wellness issues
(Fig. 7.21).

FIGURE 7.19
A patient using UW HITLab/Harborview’s SnowWorld pain distraction at Shriners
Children’s Burn Center Galveston. UW designer/researcher Hunter Hoffman’s latest
version of SnowWorld was created for the UW by gifted worldbuilders at
www.firsthand.com using www.3ds.com Virtual World Development Software. The
immersive experience seems to lessen the painful experiences.

FIGURE 7.20
Soldiers can “re-live” portions of their combat experiences in a virtual reality setting
with full immersion and sounds. Some systems even provide full immersion to
include shaking and movement to make the experience as realistic as possible.
Working with trained therapists, the soldier can be slowly desensitized from the
traumatic experiences. (http://ict.usc.edu)

The opportunities for artistic expression and public-space installations are
being explored by performance artists, museum designers, and building
architects. Creative installations include projected images, 3-D sound, and
sculptural components, sometimes combined with video cameras and user
control by mobile devices. Other creative ideas include virtual dressing rooms
where users can try on clothes on a model of themselves. The possibilities are
truly endless.

Further information on virtual and augmented reality can be found in the
wide assortment of textbooks available (Fuchs et al., 2011; Boellstorff et al., 2012;
Kipper and Rampolla, 2012; Craig, 2013; Hale and Stanney, 2014; Barfield, 2015;
Jerald, 2016). Billinghurst et al. (2014) recently compiled a comprehensive
survey of augmented reality that gives both a history of the field and details about
the technologies and tools, including future research directions. The field is
changing rapidly, and although avatars and virtual worlds still exist and are
being explored (Blascovich and Bailenson, 2011), other virtual worlds like
Second Life have almost disappeared (Boellstorff, 2008).

FIGURE 7.21
Image of a virtual meditative world for engaging in meditation activities. The virtual
world has sounds that change with each chakra (stage) of the meditation process.
This is an application of positive computing. (http://nsuworks.nova.edu/gscis_etd/65/)

Practitioner’s Summary

Among interactive systems that provide equivalent functionality and
reliability, some systems have emerged to dominate the competition. Often, the most
appealing systems have an enjoyable user interface with customized
user-generated content that offers a natural representation of the task objects and
actions, hence the term direct manipulation (Box 7.2). These interfaces are easy to
learn, to use, and to retain over time. Novices can acquire a simple subset of the
actions and then progress to more elaborate ones. Actions are rapid,
incremental, and reversible, and they can be performed with physical movements instead
of complex syntactic forms. The results of actions are visible immediately, and
error messages are needed less often.

Using direct-manipulation principles in an interface does not ensure its
success. A poor design, slow implementation, or inadequate functionality can
undermine acceptance. For some applications, other approaches may be more
appropriate. However, great potential exists for multiple and varied
applications of direct-manipulation concepts. Compelling demonstrations of virtual
and augmented reality are being applied in a growing set of applications with
enhanced social interactions. Iterative design (Chapter 4) is especially important
in testing advanced direct-manipulation systems because the novelty of these
approaches may lead to unexpected problems for designers and users.

Researcher’s Agenda

Research needs to refine our understanding of the contributions of each feature
of direct manipulation: analogical representation, incremental action,
reversibility, physical action instead of syntax, immediate visibility of results,
characteristics such as translational distances, and graphic displays. Reversibility is easily
accomplished by a generic undo action, but designing natural inverses for each
action may be more attractive. Complex actions are well-represented with direct
manipulation, but multi-layer design strategies for graceful evolution from
novice to expert usage could be a major contribution. For expert users,
direct-manipulation programming is still an opportunity, but good methods of history
keeping and editing of action sequences are needed as well as increased
attention to user-generated content. Better understanding of touchable interfaces and
their uses as well as research on two-handed versus one-handed operations are
needed. The allure of 3-D interaction is great, but researchers need to provide a
better understanding of how and when (and when not) to use features such as
occlusion, reduced navigation, and enhanced 3-D actions such as teleportation
or x-ray vision and what are the best widths for field of view. Providing
better semantic understanding of 3-D images can provide information for visually
impaired users to better understand their environment. The impact of
immersion on gaming and virtual worlds using rich social-media interactions across
various ages and activities needs to be understood better.
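
The contrast between a generic undo action and natural inverses, mentioned in the paragraph above, can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class names (SnapshotHistory, InverseHistory, InvertibleAction), the move_right helper, and the dictionary-based state are all hypothetical and do not come from the chapter. The first class restores a saved copy of the whole state; the second pairs each action with a hand-designed inverse.

```python
import copy
from dataclasses import dataclass
from typing import Callable, List, Optional


class SnapshotHistory:
    """Generic undo: save a full copy of the state before every action."""

    def __init__(self, state: dict) -> None:
        self.state = state
        self._snapshots: List[dict] = []

    def perform(self, action: Callable[[dict], None]) -> None:
        self._snapshots.append(copy.deepcopy(self.state))
        action(self.state)

    def undo_last(self) -> None:
        if self._snapshots:
            self.state = self._snapshots.pop()


@dataclass
class InvertibleAction:
    """An action paired with a hand-designed inverse (hypothetical type)."""
    name: str
    do: Callable[[dict], None]
    inverse: Callable[[dict], None]


def move_right(distance: int) -> InvertibleAction:
    """Moving right by n is naturally inverted by moving left by n."""
    def do(state: dict) -> None:
        state["x"] += distance

    def inverse(state: dict) -> None:
        state["x"] -= distance

    return InvertibleAction(f"move right {distance}", do, inverse)


class InverseHistory:
    """Undo by applying each action's inverse, most recent first."""

    def __init__(self, state: dict) -> None:
        self.state = state
        self._done: List[InvertibleAction] = []

    def perform(self, action: InvertibleAction) -> None:
        action.do(self.state)
        self._done.append(action)

    def undo_last(self) -> Optional[str]:
        if not self._done:
            return None
        action = self._done.pop()
        action.inverse(self.state)
        return action.name
```

In this sketch the snapshot approach needs no per-action design work but can only report that something was undone, whereas the inverse approach can name the action it reversed, which is one reason natural inverses may feel more attractive to users.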

Beyond the desktops and laptops, there is the allure of presence, virtual
environments, augmented realities, and context-aware devices. Research is
needed into how presence affects behaviors and interactions including privacy
issues. The playful and enjoyable aspects will certainly be pursued, but the real
challenge is to find the practical designs and a better understanding of “being
there” when looking at 3-D worlds, both as individuals and as collaborators and
players in the enriched social-media environments. A new set of tools is needed
to investigate and better understand digital games research and its implications,
both good and bad.

WORLD WIDE WEB RESOURCES

www.pearsonglobaleditions.com/shneiderman

Other Resources

Journals

• Presence (teleoperators and virtual environments):
http://www.mitpressjournals.org/loi/pres
• Virtual Reality (Springer):
http://www.springer.com/computer/image+processing/journal/10055
• International Journal of Virtual Reality: http://www.ijvr.org/
• International Journal of Virtual Technology and Multimedia:
http://www.inderscience.com/jhome.php?jcode=ijvtm

Conferences

• VRST ACM Symposium on Virtual Reality Software and Technology:
http://vrlab.buaa.edu.cn/vrst2015/
• IEEE Virtual Reality: http://ieeevr.org/2016/
• IEEE Symposium of Mixed and Augmented Reality, IEEE and ACM Symposium on
Augmented Reality: http://ismar.vgtc.org

Additional information on this topic can be found in multimedia journals and
conferences as well as journals and conferences that emphasize visualization.

Discussion Questions

1. Describe three principles of direct manipulation.

2. Give four benefits of direct manipulation. Also list four problems of direct
manipulation.

3. Explain the differences between various kinds of direct manipulation with
respect to translational distances.

4. An airline company is designing a new online reservation system. They want
to add some direct-manipulation features. For example, they would like
customers to click a map to specify the departure cities and the destinations, and
to click on the calendar to indicate their schedules. From your point of view,
list four benefits and four problems of the new idea compared with their old
system, which required the customer to do the job by typing text.

5. Explain how virtual reality can be used for medical purposes.

6. List an example of teleoperation or virtual reality. Consider what a future
application (that does not presently exist) might do. Be creative!

References

Bai, Zhen, Blackwell, Alan F., and Coulouris, George, Exploring expressive augmented
reality: The fingAR puppet system for social pretend play, Proceedings of the ACM
Conference on Human Factors in Computing Systems, ACM Press, New York (2015),
1035-1044.

Barfield, Woodrow (Editor), Fundamentals of Wearable Computers and Augmented Reality,
2nd Edition, CRC Press (2015).

Billinghurst, Mark, Clark, Adrian, and Lee, Gun, A survey of augmented reality,
Foundations and Trends in Human-Computer Interaction 8, 2-3 (2014), 73-272.

Blascovich, Jim, and Bailenson, Jeremy, Infinite Reality: Avatars, Eternal Life, New Worlds,
and the Dawn of the Virtual Revolution, William Morrow (2011).

Boellstorff, Tom, Coming of Age in Second Life: An Anthropologist Explores the Virtually
Human, Princeton University Press (2008).

Boellstorff, Tom, Nardi, Bonnie, Pearce, Celia, and Taylor, T. L., Ethnography and Virtual
Worlds: A Handbook of Method, Princeton University Press (2012).

Calvo, Rafael, and Peters, Dorian, Positive Computing: Technology for Wellbeing and
Human Potential, MIT Press (2014).

Cockburn, Andy, and McKenzie, Bruce, Evaluating the effectiveness of spatial memory
in 2D and 3D physical and virtual environments, Proceedings of the ACM Conference
on Human Factors in Computing Systems, ACM Press, New York (2002), 203-210.

Craig, Alan B., Understanding Augmented Reality: Concepts and Applications, Morgan
Kaufmann (2013).

Csikszentmihalyi, Mihaly, Flow: The Psychology of Optimal Experience, Harper & Row
(1990).

de Castell, Suzanne, Taylor, Nicholas, Jenson, Jennifer, and Weiler, Mark, Theoretical
and methodological challenges (and opportunities) in virtual worlds research, The
International Conference on the Foundations of Digital Games 12 (2012), 134-140.

Few, Stephen, Information Dashboard Design, 2nd Edition, Analytics Press (2013).

Fisher, Kristie, Nichols, Tim, Isbister, Katherine, and Fuller, Tom, Quantifying “magic”:
Learnings from user research for creating good player experiences on Xbox Kinect,
International Journal of Gaming and Computer-Mediated Simulations 6, 1 (January-March
2014), 26-40.

Fuchs, Phillippe, Moreau, Guillaume, and Guitton, Pascal, Virtual Reality Concepts and
Technologies, CRC Press (2011).

Hale, Kelly S., and Stanney, Kay M. (Editors), Handbook of Virtual Environments: Design,
Implementation, and Applications, 2nd Edition, CRC Press (2014).

Han, Jefferson Y., Low-cost multi-touch sensing through frustrated total internal
reflection, Proceedings UIST ’05 Conference, ACM Press, New York (2005), 115-118.

Hicks, Martin, O’Malley, Claire, Nichols, Sarah, and Anderson, Ben, Comparison of
2D and 3D representations for visualising telecommunication usage, Behaviour &
Information Technology 22, 3 (2003), 185-201.

Higuchi, Keita, Chen, Yinpeng, Chou, Philip A., Zhang, Zhengyou, and Liu, Zicheng,
ImmerseBoard: Immersive telepresent experience using a digital whiteboard,
Proceedings of the ACM Conference on Human Factors in Computing Systems, ACM
Press, New York (2015), 2383-2392.

Ishii, Hiroshi, Tangible user interfaces, in Sears, Andrew, and Jacko, Julie (Editors), The
Human-Computer Interaction Handbook, 2nd Edition, Lawrence Erlbaum Associates,
Hillsdale, NJ (2008), 469-487.

Jerald, Jason, The VR Book: Human-Centered Design for Virtual Reality, Morgan & Claypool
(2016).

Johnson, Daniel, Nacke, Lennart E., and Wyeth, Peter, All about that base: Differing
player experiences in video game genres and the unique case of MOBA games,
Proceedings of the ACM Conference on Human Factors in Computing Systems, ACM
Press, New York (2015), 2265-2274.

Jones, Christian, Scholes, Laura, Johnson, Daniel, Katsikitis, Mary, and Carras, Michelle
C., Gaming well: Links between videogames and flourishing mental health, Frontiers
in Psychology 5 (March 2014), Article 260.

Kipper, Greg, and Rampolla, Joseph, Augmented Reality: An Emerging Technologies Guide
to AR, Syngress (2012).

Kofman, Ava, Dueling realities, The Atlantic (June 9, 2015).

Kokalitcheva, Kia, Magic Leap files for a big pile of patents, including for a sci-fi contact
lens, Fortune (September 1, 2015).

Kristoffersson, Annica, Coradeschi, Silvia, and Loutfi, Amy, A review of mobile
robotic telepresence, Advances in Human-Computer Interaction (2013), Article
902316.

Kulshreshth, Arun, and LaViola, Joseph J. Jr., Exploring 3D user interface technologies
for improving the gaming experience, Proceedings of the ACM Conference on Human
Factors in Computing Systems, ACM Press, New York (2015), 125-134.

Kushner, David, Is it live, or is it VR? Virtual reality’s moment, IEEE Spectrum (January
2014), 34-37.

Laha, B., Sensharma, K., Schiffbauer, J. D., and Bowman, D. A., Effects of immersion
on visual analysis of volume data, IEEE Transactions on Visualization and Computer
Graphics 18, 4 (April 2012).

Lee, Min Kyung, and Takayama, Leila, “Now, I have a body”: Uses and social norms
for mobile remote presence in the workplace, Proceedings of the ACM Conference on
Human Factors in Computing Systems, ACM Press, New York (2011), 33-42.

Lecky-Thompson, Guy W., Video Game Design Revealed, Charles River Media, Boston,
MA (2008).

Lucero, Andres, Karapanos, Evangelos, Arrasvuori, Juha, and Korhonen, Hannu,
Playful or gameful? Creating delightful user experiences, ACM Interactions,
May-June (2014), 34-39.

McGill, Mark, Boland, Daniel, and Murray-Smith, Roderick, A dose of reality:
Overcoming usability challenges in VR head-mounted displays, Proceedings of the
ACM Conference on Human Factors in Computing Systems, ACM Press, New York
(2015), 2143-2152.

McGonigal, Jane, Reality Broken: Why Games Make Us Better and How They Can Change the
World, Penguin Press (2011).

Metz, Rachel, What’s it like to try Magic Leap’s take on virtual reality? MIT Technology
Review (March/April 2015).

Milgram, P., and Kishino, F., A taxonomy of mixed reality visual displays, IEICE
Transactions on Information and Systems (Special Issue on Networked Reality) E77-D, 12 (1994),
1321-1329.

Mims, Christopher, Virtual reality isn’t just about games, The Wall Street Journal (August
2, 2015).

Minsky, Marvin, Telepresence, OMNI Magazine (June 1980).

Murphy, Robin R., Disaster Robotics, MIT Press (2014).

Ossola, Alexandra, Could analyzing how humans think make better video games?
Popular Science (February 17, 2015).

Pagulayan, Randy J., Keeker, Kevin, Fuller, Thomas, Wixon, Dennis, Romero, Ramon,
and Gunn, Daniel V., User-centered design in games, in Jacko, Julie (Editor), The
Human-Computer Interaction Handbook, 3rd Edition, Taylor and Francis/CRC Press
(2012), Chapter 34, 795-824.

Parkin, Simon, An Illustrated History of 151 Video Games, Lorenz Books (2014).

Poupyrev, Ivan, Tan, Desney S., Billinghurst, Mark, Kato, Hirokazu, Regenbrecht,
Holger, and Tetsutani, Nobuji, Developing a generic augmented-reality interface, IEEE
Computer 35, 3 (March 2002), 44-50.

Rae, Irene, Mutlu, Bilge, and Takayama, Leila, Bodies in motion: Mobility, presence, and
task awareness in telepresence, Proceedings of the ACM Conference on Human Factors in
Computing Systems, ACM Press, New York (2014), 2153-2162.

Rae, Irene, Venolia, Gina, Tang, John C., and Molnar, David, A framework for
understanding and designing telepresence, Proceedings Computer Supported Cooperative
Work and Social Computing (CSCW) (2015), 1552-1566.

Shneiderman, Ben, Direct manipulation: A step beyond programming languages, IEEE
Computer 16, 8 (August 1983), 57-69.

Sonnenwald, Diane H., Soderholm, Hanna Maurin, Welch, Gregory F., Cairns, Bruce A.,
and Fuchs, Henry, Illuminating collaboration in emergency health care situations:
Paramedic-physician collaboration and 3D telepresence technology, Information
Research 19, 2 (June 2014).

Stein, Joel, Inside the box, the surprising joy of virtual reality, Time (August 17, 2015).

Thomas, Bruce H., A survey of visual, mixed, and augmented reality gaming, ACM
Computers in Entertainment 10, 3 (November 2012), Article 3.

Weiser, M., The computer for the 21st century, Scientific American 265, 3 (1991), 94-104.

Wellner, P., Mackay, W., and Gold, R., Computer augmented environments: Back to the
real world, Communications of the ACM 36, 7 (July 1993), 24-27.

CHAPTER 8

Fluid Navigation

“A man is responsible for his choice and must accept the
consequences, whatever they may be.”

W. H. Auden
A Certain World, 1970

CHAPTER OUTLINE
8.1 Introduction

8.2 Navigation by Selection

8.3 Small Displays

8.4 Content Organization

8.5 Audio Menus

8.6 Form Fill-in and Dialog Boxes

8.1 Introduction

This chapter addresses design issues related to navigation, which can be defined
as enabling users to know where they are and to steer themselves to their
intended destination. In short, navigation is about getting work done or having
fun through a series of actions, much like sailors who steer their boat to a harbor.
Navigation is key to successfully operating interactive applications, such as
installing a mobile app, filling in a survey, or purchasing a train ticket (task
navigation). It is also the key to finding information on a website or browsing
social media (web navigation) or to finding the action needed in a desktop
application (command menu navigation).

Navigation harnesses users’ ability to rapidly skim choices, recognize what is
relevant, and select what they need to realize their intentions. The goal for
designers is to enable fluid navigation that allows users to gracefully and
confidently get to where they want to go, explore novel possible routes, and
backtrack when necessary. Navigation depends on recognition of landmarks that
travelers use to guide their choices, which differs greatly from search, which
requires users to describe what they want by typing keywords in a blank search
box (see Chapter 15).

While the search box is the main technique to initiate the process of finding
information in vast information spaces (like the internet or digital libraries),
navigation techniques such as small or large menus, embedded links, or tool
palettes are the workhorses of navigation. Users indicate their choices with a
touch, tap, or swipe of the finger or by using a pointing device (see Chapters 7
and 10) and get immediate feedback indicating what they have done. Navigation
by selection is an interaction style that is especially effective for users who are
novice or first-time users, are knowledgeable intermittent users, or need help in
structuring their decision-making processes. However, with careful design of
complex menus and rapid interaction, menu selection can be appealing even to
expert frequent users. These strategies can be used in combination with
command languages (see Section 9.5), allowing users to transition smoothly from
novice to expert because menus offer cues to elicit recognition rather than
forcing users to recall the syntax of a command from memory. Careful design,
keyboard shortcuts, and gestures allow expert users to navigate quickly through
large information structures.

A loose definition of menus is used here as a representation of available choices,
which can describe the rich array of techniques designers use to present choices
and guide users as they select what they want. Arrays of check boxes or form fill-in
can be seen as primarily data-entry techniques, but those techniques contribute to
the experience of steering an application or website navigation (e.g., to complete a
survey, sign up for a service, or make a purchase), so they are discussed in this
chapter as well. Similarly, dialog boxes contribute to allowing users to express their
choices, so dialog box design is described at the end of the chapter.

Very early studies demonstrated the importance of organizing menus in a
meaningful structure, resulting in faster selection time and higher user
satisfaction (see Section 8.4). Navigation may follow a linear sequence (e.g., in a wizard
or survey), a hierarchical structure that is natural and comprehensible (e.g., an
ebook split into chapters, a store into departments, or the animal kingdom into
species), or a network structure when choices may be reachable by more than
one path (e.g., websites).
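
The difference between a hierarchical and a network structure can be illustrated with a small sketch. The Python below is a hedged example, not something taken from the chapter: the MENU data, the item names, and the functions paths_to and is_hierarchy are all hypothetical. It stores each choice with the choices reachable from it, lists every selection path to a target, and checks whether any choice has more than one parent. A linear sequence would be the special case in which every choice leads to exactly one next choice.

```python
from typing import Dict, List

# Hypothetical menu data: each key is a choice, each value lists the choices
# reachable from it. In a hierarchy every choice has exactly one parent;
# "Sale" below is listed under two parents, which makes this a network.
MENU: Dict[str, List[str]] = {
    "Home": ["Books", "Music", "Sale"],
    "Books": ["Fiction", "Travel"],
    "Music": ["Jazz", "Sale"],
    "Fiction": [],
    "Travel": [],
    "Jazz": [],
    "Sale": [],
}


def paths_to(menu: Dict[str, List[str]], start: str, target: str) -> List[List[str]]:
    """Return every selection path from start to target (assumes no cycles)."""
    if start == target:
        return [[start]]
    found: List[List[str]] = []
    for child in menu.get(start, []):
        for tail in paths_to(menu, child, target):
            found.append([start] + tail)
    return found


def is_hierarchy(menu: Dict[str, List[str]]) -> bool:
    """True when no choice is listed under more than one parent."""
    seen = set()
    for children in menu.values():
        for child in children:
            if child in seen:
                return False
            seen.add(child)
    return True
```

With this data, "Travel" is reachable by a single path through "Books", while "Sale" is reachable both directly from "Home" and through "Music", so is_hierarchy(MENU) reports that the structure is a network rather than a strict hierarchy.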

By harnessing the latest versions of HTML or CSS, even webpages and mobile
applications now include smooth animations and sleek graphic design that turn
basic menus into custom widgets that help define the entire look and feel of a
website or application. When links and menus or choices and commands are
designed using familiar terminology or recognizable visual elements and are
organized in a meaningful structure and sequence, users can navigate complex
information structures easily with a few mouse clicks or taps of the finger or
smoothly scroll through sleek presentations of the possible next steps to
accomplish their tasks. Carefully selected gestures can add a sense of delight and
fluidness to the navigation on touchscreen devices.

Of course, the use of slick graphical menus, elegant form fill-in, or well-known
gestures does not guarantee that the interface will be appealing and easy to
use. Effective interfaces emerge only after careful consideration of and
testing for numerous design issues, such as task-related organization,
phrasing of items, sequence of items, graphic layout and design, responsive
design to adapt to various sizes of devices, shortcuts for knowledgeable
frequent users, online help, and error correction (Bailly et al., 2015).

This chapter starts by reviewing the rich array of available techniques for
allowing users to specify their choices, from single techniques to the
combinations of multiple techniques (Section 8.2). Section 8.3 discusses issues related to
small displays. Content organization is discussed in Section 8.4. Finally, Section
8.5 discusses the needs of audio menus, and form fill-in and dialog boxes are
covered in Section 8.6.

See also:

• Chapter 10, Devices

• Chapter 12, Advancing the User Experience

• Chapter 14, Documentation and User Support (a.k.a. Help)

• Chapter 16, Data Visualization

8.2 Navigation by Selection

Choices can be presented explicitly, in that there is an orderly enumeration of
the items with little extraneous information, or they can be embedded in text or
graphics and still be selectable. Embedded links of webpages were first
popularized in the Hyperties system (Koved and Shneiderman, 1986), which was used
for early commercial hypertext projects and became the inspiration for the
hotlinks of the World Wide Web. Highlighted names, places, or phrases became
menu items embedded in text that informs users and helps to clarify the meaning
of the menu items. Graphical techniques are a particularly attractive way to
present choices while providing context to help users specify what they want.
For example, maps can orient users about the geography of the area before users
select an item of interest, and calendars or timelines can inform users of
availability and constraints before a date or time is selected (e.g., see HIPMUNK
in Fig. 1.7). Interactive visualization of information can also help analysts
navigate large amounts of data in a fluid visual manner (Elmqvist et al., 2011 and
Chapter 16).

The simplest case of explicit menus is a binary menu for yes/no, true/false
choices (Fig. 8.1).

Another example of a simple menu is the grid menu popularized by mobile
devices, with a small set of icons and labels (Fig. 8.2).

When users need to make a series of choices (e.g., in a survey or to select
parameters of an application), there are well-established methods of presenting
choices.

Radio buttons support single-item selection from a multiple-item menu
(Fig. 8.3), while check boxes allow the selection of one or more items in a menu. A
multiple-selection menu is a convenient method for handling multiple binary
FIGURE 8.1
[Menu shown: “For an extra $5 you can add a gift wrap selected from dozens of
choices,” with two buttons, “Add gift wrap” and “No thanks.”]
A simple menu with two choices. A short explanation is provided. Buttons are large
enough to be easy to select and have informative labels, and one answer has been
highlighted as the most likely answer.

[Screenshot: a NatureNet mobile app “Activities” list menu with the items Ask a
Naturalist, Tracks, Native or Not?, Free Observation, Snow Study, Red Mountain,
How Many Mallards?, Heron Spotting, and Who's Who?]

On the top menu bar of Microsoft PowerPoint, the Edit cascading pull-down menu
(also called pulled-right) is open, followed by the Find menu. The menus allow users
to explore the functions of the application. To facilitate discovery and learning, icons
and keyboard shortcuts are indicated on the right of the menu items (for example,
⌘C for Copy or ⌘F for Find). A small black triangle indicates that selection of the
menu item will lead to a submenu. Three dots (…) indicate that the selection will
lead to a dialog box. Partially hidden behind the Edit menu, the application ribbon is
visible, revealing the large number of choices available in the selected tab (Format).

website lists all the Cycle subcategories in a large menu that expands to the
right, filling most of the screen; see Fig. 8.6).

The limited screen space of mobile devices leads designers to strive to limit
the number of menu items. To leave more room for content, most or all menu
items can be moved into a separate screen that is accessible from a main menu
icon, sometimes called the hamburger menu icon (☰) for its shape, which can
be placed on every screen (Fig. 8.7).

Toolbars, iconic menus, and palettes can offer many actions that users can select
with a click and apply to a displayed object (Fig. 1.10). A large number of
toolbars can be overwhelming, so users need to be able to customize which toolbars

[FIGURE 8.6: screenshot, not reproduced here.]

[FIGURE 8.8: screenshot, not reproduced here.]

[Screenshot: the craigslist home page, a dense text menu with category headings
such as personals, discussion forums, housing, for sale, services, jobs, and gigs,
each followed by its subcategories, plus a column of nearby cities.]
The craigslist home page is a text-only, 2-dimensional mega menu. It allows users
to rapidly read hundreds of choices with little or no scrolling required. Items are
organized hierarchically. (http://www.craigslist.org/)


engines. Similarly, a site map lists every single page of a website and is useful as
a table of contents.

With such compact text-oriented designs, as well as with all other more
graphic-oriented designs, accessibility issues need to be addressed (Fig. 2.1).

Users browsing user-generated content such as photo or document collections
also need to choose among non-curated lists of terms or tags attached to
items in the collections. Tag clouds were fashionable until recently as compact
2-dimensional text menus. In tag clouds, the larger the font size of the tag, the
more items are available. While attractive and fun, tag clouds are often
misinterpreted because longer tags have more prominence than short ones and users
believe that the position in the tag cloud has meaning even when it does not. To
address this problem, tag indexes are now gaining popularity with tags sorted
by number of items so users make no mistakes when looking for the tags that
have the most items (Fig. 8.12). A horizontal layout may be convenient when the
list is long, but arranging the tags vertically will facilitate scanning of the list.
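The tag-index idea can be sketched in a few lines. The following is only an illustration under stated assumptions: the function names (`tag_index`, `render_vertical`) and the sample counts are hypothetical, and ties are broken alphabetically, a choice the text does not prescribe.

```python
from typing import Dict, List, Tuple


def tag_index(counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (tag, count) pairs sorted by descending count; ties are alphabetical."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0].lower()))


def render_vertical(index: List[Tuple[str, int]]) -> str:
    """One tag per line with its count in parentheses, for top-to-bottom scanning."""
    return "\n".join(f"{tag} ({count})" for tag, count in index)


# Hypothetical tag counts, for illustration only.
sample = {"video": 264, "typography": 291, "clean": 592, "retro": 46}
print(render_vertical(tag_index(sample)))
```

Because the order depends only on the counts, the position of a tag in the list carries exactly the meaning users tend to assume, which a tag cloud cannot promise.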

8.2.4 Linear versus simultaneous presentation

Often, a sequence of interdependent menus can be used to guide users through
a series of choices. For example, a pizza-ordering interface might include a
linear sequence of menus in which users choose the size (small, medium, or large),
thickness (thick, normal, or thin crust), and finally toppings. Other familiar
examples are online examinations that have sequences of multiple-choice test
items, each made up as a menu, or wizards (a Microsoft term) that steer users
through software installation by presenting a sequence of menu options. Linear
sequences guide users by presenting one decision at a time and are effective for
FIGURE 8.12
Awwwards.com gives awards to a large number of websites, which are tagged.
A tag index at the top of the page displays all the tags sorted by total count. The
counts are indicated in parentheses. The green-colored tags are the popular tags that
have been selected more often (which most likely will lead to even more selection).

FIGURE 8.13
REI search results for “tent” (541 matches), with facet menus for Categories,
Sleeping Capacity, Brand, and Seasons.

• Make sure that items are non-overlapping: e.g., use “Concerts” and
“Sports” over “Entertainment” and “Events”.

• Arrange items in each branch by natural sequence (not alphabetically) or
group related items.

• Keep ordering of items fixed (or possibly duplicate frequent items in
dedicated sections of the menu).

collection of 10,000 destinations. That number would be excessively large for
a word processor but is realistic in a newspaper, a library, or an enterprise
web portal.

If the groupings at each level are natural and comprehensible to users and if
users know the target, menu traversal can be accomplished in a few seconds; it
is faster than flipping through a book. On the other hand, if the groupings are
unfamiliar and users have only vague notions of the items that they’re seeking,
they may get lost for hours in the tree menus. Terminology from the user’s task
domain can help orient the user: Instead of using a title that is vague and
emphasizes the computer domain, such as “Main Menu Options”, use terms such as
“Friendlibank Services” or simply “Games”.

Menus using large indexes, such as library subject headings or comprehensive
business classifications, are challenging to navigate, making search a
valuable alternative (Chapter 15).

The depth, or number of levels, of a menu tree depends in part on the breadth,
or number of items per level. If more items are put into the main menu, the tree
spreads out and has fewer levels. This shape may be advantageous, but only if
clarity is preserved. Several authors urge using four to eight items per menu,
but at the same time, they urge using no more than three to four levels. With
large menu applications, one or both of these guidelines must be compromised.
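A little arithmetic shows why the two guidelines collide. The sketch below assumes an idealized, fully balanced tree with the same number of items in every menu, which real menu trees rarely are; the function name `levels_needed` is hypothetical.

```python
def levels_needed(items: int, breadth: int) -> int:
    """Smallest depth d such that breadth**d >= items, for a balanced tree."""
    if items < 1 or breadth < 2:
        raise ValueError("need items >= 1 and breadth >= 2")
    depth, capacity = 0, 1
    while capacity < items:
        capacity *= breadth  # each extra level multiplies the reachable items
        depth += 1
    return depth


for breadth in (4, 8, 10, 30):
    print(breadth, "items per menu:", levels_needed(10_000, breadth), "levels")
```

For 10,000 destinations, four items per menu require seven levels and eight items require five, so staying within four to eight items per menu pushes the depth past the recommended three to four levels. Ten items per menu reach the target in four levels, and thirty in three.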

Many empirical studies have dealt with the depth/breadth tradeoff
(Cockburn and Gutwin, 2008), and the evidence is strong that breadth should be
preferred over depth as long as users can anticipate target location at each level.
The navigation problem (getting lost or using an inefficient path) becomes more
and more treacherous as the depth of the hierarchy increases. Of course, screen
clutter must be considered in addition to the semantic organization. Given
sufficient screen space, it is possible to show a large portion of the menu
structure and to allow users to rapidly point in the flattened tree structure
(Figs. 8.6 and 8.11).

Although tree structures are appealing, sometimes network structures are
more appropriate. For example, in online shopping, it might make sense to
provide access to banking information from both the personal profile and the
checkout section of a link structure. A second motivation for using menu networks
is that it may be desirable to permit paths between disparate sections of a tree
rather than requiring users to begin a new traversal from the main menu. It is
helpful to provide site maps and to preserve the notion of levels, as users may feel
more comfortable if they have a sense of how far they are from the main menu.
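One way to preserve the notion of levels in a menu network is to define a page's level as the fewest selections needed to reach it from the main menu. The sketch below computes that with a breadth-first search; the link structure and page names are hypothetical and stand in for the shopping example above.

```python
from collections import deque
from typing import Dict, List

# Hypothetical link structure: "banking" is reachable from two sections.
LINKS: Dict[str, List[str]] = {
    "main": ["profile", "catalog", "checkout"],
    "profile": ["banking", "addresses"],
    "checkout": ["banking", "shipping"],
    "catalog": [],
    "banking": [],
    "addresses": [],
    "shipping": [],
}


def levels_from(start: str, links: Dict[str, List[str]]) -> Dict[str, int]:
    """Breadth-first search: fewest selections needed to reach each page."""
    level = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in level:  # first visit is the shortest path
                level[target] = level[page] + 1
                queue.append(target)
    return level


print(levels_from("main", LINKS)["banking"])  # 2, by either path
```

The same table of levels could drive a site map or a "distance from main menu" indicator, so the extra links do not cost users their sense of position.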

8.4.2 Sequence, phrasing, and layout

Sequence Once the items in a menu have been chosen, the designer is still
confronted with the choice of presentation sequence. If the items have a natural
sequence, such as days of the week, chapters in a book, or sizes of eggs, the
decision is trivial. Many cases have no task-related ordering, though, so the
designer must choose among alphabetic order, grouping of related items, and
placing the most frequently used items first. Categorical organization is generally
preferable over alphabetical. Using frequency of use does speed up selection of the
topmost items, but the loss of a meaningful ordering for low-frequency items
may be disruptive, so it is best limited to small lists. Varying the sequence
adaptively to reflect the current pattern of use has been shown to be disruptive,
increasing confusion and selection time. In addition, users may become anxious
that other changes might occur at any moment, undermining the users’ learning
of menu structures. To avoid disruption and unpredictable behavior, it is wise
to allow users to specify if and when they want the menu restructured. A
sensible compromise is to extract three or four of the most frequently selected items
and put them near the top while preserving the order of the remaining items.
This split-menu strategy proved appealing, statistically significantly improved
performance, and has been adopted by commercial software (Fig. 8.15).
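The split-menu compromise can be sketched as follows. This is a minimal illustration, not the algorithm of any particular product: the function name `split_menu`, the `"---"` separator, and the tie-breaking rule are assumptions, and this variant removes the promoted items from the lower section, whereas some designs also repeat them in the full list.

```python
from collections import Counter
from typing import List, Sequence


def split_menu(items: Sequence[str], selections: Sequence[str], top_n: int = 3) -> List[str]:
    """Put the top_n most-selected items first, then a separator, then the rest
    in their original order."""
    counts = Counter(s for s in selections if s in items)
    # Rank by frequency; break ties by original position so the result is stable.
    ranked = sorted(counts, key=lambda item: (-counts[item], items.index(item)))
    top = ranked[:top_n]
    rest = [item for item in items if item not in top]
    return top + ["---"] + rest if top else list(items)


fonts = ["Arial", "Calibri", "Courier", "Georgia", "Times", "Verdana"]
history = ["Times", "Georgia", "Times", "Arial", "Georgia", "Times"]
print(split_menu(fonts, history, top_n=2))
```

Only the small top section changes with use; the lower section keeps its fixed order, which is what protects the user's learning of the menu.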

FIGURE 8.15
Example of adaptive split menus in Microsoft Office. A font-selection menu lists
the theme fonts and then the recently used fonts near the top of the menu (as well
as in the full list), making it easier to quickly select the popular fonts. A thin
line separates the sections.

Adaptable menus (i.e., menus that give users control over the sequence of
menu items) are an attractive alternative to adaptive menus that adapt
automatically. One study compared the Microsoft Word version using adaptive
menus with a variant providing users with the ability to switch between two modes
of operation: the normal full-featured mode and a personal mode that users
could customize by selecting which items were included in the menus (McGrenere et
al., 2007). Results showed that participants were better able to learn and navigate
through the menus with the personally adaptable version. Preferences varied
greatly among users, and the study revealed some users’ overall dissatisfaction
with adaptive menus but also the reluctance of others to spend significant
time customizing the interface. Novel approaches have used ephemeral adaptation
(Findlater et al., 2009) to help users quickly identify important commands.
With this technique, a small subset of menu items is immediately shown when
the menu is displayed, while the remaining items are gradually faded into view
over a few hundred milliseconds.
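The timing behind ephemeral adaptation can be illustrated with a simple opacity schedule. The sketch below is an assumption-laden toy: the linear ramp, the 500 ms default onset, and the function names are all hypothetical and are not taken from the cited study.

```python
from typing import Dict, Iterable, Set


def item_opacity(elapsed_ms: float, predicted: bool, onset_ms: float = 500.0) -> float:
    """Opacity in [0, 1]: predicted items appear at once, others fade in linearly."""
    if predicted:
        return 1.0
    return max(0.0, min(1.0, elapsed_ms / onset_ms))


def menu_opacities(items: Iterable[str], predicted: Set[str], elapsed_ms: float) -> Dict[str, float]:
    """Opacity of every menu item at a given time after the menu opens."""
    return {item: item_opacity(elapsed_ms, item in predicted) for item in items}


print(menu_opacities(["Copy", "Paste", "Paste Special"], {"Copy"}, elapsed_ms=250))
```

Note that the items never move: the spatial layout stays fixed, and only the moment at which each item becomes visible changes.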

Phrasing For single menus, a simple descriptive title that identifies the
situation is all that is necessary. For tree-structured menus, choosing titles is more
difficult. One helpful rule is to use the words used for the menu items as the
titles for the submenu or next pages. For example, it is reassuring to users to
find that when they select “Business and financial services”, they are shown a
display that is titled “Business and financial services”. It might be unsettling to
get a display titled “Managing your money”, even though the intent is similar.
For webpages, a distinctive short title displayed as the browser tab label will help
users return to the page after they visit other tabs. A distinctive icon improves
the tab label as well.

Just because an interface has words, phrases, or sentences as menu choices is
no guarantee that it is comprehensible or provides adequate information scent
(see Section 3.4 on theories).

Individual words (for example, “expunge”) may not be familiar to some
users, and often two menu items may appear to satisfy the user’s needs when
only one actually does (for example, “disconnect” or “eject”). This enduring
problem has no perfect solution, but designers can gather useful feedback from
colleagues, users, pilot studies, acceptance tests, and user-performance
monitoring. The following directives may seem obvious but are listed here because they
are so often violated:

• Use familiar and consistent terminology. Carefully select terminology that is
familiar to the designated user community and keep a list of these terms to
facilitate consistent use.

• Ensure that items are distinct from one another. Each item should be
distinguished clearly from other items. For example, “Slow tours of the
countryside”, “Journeys with visits to parks”, and “Leisurely voyages” are
less distinctive than are “Bike tours”, “Train tours to national parks”, and
“Cruise-ship tours”.

• Use consistent and concise phrasing. Review the collection of items to ensure
consistency and conciseness. Users are likely to feel more comfortable and
to be more successful with “Animal”, “Vegetable”, and “Mineral” than with
“Information about animals”, “Vegetable choices you can make” and
“Viewing mineral categories”.

• Bring the keyword to the fore. Try to write menu items such that the first word
aids the user in recognizing and discriminating between items: use “Size of
type” instead of “Set the type size”. Then, if the first word indicates that this
item is not relevant, users can begin scanning the next item.
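The last directive lends itself to a simple automated check during design reviews. The sketch below flags menu labels that open with the same word and therefore cannot be told apart by their first word alone; the function name and the sample labels are hypothetical, and such a check is only a rough aid, not a substitute for testing with users.

```python
from collections import defaultdict
from typing import Dict, List


def shared_first_words(labels: List[str]) -> Dict[str, List[str]]:
    """Group labels that begin with the same word (case-insensitive);
    return only the groups that contain two or more labels."""
    groups = defaultdict(list)
    for label in labels:
        words = label.split()
        if words:
            groups[words[0].lower()].append(label)
    return {word: group for word, group in groups.items() if len(group) > 1}


# Hypothetical menu labels, for illustration only.
print(shared_first_words(["Set the type size", "Set the margins", "Size of type"]))
```

Here the two labels beginning with “Set” are reported together, a hint that their keywords (“type size”, “margins”) should be brought to the front.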

Layout While the layout of applications and websites can be assisted by
the use of templates and website management tools, designers who establish
guidelines for consistency across dozens or hundreds of screens will reduce
users’ anxiety by offering predictability (see Section 3.2). The following elements
can be included:

• Titles. Some people prefer centered titles, but left justification is also
acceptable.

• Item placement. Typically, items are left justified, with the item number or
letter preceding the item description. Blank lines may be used to separate
meaningful groups of items. If multiple columns are used, a consistent
pattern of numbering or lettering should be used (for example, it is easier to scan
down columns than across rows). See also Section 12.2 on display design.

• Instructions. The instructions should be identical in each menu and should be
placed in the same position. This rule includes instructions about traversals,
help, or function-key usage.

• Error messages. If the users make unacceptable choices, the error messages
should appear in a consistent position and should use consistent terminology
and syntax. Graying out unacceptable choices will help reduce errors.


Since disorientation is a potential problem, techniques to indicate position in
the menu structure can be useful. In books, different fonts and typefaces may
indicate chapter, section, and subsection organization. Similarly, in menu trees,
as the user goes down the tree structure, the titles can be designed to indicate
the level or distance from the main menu. Graphics, fonts, typefaces, or
highlighting techniques can be used beneficially. For example, this set of headers
from the Library of Congress collections webpages gives a clear indication of
progress down the tree:

BROWSE BY TOPIC
Sports, Recreation & Leisure
Baseball
Baseball Cards 1887-1914

When users want to do a traversal back up the tree or to an adjoining menu at
the same level, they will feel confident about what action to take.

8.5 Audio Menus

Audio menus found in interactive voice response (IVR) systems (Lewis, 2010) are
useful when hands and eyes are busy, such as when users are driving or testing
equipment. They are also ubiquitous in phone surveys or services and in
public-access situations that need to accommodate blind or vision-impaired users,
such as information kiosks or voting machines.

With audio menus, instruction prompts and lists of options are spoken to
users, who respond by using the keys of a keyboard or phone or by speaking.
While visual menus have the distinct advantage of persistence, audio menus
have to be memorized. Similarly, visual highlighting can confirm users’
selections, while audio menus have to provide a confirmation step following the
selection. As the list of options is read to them, users must compare each
proposed option with their goal and place it on a scale from no match to perfect
match. To reduce dependence on short-term memory, it is preferable to
describe the item first and then give the number. A way to repeat the list of
options and an exit mechanism must be provided (preferably by detecting
user inaction).
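These guidelines can be made concrete with a small sketch of a keypad-driven audio menu. Everything named here is hypothetical: the function names, the option names, the choice of 9 as the repeat key, and the decision to treat a timeout as the exit trigger, which is only one reading of the guideline about detecting user inaction.

```python
from typing import List, Optional


def build_prompt(options: List[str], repeat_key: str = "9") -> str:
    """Speak each option as description first, then its key, and end with
    the instruction for repeating the list."""
    parts = [f"For {name}, press {number}." for number, name in enumerate(options, start=1)]
    parts.append(f"To hear the options again, press {repeat_key}.")
    return " ".join(parts)


def handle_input(key: Optional[str], options: List[str], repeat_key: str = "9") -> str:
    """Map a key press (or None for a timeout) to the next action."""
    if key is None:
        return "exit"  # user inaction: leave the menu
    if key == repeat_key:
        return "repeat"
    if key.isascii() and key.isdigit() and 1 <= int(key) <= len(options):
        return f"confirm:{options[int(key) - 1]}"  # spoken confirmation step
    return "repeat"  # unknown key: read the options again


options = ["pharmacy", "photo services", "store hours"]
print(build_prompt(options))
print(handle_input("1", options))
```

Placing the description before the number means users can stop listening to an option as soon as they hear that it does not match their goal, and they only need to remember a number for the option they actually want.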

Complex and deep menu structures should be avoided. A simple guideline is
to limit the number of choices to three or four to avoid memorization problems,
but this rule should be re-evaluated in light of the application. For example, a
theater information system will benefit from using a longer list of all the movie
titles rather than breaking them into two smaller, arbitrarily grouped menus.
Dial-ahead capabilities allow repeat users to skip through the prompts. For
example, users of a drugstore telephone menu might remember that they can
dial 1 followed by 0 to be connected to the pharmacy immediately without
having to listen to the store’s welcome message and the list of options.
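Dial-ahead amounts to consuming any keys the caller has already pressed before deciding whether a prompt needs to be played. The sketch below assumes a hypothetical nested-dictionary menu in which the key sequence 1, 0 leads to the pharmacy, mirroring the drugstore example; the other entries and the function name are invented for illustration.

```python
from typing import Optional, Tuple

# Hypothetical drugstore menu: nested dicts are submenus, strings are destinations.
MENU = {
    "1": {"0": "pharmacy", "1": "refill status"},
    "2": "store hours",
}


def dial_ahead(keys: str, menu: dict = MENU) -> Tuple[Optional[str], bool]:
    """Consume buffered key presses.

    Returns (destination, prompt_needed): the destination is None while the
    caller is still inside a menu, in which case that menu's prompt is played.
    """
    node = menu
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            break  # destination already reached, or an unknown key
        node = node[key]  # typed ahead, so this level's prompt is never played
    if isinstance(node, dict):
        return None, True
    return node, False


print(dial_ahead("10"))  # repeat caller: straight to the pharmacy
print(dial_ahead("1"))   # stops in the submenu, whose prompt is played
```

First-time callers, who press nothing in advance, still hear every prompt, so the shortcut costs novices nothing.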

Voice recognition has finally reached an acceptable recognition rate and
enables users to speak their options instead of pressing letter or number keys
(see Section 9.2). Most systems still use numbered options to allow both keypad
and voice entry (e.g., “To hear the options again, press or say nine”), but it leads
to longer prompts and longer task-completion times.

To develop successful audio menus, it is critical to know the users’ goals,
make the most common tasks easy to perform rapidly, and keep prompts to a
minimum (e.g., avoid permanent “Listen carefully, as our menu options have
recently changed.”). See Chapter 9, in particular Section 9.2, for more discussion
of interactive voice response (IVR) systems.

8.6 Form Fill-in and Dialog Boxes

Selection is effective for choosing an item from a set of choices, but if the entry of
names or numeric values is required, typing becomes more attractive. When
many fields of data are necessary, the appropriate interaction style is form fill-in
(Fig. 8.16). The combination of form fill-ins, menus, and custom widgets such as
calendars or maps supports rapid navigation for a vast array of applications
from airline-ticket booking to triage of new patients in the emergency room.

8.6.1 Form fill-in
There is a paucity of empirical research on form fill-in, but several design
guidelines have emerged from practitioners (Jarrett and Gaffney, 2008). Software tools
simplify design, help to ensure consistency, ease maintenance, and speed
implementation, but even with excellent tools, the designer must still make many
complex decisions.

The elements of form fill-in design include the following:

• Meaningful title. Identify the topic and avoid computer terminology.

• Comprehensible instructions. Describe the user’s tasks in familiar terminology. Be
brief; if more information is needed, make a set of help screens available to the
novice user. A useful rule is to use the word “type” for entering information
and the word “press” for special keys such as the Tab, Enter, or
cursor-movement (arrow) keys. Since “Enter” often refers to the special key with that
name, avoid using it in the instructions (for example, do not use “Enter the
address”; instead, stick with “Type the address”). Once a grammatical style for
instructions is developed, be careful to apply that style consistently.

FIGURE 8.16
This form fill-in allows users to enter information when joining the IEEE Society.
Fields are grouped meaningfully, and field-specific rules such as password
requirements are provided next to the fields. The information is validated as it is
provided (as opposed to when the form is submitted), and error messages explain
how to correct problems (http://www.ieee.org).

• Label the fields. Place the label in a consistent location (e.g., top or left of the
field). A less desirable option is to place labels inside the fields, using a
grayed-out font. It saves space, but the labels disappear as soon as users start typing,
requiring users to remember what is needed, which often leads to errors.

• Limit data entry. Make sure all fields are really needed. Carefully set default
values (e.g., use the current location). This is particularly important for small
displays (see Box 8.4); for example, use only the zip code instead of the city
and state. A single phone number may be enough, instead of asking for
several alternatives. Some fields may be removed entirely and reserved only
for large devices.

BOX 8.4
Additional form fill-in guidelines for small displays.

• Include only critical data fields.

• Break long forms into multiple smaller ones.

• Use sensible defaults (e.g., current location or date).

• Place short labels on top of the fields, not to their left.

• Set the touch keyboard to match the data (e.g., numeric keyboard to enter
a number).

• Explanatory messages for fields. Information about a field (e.g., “Your e-mail
address will be the user name of your account”) or its permissible values should
appear in a standard position, such as next to or below the field, preferably
using a different font and style.

• Error prevention. Where possible, prevent users from entering incorrect
values. For example, in a field requiring a whole number, do not allow the
user to enter letters or decimal points.

• Error recovery. Summarize errors at the top of the page. Highlight errors in
the form. If users enter unacceptable values, indicate permissible values for
the field; for example, if the zip code is entered as 281,

[Figure: dialog box titled “WARNING! Drug-Drug Interaction” for Warfarin and
Aspirin (increased risk of bleeding), offering three management options (keep
Aspirin, keep Warfarin, or override and order both) plus Cancel.]

Bailly, G., Lecolinet, E., and Nigay, L., Visual menu techniques, Research Report
hal-01258368, Telecom ParisTech (2016). https://hal.archives-ouvertes.fr/hal-01258368

Bailly, G., and Oulasvirta, A., Toward optimal menu design, Interactions 21, 4 (2014),
40-45.

Bonneau, J., Herley, C., van Oorschot, P. C., and Stajano, F., Passwords and the
evolution of imperfect authentication, Communications of the ACM 58, 7 (2015), 78-87.

Cockburn, A., Gutwin, C., Scarr, J., and Malacria, S., Supporting novice to expert
transitions in user interfaces, ACM Comput. Surv. 47, 2 (2014), 36 pages.

Cockburn, A., and Gutwin, C., A predictive model of human performance with
scrolling and hierarchical lists, Human Computer Interaction 24, 3 (2008), 273-314.

Elmqvist, N., Vande Moere, A., Jetter, H.-C., Cernea, D., Reiterer, H., and Jankun-Kelly,
T., Fluid interaction for information visualization, Information Visualization 10,
4 (2011), 327-340.

Findlater, L., Moffatt, K., McGrenere, J., and Dawson, J., Ephemeral adaptation: The
use of gradual onset to improve menu selection performance, Proceedings of the
SIGCHI Conference on Human Factors in Computing Systems, ACM Press, New York (2009),
1655-1664.

Gutwin, C., Cockburn, A., Scarr, J., Malacria, S., and Olson, S. C., Faster command
selection on tablets with FastTap, Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems, ACM Press, New York (2014), 2617-2626.

Hornbæk, K., and Hertzum, M., Untangling the usability of fisheye menus, ACM
Transactions on Computer-Human Interaction 14, 2 (2007), 6.

Hutchings, D. R., and Stasko, J., Consistency, multiple monitors, and multiple windows,
Proceedings SIGCHI Conference on Human Factors in Computing Systems, ACM Press,
New York (2007), 211-214.

Jarrett, C., and Gaffney, G., Forms That Work: Designing Web Forms for Usability, Morgan
Kaufmann (2008).

Koved, L., and Shneiderman, B., Embedded menus: Menu selection in context,
Communications of the ACM 29 (1986), 312-318.

Krug, S., Don’t Make Me Think: A Common Sense Approach to Web and Mobile Usability,
New Riders (2014).

Lewis, J., Practical Speech User Interface Design, CRC Press (2010).

Malacria, S., Bailly, G., Harrison, J., Cockburn, A., and Gutwin, C., Promoting hotkey
use through rehearsal with ExposeHK, Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems, ACM Press, New York (2013), 573-582.

McGrenere, Joanna, Baecker, Ronald M., and Booth, Kellogg S., A field evaluation of
an adaptable two-interface design for feature-rich software, ACM Transactions on
Computer-Human Interaction 14, 1 (2007), 3.

Medhi, I., Toyama, K., Joshi, A., Athavankar, U., and Cutrell, E., A comparison of list vs.
hierarchical UIs on mobile phones for non-literate users interface layout and data
entry, Proceedings of IFIP INTERACT’13: Human-Computer Interaction 2 (2013), 497-504.

Oh, U., and Findlater, L., The challenges and potential of end-user gesture customization,
Proceedings of SIGCHI Conference on Human Factors in Computing Systems, ACM Press,
New York (2013), 1129-1138.

Shay, R., Bauer, L., Christin, N., Cranor, L. G., Forget, A., Komanduri, S., Mazurek, M. L.,
Melicher, W., Segreti, S., and Ur, B., A spoonful of sugar? The impact of guidance
and feedback on password-creation behavior, Proceedings SIGCHI Conference on
Human Factors in Computing Systems, ACM Press, New York (2015), 2903-2912.

Wigdor, Daniel, and Wixon, Dennis, Brave NUI World: Designing Natural User Interfaces
for Touch and Gesture, Morgan Kaufmann, San Francisco, CA (2011).

Wroblewski, L., Mobile First, A Book Apart (2011).

Zhai, S., Kristensson, P. O., Appert, C., Andersen, T. H., and Cao, X., Foundational
issues in touch-screen stroke gesture design: An integrative review, Foundations
and Trends in Human-Computer Interaction, The essence of knowledge, 5, 2 (2012),
97-205.


CHAPTER

Expressive Human and
Command Languages

“I soon felt that the forms of ordinary language were far too
diffuse. . . . I was not long in deciding that the most favorable path
to pursue was to have recourse to the language of signs. It then
became necessary to contrive a notation which ought, if possible,
to be at once simple and expressive, easily understood at the
commencement, and capable of being readily retained in the memory.”

Charles Babbage
“On a Method of Expressing by Signs the Action of Machinery,” 1826

CHAPTER OUTLINE
9.1 Introduction

9.2 Speech Recognition

9.3 Speech Production

9.4 Human Language Technology

9.5 Traditional Command Languages

312 Chapter 9 Expressive Human and Command Languages

9.1 Introduction

The dream of speaking to computers and having computers speak has long
lured researchers and visionaries. Arthur C. Clarke’s 1968 fantasy of the HAL
9000 computer in the book and movie 2001: A Space Odyssey has set the standard
for performance of computers in science fiction and for developers of natural
language systems. The reality is more complex and sometimes more frustrating
than the dream, but much-improved speech recognizers have now joined the
well-established speech telephone-based menu applications to reach a wide
array of applications. Errors remain a significant challenge, and not all
situations benefit enough from speech input to balance the cost of errors and the
frustration of error correction. Once the commands, questions, or statements
have been recognized, human language technologies may be needed to execute
the appropriate action, initiate a clarifying dialogue, or provide translations.

Some applications simulate natural language interaction. They require users
to speak a restricted set of spoken commands that users have to learn and
remember. Similarly, some textual interaction systems rely on the availability of
vast text repositories that can be searched using standard search algorithms to
find answers to questions written in full sentences. Repositories of translated
text, such as the multiple language translations from the United Nations, can
also help make good-quality translations of words, snippets, or full sentences.

See also:

Chapter 14, Documentation and User Support (a.k.a. Help)

Chapter 15, Information Search

The use of command languages in the early days of computing (e.g., DOS or
Unix) receded with the advent of graphical user interfaces. However, command
languages are still widely used by expert users of specialized applications from
computer programmers to the millions of engineers and scientists using tools
like MATLAB®, which combine a command language and graphical
environment. In fact, one could argue that the spread of speech interfaces is re-invigorating
the development of command languages as designers choose which
combinations of words will be recognized as commands in speech interfaces.

While understanding natural language remains an unattainable dream, there
are many applications that can successfully make use of the words people say,
type, or listen to (Box 9.1).

This chapter starts ‘”‘ith the rapidly growing speech interfaces (from speech
recognition in Section 9.2 to speech production in Section 9.3) and then discusses

9.2 Speech Recognition 313

BOX 9.1
Speech technologies.

• Store and replay (museum guides)

• Dictation (document preparation, web search)

• Closed captioning, transcription

• Transactions over the phone

• Personal “assistant” (common tasks on mobile devices)

• Hands-free interaction with a device

• Adaptive technology for users with disabilities

• Translation

• Alerts

• Speaker identification

human language technologies (Section 9.4), including translation and educational
applications. Finally, Section 9.5 reviews the traditional, yet expressive,
command language interfaces.

9.2 Speech Recognition

Speech recognition has made significant progress in recent years (Huang et al.,
2014) and is now being used in a number of well-targeted knowledge domains
such as airline information, lost luggage, medical-record data entry, and personal
digital assistants (Cohen 2004; Karat et al., 2012; Pieraccini, 2012; Bouzid and Ma,
2013; Neustein and Markowitz, 2013; Mariani et al., 2014). Driven by the diffi­
culty of typing while using mobile devices (phones or touch tablets), spoken
input has gained acceptability. More users learn to use spoken commands such
as “Where is the closest coffee shop?” or “Tell John I will be late.” Discoverability
and learnability are often an issue, but commands can be spoken without looking
at the screen, while driving a car (equipped with a hands-free phone), or while
hiking on a bumpy trail. However, commands such as “Make space in my drive” are
still a great challenge and would require extensive dialog design (see Section 9.4
on human language technology). Improved recognition rates are making dicta­
tion and transcription possible, but error correction remains a challenge, and
most applications require users to learn and remember complex sets of com­
mands to accomplish their tasks. Background noise and variations in user speech
performance make the challenge of speech recognition still greater.

314 Chapter 9 Expressive Human and Command Languages

9.2.1 The place for spoken interaction
While speech recognition is used successfully in a growing number of
applications, the vision of computers chatting leisurely with users about varied
open-ended topics remains more of a fantasy than a reality. While HAL 9000 of 2001:
A Space Odyssey communicated with the ship crew mostly by voice, newer
science-fiction writers have shifted their scenarios, with reduced use of spoken
interaction in favor of larger visual displays and gestures, from Star Trek: Voyager
to Minority Report and Avatar or Mission Impossible 4. Voice interaction with
emotion-evoking robots remains a theme in movies such as Her and Ex Machina.

While early applications of speech recognition were mostly limited to
discrete-word recognition (with extensive training for the system to learn a
particular user’s voice), the major breakthrough in the past decade has been the
improvement of continuous-speech recognition algorithms and the availability
of very large repositories of voice data on the web, which can be analyzed to
train algorithms. The other significant advance that made speech recognition
possible on mobile devices is the ability to process the spoken input remotely
and quickly enough for rapid interaction. Reduced training (or its elimination
with speaker-independent systems) has greatly expanded the scope of
commercial applications. Quiet environments, head-mounted high-quality
microphones, and careful choice of vocabularies improve recognition rates in all
cases. Low-cost speech chips and compact microphones and speakers enable
designers to include speech systems in high-volume products, such as dolls
and other toys.

Applications are successful when certain conditions exist (see Box 9.2) and
when they serve users’ needs to work rapidly with low cognitive load and low
error rates. Even as technical problems are being solved and the recognition
rates are improving, spoken commands are more demanding of users’ working
memory than is hand/eye coordination and thus may be more disruptive to
users while they are carrying out tasks. Speech requires use of limited resources,
while hand/eye coordination is processed elsewhere in the brain, enabling a
higher level of parallel processing. Planning and problem solving can proceed
in parallel with hand/eye coordination, but they are more difficult to
accomplish while speaking (Radvansky and Ashcraft, 2013). In short, speaking is more
demanding than many advocates of speech recognition report.

Early applications include systems for aircraft-engine inspectors, who wear
wireless microphones as they walk around the engine, their hands busy
opening cover plates or adjusting components. They can issue orders, read serial
numbers, or retrieve previous maintenance records by using limited
vocabulary. As in all speech input systems, they can be disruptive to others who find
the noise a serious distraction.

The benefits of speech recognition to people with physical or visual
disabilities, even temporary ones, are rewarding to see (Fig. 9.1). Its value during mobile


BOX 9.2
Speech recognition and production: Opportunities and obstacles.

Opportunities

• When users have physical impairments

• When the speaker’s hands are busy

• When mobility is required

• When the speaker’s eyes are occupied

• When harsh or cramped conditions preclude use of a keyboard

• When application domain vocabulary and tasks are limited

• When the user is unable to read or write (e.g., children)

Obstacles to speech recognition

• Interference from noisy environments and poor-quality microphones

• Commands need to be learned and remembered

• Recognition may be challenged by strong accents or unusual vocabulary

• Talking is not always acceptable (e.g., in shared office, during meetings)

• Error correction can be time-consuming

• Increased cognitive load compared with typing or pointing

• Math or programming difficult without extreme customization

Obstacles to speech production

• Slow pace of speech output when compared with visual displays

• Ephemeral nature of speech

• Not socially acceptable in public spaces (also privacy issues)

• Difficulty in scanning/searching spoken messages

use can be significant for users who take the time to learn and remember what
can be accomplished with spoken commands, but general users of office or
personal computers are not rushing yet to adopt speech input and output devices.

9.2.2 Speech recognition applications
For designers of human-computer interaction systems, speech recognition
technologies have many variations, which can also be combined productively with
speech production (Li et al., 2015).

The goal of speech recognition is primarily to produce text based on spoken input
(Lewis, 2011), the most straightforward application being dictation. Dictation
systems have now reached recognition rates that are acceptable in many situations


FIGURE 9.1
Mobile device assistants (from left to right: Siri, Google Now, Cortana, and Hound) all
have similar microphone buttons but different ways of presenting suggestions.


past decades, today’s speech recognition systems still degrade catastrophically
even when the deviations are small in the sense (that) the human listener exhibits
little or no difficulty. Robustness of speech recognition remains a major research
challenge” (101). Finally, only a small portion of the myriad of world languages
have adequate recognizers, and the mixing of two or more languages in the same
sentence (which is common for multilingual speakers) also causes problems.

Early speech recognition systems were speaker dependent, and users were
required to train the system to recognize their voice or deal with a particular
microphone. This is not the case anymore for mobile phone use but is
encouraged for professional applications that incorporate some level of
personalization to increase the recognition rate. Changing microphones also required
recalibration. In all cases, limiting the world of possible commands and
carefully selecting easily differentiated terms dramatically improve recognition.
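
As a rough illustration of screening a command vocabulary for terms that are hard to tell apart, the sketch below flags pairs of commands that look too similar. It uses spelling similarity from Python's standard difflib module as a stand-in for acoustic similarity, which is an assumption: a real recognizer would compare pronunciations, not spellings. The function name, the threshold, and the sample vocabulary are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def confusable_pairs(vocabulary, threshold=0.75):
    """Return pairs of commands whose spelling similarity meets the threshold.

    Spelling similarity is only a crude proxy for how alike two commands sound.
    """
    pairs = []
    for a, b in combinations(vocabulary, 2):
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 2)))
    return pairs


# Hypothetical vocabulary for a voice-driven editor.
vocabulary = ["delete line", "delete file", "scratch that", "go to sleep"]
for a, b, ratio in confusable_pairs(vocabulary):
    print(f"{a!r} vs {b!r}: similarity {ratio}")
```

With this sample vocabulary, "delete line" and "delete file" are flagged as a risky pair, which suggests renaming one of them (for example, to "remove file") before the vocabulary is fixed.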

Correcting errors. Correcting errors can be very taxing, especially when users
do not have access to a keyboard or pointing device so all corrections have to be
done using speech, possibly compounding errors with new ones. Even when a
keyboard and pointer are available, having to correct errors is a significant
distraction from the main task. A pause is generally required to separate dictation
from editing commands, but providing correction commands that are very
distinct will also facilitate their identification. Facilitating the erasing of last spoken
text (e.g., saying “scratch that”) allows repeating or rephrasing. Once a
correction command has been identified (Fig. 9.3), alternative text can be proposed, or

FIGURE 9.3
Correcting a word during dictation using Nuance Dragon™. After saying “Correct
finnish,” the word is selected and possible corrections are displayed in a menu
along with additional commands such as “spell that.” Users can use the cursor,
arrow keys, or voice to specify their choice.


users can add and record new terms (e.g., “IEEE” pronounced as “I triple E”) or
spell out words (e.g., for new names or cities).
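
A minimal sketch of the "scratch that" idea, assuming the recognizer delivers one utterance at a time (for example, separated by pauses). The class and method names are hypothetical and do not describe how any particular dictation product is implemented.

```python
class DictationBuffer:
    """Keeps dictated utterances so the most recent one can be erased by voice."""

    ERASE_COMMAND = "scratch that"  # correction command mentioned in the text

    def __init__(self):
        self.utterances = []

    def hear(self, recognized_text):
        """Treat the input either as the erase command or as dictated text."""
        normalized = recognized_text.strip().lower()
        if normalized == self.ERASE_COMMAND:
            if self.utterances:
                self.utterances.pop()  # drop the last spoken chunk
        else:
            self.utterances.append(recognized_text.strip())

    def text(self):
        """Return the dictated text accumulated so far."""
        return " ".join(self.utterances)


buffer = DictationBuffer()
for utterance in ["Dear Ben,", "I will be lake", "scratch that", "I will be late"]:
    buffer.hear(utterance)
print(buffer.text())  # prints "Dear Ben, I will be late"
```

Because the erase command removes a whole utterance, the user can simply repeat or rephrase it, which avoids word-level editing by voice.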

Mapping to possible actions. The secret of most successful speech
recognition applications today is that they are limited to narrow application domains,
so the world of actions is limited, and they use commands carefully chosen to
increase recognition (e.g., using “scratch that” to delete text). Banking IVRs only
know about banking terms and have a small set of possible actions. Users of per­
sonal assistants on mobile devices may impress friends with the variety of pos­
sible commands, but each app has a limited set of possibilities. This stems from
two main causes: First, mobile applications designers by nature focus on a limited
number of often-used functions that are used constantly. Second, because speech
is a highly variable signal, large corpora of recorded speech matching the applica­
tion domain are needed to achieve good recognition results, so speech recogni­
tion achieves much better results in application domains that have been studied
and modeled extensively. Even if the speech recognition made no errors, there are
many levels of possible errors mapping the corresponding text to the expected
action, as illustrated in Fig. 9.4. Companies continuously collect data from users
as they speak and correct errors to improve both the speech recognition and the
mapping to appropriate actions. Comparisons of today’s assistants such as Siri,
Google Now, Cortana, and Hound seem to suggest that mapping the recognized
text to the most appropriate action is the most challenging task (Ezzohari, 2015).

FIGURE 9.4
It can be difficult to remember what exact command will accomplish the task. In
this example, when the user said, "Search the web for glacier national park," a
Google search was launched and a search executed as expected, but when the user
said, "Do a web search for glacier national park," all the words were accurately
recognized but not as a command, so the text was placed in the Nuance Dragon™
dictation box.
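
The behavior described in Figure 9.4 can be sketched as a pattern match with a dictation fallback. This is an illustration under stated assumptions, not the actual logic of any product: the pattern list, the action name `web_search`, and the function `interpret` are hypothetical.

```python
import re

# Hypothetical command table: only one phrasing is registered as a command.
COMMAND_PATTERNS = [
    (re.compile(r"^search the web for (?P<query>.+)$", re.IGNORECASE), "web_search"),
]


def interpret(recognized_text):
    """Return (action, argument); text that matches no pattern is dictated."""
    text = recognized_text.strip()
    for pattern, action in COMMAND_PATTERNS:
        match = pattern.match(text)
        if match:
            return action, match.group("query")
    return "dictate", text


print(interpret("Search the web for glacier national park"))
print(interpret("Do a web search for glacier national park"))
```

The first call returns `('web_search', 'glacier national park')`, while the second returns `('dictate', 'Do a web search for glacier national park')`: every word was recognized correctly, but the phrasing was not mapped to an action. Adding synonymous patterns to the table is one way designers reduce this kind of failure.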


Being able to rely on contextual information such as location or text from
previous commands gives the impression of a more conversational interaction. For
example, it might be possible to say "show me close by restaurants," "how about
in Baltimore instead," "with 3 or more star reviews." Those chains of commands
are significantly more difficult to interpret correctly and are today only achieved
in constrained applications and by trained users who have learned what will be
successful.

Feedback and dialogues. During dictation or transcription, the recognized
text is shown in the document being composed or in a dictation buffer,
usually after a short delay (one to two seconds). Users can continue speaking or
start correcting errors with the keyboard or by speaking navigation or editing
commands. After correction, the text can also be transferred to a search box, the
body of an e-mail message, or a field in a form. Applications tightly integrated
with the speech recognition (as opposed to relying on a dictation buffer) are more
likely to be attractive and can generate spoken feedback as well.

Commands are usually executed directly, unless confirmation is preferable
(e.g., "I am ready to e-mail this to Ben Shneiderman, should I go ahead?") or
additional information or disambiguation is needed (e.g., "There are 2 John
Smiths in your address book, which one should the e-mail be sent to?"). When
context information has been used, feedback indicates how it was used.
Specific questions may be asked to fill the holes in the task model and its
attributes; for example, saying "Set an alarm" triggers a response asking "Set an
alarm for when today?" (i.e., the date and time are missing from the alarm-setting
task model, today was selected as the default date attribute, and a time attribute
is still needed).
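
The alarm example can be sketched as a small task model whose attributes are filled from user input or from defaults, with a follow-up question for whatever is still missing. The dictionary layout and the function name `alarm_prompt` are hypothetical; real assistants use far richer task models.

```python
# Hypothetical task model: attribute name -> default value (None means "must ask").
ALARM_TASK_MODEL = {"date": "today", "time": None}


def alarm_prompt(provided):
    """Return (question, attributes); question is None once nothing is missing."""
    attributes = {
        name: provided.get(name, default)
        for name, default in ALARM_TASK_MODEL.items()
    }
    if attributes["time"] is None:
        # The question mentions the default date so the user can override it.
        return f"Set an alarm for when {attributes['date']}?", attributes
    return None, attributes


question, state = alarm_prompt({})
print(question)  # prints "Set an alarm for when today?"

question, state = alarm_prompt({"time": "7 a.m."})
print(question, state)
```

In the second call no question is needed, and the default date has been kept alongside the time the user supplied.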

The availability of a display can greatly speed up interaction by presenting
the proposed action in detail and only asking users to confirm or cancel, but it
precludes eyes-free operation (e.g., potentially endangering drivers). On the
other hand, entirely spoken dialogues can be lengthy and even reveal informa­
tion the user didn't want to be heard.

9.2.4 Spoken prompts and commands

When human language technology has been identified as appropriate for an
application, prompts and commands resembling natural languages have to
be designed. A language may have a simple or complex syntax and may have
a few operations or hundreds, but the key issue, and the main usability
determinant, is to adequately design clear prompts and a set of commands that
users can speak comfortably and remember easily and that the system can
recognize reliably.

The choice to use speech instead of keyboard entry is primarily a matter of
user choice or possibility, but even with speech, designers are the ones who
decide what features to support, what commands will be used, how users will

discover what is possible, and what feedback or error messages will be
provided.

The designer's first step is to study the users' task domain. The outcome is a
list of task actions and objects, which is then abstracted into a set of interface
actions and objects. These items, in turn, can be represented with the low-level
interface syntax. Observing users speaking aloud is critical to discover
commands that users might speak "naturally." Both commands and prompts may
include terms that are rarely used in direct manipulation or menu systems; for
example, users are likely to say "set an appointment for tomorrow" even though
no specific menu for "tomorrow" exists in the menus of the graphical calendar
interfaces.

A typical form is a verb followed by a noun object with qualifiers or
arguments for the verb or noun; for example, users might say "launch Facebook"
or "set an alarm for 7 a.m." Human learning is greatly facilitated by meaningful
structure. If a set of commands is well-designed, users will recognize its
structure and easily encode it in their semantic-knowledge storage. For example, if
users can uniformly edit words, sentences, and documents, this meaningful
pattern is easy for them to learn, apply, and recall. On the other hand, if they
must use different terms to change a word, revise a sentence, or alter a
document, the challenge and potential for error grow substantially, no matter how
elegant the syntax is. The "naturalness" will result from careful design and
inclusion of synonyms (Fig. 9.5).

An effective way to test early versions of a spoken language interaction is to
conduct a Wizard of Oz evaluation in which a hidden person transcribes the
spoken commands into text to simulate perfect recognition and types dialog
prompts that are shown to the unsuspecting participant on a screen (for an
example, see Dyke et al., 2013).

give me help

give me help on commands

[ ( go | move ) ] ( ( ( back | backward | backwards ) | ( forward | forwards ) ) | ( up | down ) ) ( one | a ) line

[ ( go | move ) ] ( ( ( back | backward | backwards ) | ( forward | forwards ) ) | ( up | down ) ) ( twenty | … ) lines

( go | move ) … [ ( ( one | one ) | ( twenty | … ) ) ]
[ ( go | move ) ] ( ( left | right ) | ( ( back | backward | backwards ) | ( forward | forwards ) ) ) ( one | a ) character

[ ( go | move ) ] ( ( left | right ) | ( ( back | backward | backwards ) | ( forward | forwards ) ) ) ( twenty | … ) characters

( go | move ) to [ the ] ( bottom | end )

( go | move ) to [ the ] ( bottom | end ) of [ the ] ( line | document )

( go | move ) to [ the ] ( start | top | beginning )

( go | move ) to [ the ] ( start | top | beginning ) of [ the ] ( line | document )

goto sleep
go_to_sleep

help me

FIGURE 9.5
A small subset of the rich set of commands used in the Nuance Dragon™ speech
recognition system. Synonyms are included and used consistently.
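
To see how quickly synonyms multiply the phrasings a recognizer must accept, a command pattern with alternatives and optional words can be expanded into every phrase it covers. The sketch below assumes a hand-built nested representation (the helpers `alt`, `opt`, and `expand` are hypothetical) rather than a parser for the figure's notation, and it is not how Dragon stores its grammar.

```python
from itertools import product


def alt(*choices):
    """One of several alternatives, such as go or move."""
    return ("alt", choices)


def opt(item):
    """An optional item: either the item itself or nothing."""
    return ("alt", (item, ""))


def expand(item):
    """Return every phrase a pattern can produce, as a list of strings."""
    if isinstance(item, str):
        return [item]
    if isinstance(item, tuple) and item[0] == "alt":
        return [phrase for choice in item[1] for phrase in expand(choice)]
    # Otherwise the item is a sequence (a list) of sub-patterns.
    parts = [expand(sub) for sub in item]
    return [" ".join(word for word in combo if word) for combo in product(*parts)]


# Pattern: (go or move), "to", optional "the", (bottom or end)
pattern = [alt("go", "move"), "to", opt("the"), alt("bottom", "end")]
for phrase in expand(pattern):
    print(phrase)
```

This one short pattern already covers eight phrasings, from "go to the bottom" to "move to end", which is why synonyms need to be applied consistently across the whole command set.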


9.3 Speech Production

Speech production is usually successful when the messages are simple and short
and users' visual channels are overloaded; when they must be free to move
around or are on the phone; or when the environment is too brightly lit, too poorly
lit, subject to severe vibration, or otherwise unsuitable for visual displays.
However, designers must cope with the four obstacles to speech output: the
slow pace of speech output when compared with visual displays, the ephemeral
nature of speech, acceptability and privacy issues in public spaces, and the
difficulty in scanning/searching (Box 9.2).

There are three general methods to produce speech. A common type of
speech generation available commercially is formant synthesis, which produces
entirely machine-generated speech using a set of algorithms to produce sounds
based on the phonetic representation of the text. The speech sounds somewhat
artificial and robot-like. Concatenated synthesis instead combines tiny recorded
human speech segments into phonemes, words, and phrases, and then into full
sentences. The voice is more natural but requires significantly more storage and
computing power to assemble sentences on the fly. Formant synthesis and concatenated
synthesis can generate any sentence as needed. Finally, canned speech consists of
a fixed set of digitized speech segments that can be assembled together to
create longer segments (e.g., "The next bus will arrive in" followed by "11" then
"minutes"), but the number of possible complete sentences is limited and the
seams between segments may sound awkward.
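
The canned-speech approach can be sketched as a lookup over prerecorded clips. The clip file names, the assumed range of 1 to 59 minutes, and the function `bus_announcement` are hypothetical; the point is only that nothing can be said unless it was recorded in advance.

```python
# Hypothetical library of prerecorded segments: spoken text -> audio file name.
CANNED_SEGMENTS = {
    "The next bus will arrive in": "next_bus.wav",
    "minutes": "minutes.wav",
}
# One prerecorded clip per number the service can announce (assumed 1 to 59).
CANNED_SEGMENTS.update({str(n): f"number_{n}.wav" for n in range(1, 60)})


def bus_announcement(minutes):
    """Return the ordered list of audio clips for an arrival announcement."""
    parts = ["The next bus will arrive in", str(minutes), "minutes"]
    missing = [part for part in parts if part not in CANNED_SEGMENTS]
    if missing:
        # Canned speech cannot say anything that was not recorded in advance.
        raise KeyError(f"No recording for: {missing}")
    return [CANNED_SEGMENTS[part] for part in parts]


print(bus_announcement(11))  # ['next_bus.wav', 'number_11.wav', 'minutes.wav']
```

Requesting an announcement for 75 minutes raises an error in this sketch, which mirrors the limitation noted above: the set of complete sentences is bounded by the recordings available.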

The quality of generated speech can be evaluated in terms of understandability,
naturalness, and acceptability. For some applications, a computer-like sound
may be preferred. For example, the robot-like sounds used in the Atlanta airport
subway drew more attention than did a tape recording of a human giving
directions. Interactive voice response systems (IVRs) typically mix canned speech
segments and speech synthesis to allow appropriate emotional tone and current
information presentation.

Audio books or audio tours in museums and tourist sites also use canned
speech. They are successful because they allow users to control the pace while
conveying the curator's enthusiasm or author's emotion. Educational psycholo­
gists conjecture that if several senses (sight, touch, hearing) are engaged, learn­
ing can be facilitated. Adding a spoken component to an instructional system or
an online help system (Section 14.3.2) may also improve the learning process.

Alerts and warnings can be presented using speech. They have been used in
automobile navigation systems ("Turn right onto route M1"), internet services
("You've got mail"), or utility-control rooms ("Danger, temperature rising"),
but in most cases, the novelty wears off quickly. Talking supermarket checkout
machines that read out products and prices were found to violate shoppers'
sense of privacy about purchases. Only generic instructions are spoken now, but

9.4 Human LanguageTechnology 325

many consumers still find them too noisy. Similarly, annoying warnings from
cameras ("Too dark-use flash") and automobiles ("Your door is ajar") were
removed and replaced with gentler tones or visual indicators. Spoken warnings
in cockpits and control rooms are still used because they are omnidirectional
and elicit rapid responses. However, even in these environments, spoken
warnings can be missed, especially when in competition with human-human
communication, and multiple methods are used simultaneously (e.g., a visual alert
or a dialog box).

Applications for the visually impaired are an important success story.
Utilities like the built-in Microsoft Windows Narrator or Apple VoiceOver can be
used to read passages of text or hear descriptions of items on the screen. Screen
readers like Freedom Scientific's JAWS, NV Access's NonVisual Desktop Access
(NVDA), or Apple VoiceOver allow users with visual impairments to
productively navigate between windows, select applications, browse graphical
interfaces, and of course read text. Such tools rely on textual descriptions being made
available for visual elements (labels for icons and image descriptions for
graphics). Reading speed is adjustable, which allows interaction to be sped up as
well when needed. Book readers are also widely used in libraries. Patrons can
place a book on a copier-like device that scans the text and does an acceptable
job of reading it.

The slow pace of normal spoken output, the ephemeral nature of speech, and
the difficulty in scanning/searching remain challenges, but speech production
is widely used because it enables services that would otherwise be too
expensive; hiring well-trained customer-service representatives available 24 hours a
day is not practical for many organizations.

9.4 Human Language Technology

Even before there were computers, people dreamed about creating machines that
would be able to understand natural language, that is, be able to take the
appropriate action in various contexts without users having to learn any command
syntax or select from menus. It is a wonderful fantasy, but language is subtle; there
are many special cases, contexts are complex, and emotional relationships have a
powerful and pervasive effect in human-human communication. Although true
comprehension and generation of open-ended language seem an inaccessible
goal, there has been extensive research on human language technology;
widespread use is slow in developing, primarily because the alternatives are more
appealing. Contrary to common belief, human-human interaction is not
necessarily an appropriate model for human operation of computers. Since computers can
display information 1,000 times faster than people can enter commands, it is
advantageous to use the computer to display large amounts of information and to

allow users simply to choose among the items. Selection helps to guide users by
making clear what objects and actions are available. For knowledgeable and
frequent users who are thoroughly aware of the available functions, a precise,
concise language (typed or spoken) is usually preferred (Section 9.5).

Natural language interaction (NLI) in the form of a series of exchanges
resembling a dialogue is difficult to design and build for even a single topic. The
key impediment is the habitability of the user interface, that is, how easy it is for
users to determine what objects and actions are appropriate. Visual interfaces
provide the cues for the semantics of interaction, but NLI interfaces typically
depend on assumed user models. Users who are knowledgeable about their
tasks, for example, stock-market brokers who know the stock codes (objects)
and buy/sell actions, can place orders in natural language, but these users
prefer compact command languages because they are more rapid and reliable.

While early conceptions of human language technology assumed that
computers would parse natural language expressions in text or spoken forms and
derive some level of "understanding" and description of users' "intent," the
current successes rely instead on statistical methods based on the analysis of vast
textual or spoken corpora and usage data of millions of users.

For example, question-answering strategies are successful in situations where
there are relevant corpora and designers have crafted effective user interfaces
that expand queries, search databases, show users alternatives, and present final
results in ways that are most likely to be useful. Their success comes not from
the understanding of the natural language but from the fact that the question at
hand has already been asked before (using the same terminology) and has
already been answered by others (Hearst, 2011). Another method is to analyze
web search usage logs to find what results users seek often. For example, when
users type "Leddo restaurant," human language technology extracts relevant
queries from the dataset and identifies that "Leddo" does not exist but "Ledo" is
a frequent entry. Then the word "restaurant" has been repeatedly identified as a
term that leads users to look for an address, hours of operation, or a map, so that
information can be presented by default. This can be done on the basis of
frequency of past queries and on the log of previous users' actions.
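
The "Leddo restaurant" example can be approximated with a toy query log. The log contents, the follow-up table, and the function names are hypothetical, and closeness is measured with Python's standard difflib rather than whatever statistical model a real search engine uses.

```python
from collections import Counter
from difflib import get_close_matches

# Hypothetical log of past queries (a real log would hold millions of entries).
PAST_QUERIES = ["ledo restaurant", "ledo restaurant", "ledo pizza menu", "lego store"]
TERM_COUNTS = Counter(word for query in PAST_QUERIES for word in query.split())

# Hypothetical follow-up actions observed after queries containing a term.
FOLLOW_UPS = {"restaurant": ["address", "hours of operation", "map"]}


def correct_term(term):
    """Replace an unseen term with the most frequent close match from the log."""
    if term in TERM_COUNTS:
        return term
    candidates = get_close_matches(term, list(TERM_COUNTS), n=3, cutoff=0.8)
    if not candidates:
        return term
    return max(candidates, key=lambda candidate: TERM_COUNTS[candidate])


def interpret_query(query):
    """Return the corrected query and the information to present by default."""
    terms = [correct_term(word) for word in query.lower().split()]
    defaults = [info for word in terms for info in FOLLOW_UPS.get(word, [])]
    return " ".join(terms), defaults


print(interpret_query("Leddo restaurant"))
```

Here "leddo" never appears in the log, so it is replaced by the close and frequent term "ledo", and the presence of "restaurant" triggers the default display of address, hours of operation, and map. No understanding of the query is involved, only counts from earlier users.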

Other applications include extraction and tagging. Extraction refers to the
process of analyzing human language to create a more structured format, such as a
relational database. The advantage is that the parsing can be done once in
advance to structure the entire database and to speed searches when users pose
relational queries. Legal (Supreme Court decisions or state laws), medical
(scientific journal articles or patient histories), and journalistic (Associated Press
news stories or Wall Street Journal reports) texts have been used. A variant is to
tag documents based on content. For example, it is useful to have an automated
analysis of business news stories to classify them as covering mergers,
bankruptcies, or initial public offerings for companies in various industries such
as electronics, pharmaceutical, or petroleum. Extracting and tagging applica­
tions are promising because users appreciate even a modest increase in suitable


retrievals, and imperfect retrievals are more acceptable than errors in natural
language interaction. On the other hand, errors can become quite problematic
when the extracted information is used to make decisions or inform policies.
One example is the use of human language technology in medicine. A large
amount of information about medical conditions, treatments, and outcomes is
buried in textual notes written by physicians in electronic health records.
Automatically extracting diagnoses or test results out of the text notes can be very
useful to identify possible candidates for a clinical trial, as all records will be
reviewed by a clinician. On the other hand, the use of automatic tags for clinical
decision making can be problematic. The rare cases of success are limited to
situations with specific users, document types, and decision support goals
(Demner-Fushman et al., 2009). Sentiment analysis is a specialized form of tagging,
which can be applied to groups of news articles, reviews, or social media to monitor
global changes in opinion, but tagging of individual documents remains error-prone.
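
A deliberately simple sketch of content tagging for business news, assuming hand-written keyword lists (hypothetical, as is the function `tag_story`). Production taggers are typically trained on labeled data; the keyword approach is used here only to make the idea concrete and to show why individual documents are easy to mis-tag.

```python
# Hypothetical keyword lists; a production tagger would be learned from labeled data.
TAG_KEYWORDS = {
    "merger": {"merger", "merge", "acquire", "acquisition"},
    "bankruptcy": {"bankruptcy", "bankrupt", "insolvency"},
    "ipo": {"ipo", "offering"},
}


def tag_story(text):
    """Return the sorted tags whose keywords appear in the story text."""
    words = {word.strip(".,;:!?\"'()").lower() for word in text.split()}
    return sorted(tag for tag, keywords in TAG_KEYWORDS.items() if words & keywords)


print(tag_story("Chipmaker files for bankruptcy after failed merger."))
```

The sample story is tagged with both "bankruptcy" and "merger", even though the merger did not happen. Errors like this matter little when counting trends over thousands of stories but become serious when a single tag drives a decision.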

Human language text generation is used for simple tasks, such as the
preparation of structured weather reports ("80% chance of light rain in northern suburbs
by late Sunday afternoon") in which generated reports from structured databases
can be sent out automatically. Automatically generated text can be used to
supplement standard data charts such as bar charts or scatterplots in order to make them
more accessible to users with visual impairments (e.g., Google Sheet's Explore or
iweave.com). More elaborate applications of text generation include preparation
of reports of medical laboratory or psychological tests. The computer generates
not only readable reports ("White-blood-cell count is 12,000") but also warnings
("This value exceeds the normal range of 3,000 to 8,000 by 50%") or
recommendations ("Further examination for systemic infection is recommended"). Still more
involved scenarios for text generation involve the creation of legal contracts, wills,
or business proposals. Text summarization remains a much greater challenge
with limited success, as summaries must capture the essence of the content and
convey it accurately in a compact manner (Liu et al., 2012).
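
The laboratory-report example can be reproduced with template-based generation. The function `lab_report` is hypothetical, and it assumes the percentage is measured relative to the nearest limit of the normal range: (12,000 - 8,000) / 8,000 = 50%, which matches the warning quoted above.

```python
def lab_report(test_name, value, low, high):
    """Generate a readable result line plus a warning when out of range."""
    sentences = [f"{test_name} is {value:,}."]
    if value > high:
        # Assumption: excess is expressed relative to the upper limit.
        percent = round((value - high) / high * 100)
        sentences.append(
            f"This value exceeds the normal range of {low:,} to {high:,} by {percent}%."
        )
    elif value < low:
        # Assumption: shortfall is expressed relative to the lower limit.
        percent = round((low - value) / low * 100)
        sentences.append(
            f"This value is below the normal range of {low:,} to {high:,} by {percent}%."
        )
    return " ".join(sentences)


print(lab_report("White-blood-cell count", 12000, 3000, 8000))
```

The call prints "White-blood-cell count is 12,000. This value exceeds the normal range of 3,000 to 8,000 by 50%." Because the text is produced from structured values and fixed templates, it is predictable, which is part of why such reports can be sent out automatically.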

Human language technologies are used in instructional systems. Successful
examples are in grammatical error detection and proofreading. Also widely
used, but more controversial, is the automated scoring of short-answer
responses or essays during student assessment. Human language technology
has been introduced into a variety of educational contexts such as reading
support. Tutorials with materials and pedagogy that have been carefully tested can
provide feedback in natural language, which encourages students to stay
engaged in the educational process. Simulations can also be used to practice
communication skills learned in other settings (Fig. 9.6).

A remaining question is whether learning differs when students speak their
responses or type them. A Wizard of Oz experiment (where a human transcribed
the learner's speech before submitting it to the tutor) suggests that learning
gains and preferences are similar with both modalities, but highly motivated
students reported lower cognitive load and demonstrated increased learning
when typing compared with speaking (D'Mello et al., 2011).


Translation between human languages has long been a goal (Green et al., 2015),
but older strategies of word replacement with some grammatical parsing have
given way to statistical methods based on having large databases with correct
human translations, such as United Nations documents that appear in five
required languages. Then well-designed user interfaces clarify what users can
input in text windows, present translation options, show the translation, and
guide subsequent user actions (Fig. 9.7). This design effort gets more complex
with input errors and languages that may have unfamiliar characters, differ from
English left-to-right formatting, and invoke words that do not exist in the
target languages.

FIGURE 9.6
Using the Immersive Naval Officer Training System (INOTS), new Navy officers
can practice their counseling skills in a virtual reality environment.
Officers listen to an avatar and respond using spoken language, loosely
following suggestions from multi-choice prompts presented on the screen and
designed to match the learning objectives. The interaction is constrained, but
assessment is facilitated (Dyke et al., 2013;
http://www.netc.navy.mil/nstc/news_page_2012_02_24_2.asp).

[Figure 9.7: Google Translate screenshot. The French input "Dur de traduire ces
drôles de phrases" is translated into English as "Hard to translate these funny
sentences," with definitions and alternative translations of "drôle" shown.]

Command-line interfaces are often preferred when the application is used in
an advanced way (e.g., professionals using an application for hours every day).
Casual users favor graphical user interfaces, but both styles of interface can
be made available successfully because they do not always provide the same
functionality. For example, in MATLAB, the command language can handle all
the calculations, and a large subset of calculations is also available via the
graphical user interface, which makes it easier for novice users to get started.
Being able to type complex Boolean expressions using AND, OR, or NOT as well as
regular expressions remains a key motivation for experienced users who can
accomplish remarkable feats at amazing speed (Fig. 9.8).

Web addresses or URLs can be seen as a form of command language. Users
come to memorize the structure of their favorite site addresses, even though the
typical usage is to click on a link to select an address from a webpage or a search
result page. The address field of browsers can also be used as a command line.
For example, typing “(1024*768)/25” in the URL field in a Chrome browser will
calculate the result, and typing “100 feet to meters” will launch the conversion
tool and show the result: 30.48 meters.
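
The two address-field examples above can be imitated with a small interpreter.
This is only a sketch under stated assumptions: it does not describe how Chrome
works internally, the function name address_bar is hypothetical, and only
arithmetic and one unit conversion (feet to meters) are supported.

```python
import ast
import operator
import re

FEET_TO_METERS = 0.3048  # exact definition of the international foot

# Arithmetic operators this sketch accepts.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}


def _evaluate(node):
    """Evaluate a parsed arithmetic expression without using eval()."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression")


def address_bar(query):
    """Treat the typed text as a command; return a number or None."""
    # Unit conversion command, e.g. "100 feet to meters".
    match = re.fullmatch(r"\s*([\d.]+)\s+feet\s+to\s+meters\s*", query)
    if match:
        return round(float(match.group(1)) * FEET_TO_METERS, 2)
    # Otherwise try to read the text as arithmetic, e.g. "(1024*768)/25".
    try:
        return _evaluate(ast.parse(query.strip(), mode="eval"))
    except (SyntaxError, ValueError):
        return None  # not a command this sketch understands


print(address_bar("(1024*768)/25"))
print(address_bar("100 feet to meters"))
```

Anything the sketch cannot interpret returns None, which a real address field
would instead pass on as a search query or a web address.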

[Screenshot: a tab-separated data table open in the Sublime text editor. Each
row starts with an incident ID (beginning XDOT_0000c7 for rows 1-15 and
XDOT_000141 for rows 16-20), followed by an event and a timestamp. The Find box
contains the pattern \t.*? Police. Events and timestamps are listed below.]

 1  Incident start                    2013-07-20 21:30:46.000
 2  State Police arrived              2013-07-20 21:31:31.000
 3  SHA Shop Churchville notified     2013-07-20 21:31:45.000
 4  Fireboard arrived                 2013-07-20 22:06:40.000
 5  Investigation-accident notified   2013-07-20 22:39:58.000
 6  Medical Examiner notified         2013-07-20 22:40:08.000
 7  Priv. Tow Light Duty notified     2013-07-20 22:40:22.000
 8  Local Police arrived              2013-07-20 22:40:38.000
 9  Unit notified                     2013-07-20 22:45:17.000
10  Unit notified                     2013-07-20 22:48:06.000
11  Fireboard departed                2013-07-21 01:55:55.000
12  Local Police departed             2013-07-21 01:55:56.000
13  State Police departed             2013-07-21 01:55:58.000
14  Incident cleared                  2013-07-21 01:56:29.000
15  Incident closed                   2013-07-21 01:56:32.000
16  Incident start                    2011-03-05 21:22:33.000
17  Fireboard arrived                 2011-03-05 21:23:12.000
18  Local Police arrived              2011-03-05 21:23:15.000
19  CITY PD notified                  2011-03-05 21:23:22.000
20  CITY PD arrived                   2011-03-05 21:23:22.000

FIGURE 9.8
Using the Sublime text editor, a user is doing a search and replace in a data
table using regular expressions. Typing "\t.*? Police" in the search box
searches for a tab followed by zero or more characters, a space, and then the
word "Police." The patterns found in the document are highlighted with a thin
black line, showing that both "Local Police" and "State Police" have been found
and selected. An overview of the entire document is visible on the right,
revealing the presence of many other matches that can now be replaced all at
once.
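
The search described in the caption can be reproduced outside the editor. The
sketch below uses Python's re module, whose syntax for this pattern should
behave like the editor's, although that equivalence is an assumption. The
sample rows and the IDs XDOT_A and XDOT_B are made up for illustration.

```python
import re

# Hypothetical tab-separated rows modeled on the figure (IDs are invented).
rows = [
    "XDOT_A\tIncident start\t2013-07-20 21:30:46.000",
    "XDOT_A\tState Police arrived\t2013-07-20 21:31:31.000",
    "XDOT_A\tFireboard arrived\t2013-07-20 22:06:40.000",
    "XDOT_B\tLocal Police arrived\t2011-03-05 21:23:15.000",
]

# A tab, then as few characters as possible (lazy .*?), a space, then "Police".
PATTERN = re.compile(r"\t.*? Police")


def find_police(lines):
    """Return the matched text (leading tab included) for each matching line."""
    return [m.group(0) for line in lines if (m := PATTERN.search(line))]


def replace_police(lines, replacement="\tPolice"):
    """Replace every match at once, as the editor's Replace All would."""
    return [PATTERN.sub(replacement, line) for line in lines]


matches = find_police(rows)
print(matches)
print(replace_police(rows)[1])
```

The lazy quantifier .*? matters: it stops at the first " Police" after the tab
rather than consuming as much of the line as possible.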

Twitter tags (#hcil, $TWTR, or @benbendc) can also be considered an example
of a new command language that needs to be learned and remembered, along
with the proliferation of acronyms and abbreviations used by clever
text-message writers (e.g., LOL for "laugh out loud" or 2G2BT for "too good to
be true"). In the traditional desktop environment, shortcut keys also remain
heavily used by users who take the time to learn them (e.g., typing Ctrl-Q for
Quit or Ctrl-P for Print; see Section 8.2.2). Programmers or professionals who
use a single application all day long (e.g., a computer-aided design or
publishing application) can memorize hundreds of commands and shortcuts,
helping them gain mastery of their application (Cockburn et al., 2014).

One important opportunity linked to command languages is that histories
can easily be kept and macros or scripts created to automate actions, but the
essence of command languages is that they have an ephemeral nature and they
produce an immediate result on some object of interest. Feedback is generated
for correct commands, and error messages (Section 12.7) result from
unacceptable forms or typos. Auto-completion is critical to help prevent errors.
Command-language systems may offer brief prompts with choices, becoming
closer to menu-selection systems. Command languages typically do not require
a pointing device and therefore can become a lifesaver for users with visual
impairments, which make the use of mice and touchscreens impractical.
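
Auto-completion, feedback, and prompts with choices can be sketched in a few
lines. The command set and the function names complete and interpret below are
hypothetical and are not taken from any particular system.

```python
COMMANDS = ["print", "preview", "quit", "open", "save"]  # hypothetical command set


def complete(prefix, commands=COMMANDS):
    """Return all commands that start with the typed prefix, in sorted order."""
    return sorted(c for c in commands if c.startswith(prefix))


def interpret(text, commands=COMMANDS):
    """Return (command or None, feedback message) for the typed text."""
    if text in commands:
        return text, f"ok: {text}"
    candidates = complete(text, commands)
    if len(candidates) == 1:
        # An unambiguous prefix is completed automatically.
        return candidates[0], f"ok: {candidates[0]}"
    if candidates:
        # Several candidates: offer a brief prompt with choices.
        return None, "ambiguous: " + ", ".join(candidates)
    # No candidate: report an error instead of acting.
    return None, f"error: unknown command '{text}'"


print(interpret("q"))
print(interpret("pr"))
print(interpret("xyz"))
```

The ambiguous case is where a command language drifts toward menu selection:
the system lists the choices rather than guessing.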

Database-query languages for relational databases were developed in the
middle to late 1970s; they led to the still widely used Structured Query
Language, or SQL™, which emphasized short segments of code (2 to 20 lines)
that could be typed and executed immediately. For example:

SELECT * FROM Products
WHERE Price BETWEEN 10 AND 20;

Here the goal of the user is to create a result rather than a program. A key
part of database-query languages and information-retrieval languages is the
specification of Boolean operations (AND, OR, and NOT), which can be very
challenging to specify. See Chapter 15 for more on advances regarding searching.
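
One reason Boolean operations are hard to specify is operator precedence: AND
binds more tightly than OR, so a query without parentheses may not mean what
the user intended. The sketch below runs the query above and two Boolean
variants with Python's built-in sqlite3 module. The Products rows and the
Category column are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Name TEXT, Category TEXT, Price REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [  # hypothetical sample data
        ("Tea", "Beverage", 12.0),
        ("Coffee", "Beverage", 25.0),
        ("Jam", "Condiment", 15.0),
        ("Salt", "Condiment", 5.0),
        ("Saffron", "Condiment", 40.0),
    ],
)


def names(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


in_range = names(
    "SELECT Name FROM Products WHERE Price BETWEEN 10 AND 20 ORDER BY Name"
)

# Without parentheses: Condiment OR (Beverage AND Price < 20).
ambiguous = names(
    "SELECT Name FROM Products "
    "WHERE Category = 'Condiment' OR Category = 'Beverage' AND Price < 20 "
    "ORDER BY Name"
)

# With parentheses: (Condiment OR Beverage) AND Price < 20.
explicit = names(
    "SELECT Name FROM Products "
    "WHERE (Category = 'Condiment' OR Category = 'Beverage') AND Price < 20 "
    "ORDER BY Name"
)

# NOT applies to the comparison that follows it.
not_beverage = names(
    "SELECT Name FROM Products "
    "WHERE NOT Category = 'Beverage' AND Price < 20 ORDER BY Name"
)

print(in_range, ambiguous, explicit, not_beverage)
```

The unparenthesized query returns the expensive Saffron row while the
parenthesized one does not, which is the kind of surprise that makes Boolean
specification error-prone.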

Major considerations for expert users are the possibilities of tailoring the
language to suit personal work styles and of creating named macros to permit
several operations to be carried out with a single command. Macro facilities
allow extensions that the designers did not foresee or that are beneficial to
only a small fragment of the user community. A macro facility can grow to
include specification of arguments, conditionals, iteration, integers, strings,
and screen-manipulation primitives plus library and editing tools, resembling
a full-blown programming language.
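
A named macro with arguments, together with a command history, can be sketched
as follows. The class CommandShell, its three built-in commands, and the macro
name quickprint are all hypothetical. Conditionals and iteration are left out
to keep the example short.

```python
from typing import Callable, Dict, List


class CommandShell:
    """Toy command interpreter with a history and named macros (illustrative)."""

    def __init__(self) -> None:
        self.history: List[str] = []          # every line the user typed
        self.macros: Dict[str, List[str]] = {}
        self.log: List[str] = []              # stands in for effects on objects
        self.commands: Dict[str, Callable[[List[str]], None]] = {
            "open": lambda args: self.log.append(f"opened {args[0]}"),
            "print": lambda args: self.log.append(f"printed {args[0]}"),
            "close": lambda args: self.log.append(f"closed {args[0]}"),
        }

    def define_macro(self, name: str, templates: List[str]) -> None:
        """Store command templates; {0}, {1}, ... are filled from arguments."""
        self.macros[name] = templates

    def run(self, line: str) -> None:
        """Record the line, then execute it as a macro or a plain command."""
        self.history.append(line)
        name, *args = line.split()
        if name in self.macros:
            for template in self.macros[name]:
                self._execute(template.format(*args))
        else:
            self._execute(line)

    def _execute(self, line: str) -> None:
        name, *args = line.split()
        if name not in self.commands:
            raise ValueError(f"unknown command: {name}")
        self.commands[name](args)


shell = CommandShell()
shell.define_macro("quickprint", ["open {0}", "print {0}", "close {0}"])
shell.run("quickprint report.txt")
print(shell.log)
print(shell.history)
```

A single typed command triggers three operations, while the history keeps only
what the user actually entered, so the same line can be replayed later as part
of a script.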

In summary, while error rates remain high, the complexity and power of
command languages have a certain attraction for a portion of the computer user
community. Users gain satisfaction in overcoming the difficulties and becoming
one of the inner circle "gurus" of their favorite command language.

Practitioner's Summary

The dream of natural language interaction has been mostly replaced by the
effective use of statistical methods based on very large spoken and text
corpora and logs of user interactions. Speech recognition for personal digital
assistants and dictation has become increasingly successful, but errors and
error correction remain issues. Speech-based approaches for guided interactions
over telephones are also proving to be useful.

Speech generation, when well-designed, can support effective applications with
phone, mobile devices, or book readers. Well-designed user interfaces enable
integration of visual displays and touchscreens with speech. Text analysis,
generation, and translation are useful human language technologies based on
large training databases and appropriate user interfaces to prompt users and
handle interactions.

Command languages continue to be attractive for expert users who learn
the semantics and syntax because they can rapidly specify actions involving
multiple options. Command languages allow sequences of commands to be
stored for future use as a macro or script.

For command languages as well as spoken command languages, designers
begin with a careful task analysis to determine what functions should be
provided. Meaningful specific names aid learning and retention.

Researcher's Agenda

Speech recognition and generation user interfaces are maturing rapidly as
effective designs have a growing user community. Improved user interfaces that
integrate speech with visual displays and touchscreen controls may attract still
larger communities; however, research on error reduction and methods to
facilitate error correction is still needed.

Natural language interaction success stories are still elusive, but human
language technology has become an important part of the success of search
technology (Chapter 15). Spoken and text generation has shown value, so further
research is warranted. For those who continue to explore specific applications,
empirical tests and long-term case studies offer successful strategies to identify
the appropriate niches and designs.

WORLD WIDE WEB RESOURCES

www.pearsonglobaleditions.com/shneiderman

• Designers will find many demonstrations of spoken interaction
on YouTube. For example, the different styles of feedback and
dialogue used by personal "assistants" can be seen at Hound Beta
vs. Siri vs. Google Now vs. Cortana:
https://www.youtube.com/watch?t=134&v=9zNh8kOLhfo.
• Experimenting with common search engines and personal digital
assistants such as Siri or Google Now provides hints about the current
human language technology strategies used for question answering.
• Translation: http://translate.google.com or http://www.babelfish.com
• Speech recognition commercial systems: http://www.nuance.com/dragon
• IVR dialog system:
http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/dialog.html

Discussion Questions

1. Consider voice-activated digital assistants such as Siri, Cortana, or Google
Talk. Identify a situation or scenario where you chose to use this personal
assistant, and identify a scenario where you chose to avoid it.

2. As a follow-on to the previous question, produce a thoughtful argument
about what role spoken interaction should have in user interfaces. Be sure to
list at least three benefits and limitations of spoken interaction.

3. Briefly describe the applications of speech recognition.

4. What are the obstacles to speech recognition and production?

5. There exist applications of human language understanding technology. Name
some examples.

6. List several situations when command languages can be attractive for users.

References

Bouzid, Ahmed, and Ma, Weiye, Don't Make Me Tap! A Common Sense Approach to Voice
Usability, Dakota Press (2013).

Cockburn, A., Gutwin, C., Scarr, J., and Malacria, S., Supporting novice to expert
transitions in user interfaces, ACM Computing Surveys 47, 2 (2014), Article 2.

Cohen, M. H., Giangola, J. P., and Balogh, J., Voice User Interface Design, Addison
Wesley (2004).

Cross, J., A list of all the Google Now voice commands, Greenbot blog
http://www.greenbot.com/article/2359684/system-software/a-list-of-all-the-ok-google-voice-commands.htm
(2015).

Demner-Fushman, D., Chapman, W. W., and McDonald, C. J., What can natural
language processing do for clinical decision support? Journal of Biomedical Informatics
42, 5 (2009), 760-772.

D'Mello, S. K., Dowell, N. N., and Graesser, A., Does it really matter whether students'
contributions are spoken versus typed in an intelligent tutoring system with natural
language? Journal of Experimental Psychology 17, 1 (2011), 1-17.

Dyke, G., Adamson, A., Howley, I., and Rose, C. P., Enhancing scientific reasoning and
discussion with conversational agents, IEEE Transactions on Learning Technologies 6, 3
(2013), 240-247.

Ezzohari, H., [ULTIMATE] personal assistant review: Hound vs Siri vs Google Now vs
Cortana http://www.typhone.nl/blog/ultimate-voice-assistant-review/ (2015).

Green, S., Heer, J., and Manning, C. D., Natural language translation at the intersection
of AI and HCI, Communications of the ACM 58, 9 (2015), 46-53.

Hearst, M. A., "Natural" search user interfaces, Communications of the ACM 54, 11 (2011),
60-67.

Huang, X., Baker, J., and Reddy, R., A historical perspective of speech recognition,
Communications of the ACM 57, 1 (2014), 94-103.

Karat, M-C., Lau, J., Steward, O., and Yankelovich, N., Speech and language interfaces,
applications and technologies, in Jacko, J. (Editor), The Human-Computer Interaction
Handbook, CRC Press (2012), 367-386.

Lewis, J. R., Practical Speech User Interface Design, CRC Press (2011).

Li, Jinyu, Deng, Li, Haeb-Umbach, Reinhold, and Gong, Yifang, Robust Speech
Recognition: A Bridge to Practical Applications, Academic Press (2015).

Liu, S., Zhou, M. X., Pan, S., Song, Y., Qian, W., Cai, W., and Lian, X., TIARA:
Interactive, topic-based visual text summarization and analysis, ACM Transactions on
Intelligent Systems Technology 3, 2 (2012), 28 pages.

Mariani, Joseph, Rosset, Sophie, Garnier-Rizet, and Devillers, Laurence (Editors),
Natural Interaction with Robots, Knowbots and Smartphones: Putting Spoken Dialog
Systems into Practice, Springer (2014).

Neustein, A., and Markowitz, J. A. (Editors), Mobile Speech and Advanced Natural
Language Solutions, Springer (2013).

Oviatt, Sharon, and Cohen, Philip, The Paradigm Shift to Multimodality in Contemporary
Computer Interfaces, Morgan & Claypool (2015).

Pieraccini, Roberto, The Voice in the Machine: Building Computers that Understand Speech,
MIT Press (2012).

Radvansky, G., and Ashcraft, M., Cognition, 6th Edition, Pearson (2013).
