A layperson’s exploration of all things voice

Monthly Archives: May 2019

May 8, 2019

Amazon’s Annual Awards for Skills

This blog lists the new winners of Amazon’s “Webby Awards” for skills. The description for many of the winners includes the term “simple” – which is an indicator of what works well in voice (at least right now).

I’ve heard of many of the skills that won the awards, which means two things: I know more about voice than I think, and we are still in the early days of voice. I’ll be trying out the skills that I haven’t heard of and will report back soon…

May 7, 2019

Do I Love My Echo Dot Too Much? The Risks of Anthropomorphism

“Anthropomorphism” is defined as the attribution of human characteristics or behavior to a god, animal or object. Think the movie “Her” starring Joaquin Phoenix.
As our voice assistants become more human-like, will they meet our social needs? And if so, at what cost? This article describes a study that tackled this important topic. Here’s an excerpt:

The feeling of exclusion was created during the studies in a variety of ways, including having participants write about an important time they were excluded (“My date stood me up for prom”) and an online game of catch in which the ball stops being tossed to the participants after a few initial tosses. The participants were then given the opportunity to interact with anthropomorphic products such as a Roomba, whose design made it seem like it was smiling, and one’s own cell phone.

“People often name their cars or treat their Roomba like it is a pet, even referring to the vacuum as a ‘him’ or ‘he,’” said Mourey, assistant professor of marketing at DePaul University. “What we find is that these anthropomorphic products can fulfill social assurance needs in the way that genuine, interpersonal interaction often does. But there are limits.” Although the research shows that anthropomorphic products can fulfill social assurance needs, simply reminding individuals that these products are not actually alive makes the effect go away.

The findings have important implications for product design and interactivity, particularly in a time of increasing anthropomorphization of consumer products. Although consumers appreciate the ability to interact with their products as if the products were alive, they should know that this kind of interaction may thwart their motivation to engage with real others, the researchers say. This is particularly relevant in light of the increasing levels of reported loneliness.

Product designers may want to consider the potential benefits and harmful consequences of making consumer products, or avatars in service-oriented industries, that more closely emulate human interaction, they say. “Right now, there is a limit to the extent to which anthropomorphic products can fulfill social needs, but it is possible that this limit will no longer apply the more realistic and engaging consumer products become,” said Olson, assistant professor of marketing at the University of Kansas.

Knowing that anthropomorphic products and humans can both affect social needs, there may be possibilities to design products that increase the well-being of lonely individuals or that complement human interaction—say anthropomorphic health monitors and real nurses in the case of hospital care—to glean the benefits of such products without detrimental consequences on important, genuine interpersonal interaction.

May 6, 2019

Some Characteristics of Voice’s “Power Users”

At last week’s “SpeechTEK Conference,” I really enjoyed a presentation by Versay’s Crispin Reedy, and I’ll blog about the topic she covered in the near future. Meanwhile, I found this 27-minute video in which Crispin goes over the results of a small study about voice power users. Her study included just 14 “power user” participants in an effort to dig deep into how people are truly using voice. In comparison, check out this much larger PwC survey about how consumers are using voice assistants – the participants in that study included many “non-power user” participants.

It’s interesting to hear about a study with such a small sample size because the researchers were really able to dig into how people are using voice assistants today. For example, at the 13-minute mark, Crispin drills down into some folks who bought a Google Home when they already had an Amazon Echo – she really gets into the specifics. It’s this type of usability study that provides good fodder for new ideas about how to improve the voice experience.

The study’s conclusions are covered at the 21-minute mark. Essentially, the results were:

– Most users don’t care that someone’s listening

– Some were hesitant about using credit cards, cameras in the home or smart locks by voice (but not all)

– Mixed results about whether their voice assistant has a “personality” (half said ‘yes,’ half said ‘no’ – and one was resistant)

May 2, 2019

How China’s Culture Bodes Well for Their AI Initiatives

At this week’s “SpeechTEK Conference,” RAIN’s Will Hall gave a fabulous presentation about how China is embracing artificial intelligence – and how their culture is more accepting of the societal changes that will emerge. As noted in this article, since mid-2017, China has set ambitious goals to leapfrog the US as the global leader in AI by 2030. In China, people are optimistic about how AI will improve their lives. In the US, we tend to be skeptical – and the media pushes fearmongering on this topic.

This 14-minute video provides the gist of Will’s talk. It describes the “go heavy” model of China’s entrepreneurship – which ties into Will’s #1 rule of thumb: “To win in voice, you have to think in systems.” Definitely check out the video – it’s entertaining & enlightening…

May 1, 2019

How to Improve the “Discoverability” of Voice

At this week’s “SpeechTEK Conference,” Bruce Balentine of Enterprise Integration Group gave an excellent presentation on “discoverability” for voice. This interview gives a sense of what Bruce talked about:

Q: Why is it difficult for users to discover functions and operations that they can perform using voice applications?

A: Users discover functions and operations in a GUI interface by freely exploring, because a GUI utilizes the sense of sight and exists within the three dimensions of space. This is less effective in a VUI, because a VUI utilizes the sense of hearing and exists within the single dimension of time. Users therefore easily become lost, and the passage of time exacts a higher penalty in terms of thinking, confusion, inability to return to known starting places, loss of context, and risk of sudden dialogue terminations.

Q: How can users apply what they know about current voice applications when using new voice applications?

A: Users generally cannot apply what they know about current voice applications when using new voice applications—a phenomenon known as transfer of learning. This is partly because of a lack of standards, which product designers eschew in favor of differentiation for the sake of “branding.”

It is also because the industry has ranked very-large-vocabulary freeform “natural language” over such ergonomic issues as error-detection and recovery, fixed and learnable methods for backing up or skipping forward, consistent turn-taking rules, and user-machine-environment modeling for situated awareness—all user interface subsets that lend themselves to standardization.

Q: Are frequently asked questions, user guides, and YouTube videos enough?

A: FAQs, user guides, and YouTube videos are not enough. External collateral and observation do have their place, but the most effective discovery technique is user exploration. This method of user learning is dissonant with today’s variant, opaque, and ill-considered surface designs, which unknowingly send misleading and inconsistent cues that prevent users from forming an effective theory of the machine’s mind.

Q: What are the big takeaways from your SpeechTEK presentation?

A: The big takeaways from my presentation include a better understanding of the importance of timing, eye-opening detail about grounding junctures, the importance of user-initiated backup, and an interesting and subtle heuristic-development theory for empathic learning—all features that contribute directly to discoverability in voice applications of all kinds.