In this podcast, Voicebot.ai’s Bret Kinsella talks with Patricia Scanlon, CEO of Soapbox Labs, the leader in automated speech recognition for kids. Here are some of the points made:
1. Existing voice solutions don’t work so well for kids because kids differ from adults both physically (e.g., vocal cords) and behaviorally (e.g., slower or faster speaking speeds).
2. Voice studies tend to note that we are at 95% accuracy today – but that’s not quite accurate because that might happen only if perfect circumstances exist. Sort of like a good Boolean search on the Web – if you put in the perfect search terms, you are more likely to get a better result. With voice, you might get 95% if the circumstances are perfect (e.g., crisp speaker, lack of ambient noise in the background). Bret noted how his research shows that the #1 desire of consumers for voice is to be understood better.
3. Patricia has spent more than six years collecting voice samples from kids. She explains that you need a phonetically balanced dataset to really move the needle for kids: collecting samples from a large number of kids and from different styles of speech, not having kids say the same 100 words, and recording in both controlled & uncontrolled environments. What matters is not just what they’re saying but how they’re saying it. That also means collecting samples from children with different accents and languages. Many adult speech datasets already existed – but only a few decades-old datasets of kids’ speech were around before Patricia started her collection (those old ones were gathered as part of grant-funded, controlled academic exercises, so they had limited utility).
4. Soapbox Labs is in (or will be in) the market for EdTech, language learning, smart toys and gaming. Examples include screening preliterate kids for learning difficulties and screening for fluency levels. A teacher doesn’t have the time to listen to all their kids reading more than sporadically. With voice, this “one-on-one” evaluation can happen more often, as voice can help spot & correct errors and prompt encouragement – and do this cost-effectively on a large scale.
5. Near the end of the podcast, there is a good conversation about possibly moving kid interactions with voice devices “to the edge,” meaning off the cloud (and thus being more private). And whether – and when – this could be done cost-effectively (and how much privacy really matters in this context). Another item discussed is to bifurcate the types of answers given so that they are age appropriate.
6. At the end, there also is a discussion about what the definition of a “conversation” is – right now, voice interactions are instructional & transactional as we’re only scratching the surface of natural language understanding.
Two summers ago, as noted in this article, Universal released a new “Jurassic Park” movie – and as a tie-in, it released a related audio adventure called “Jurassic World Revealed.” This audio game included “premium” chapters that folks could pay for after they heard the first chapter for free.
There are five premium chapters at a cost of $5 each – meaning that you would wind up spending twice the price of a movie ticket if you got hooked on the audio game and listened to all of the premium chapters.
I didn’t spend the money to listen to the premium content (and based on the comments for the game, people enjoyed it but were mad they paid money). It’s a role-playing game where you choose your role and then you are asked questions. The story proceeds based on how you answer the questions.
Universal used voice actors – so I’m sure it cost them a pretty penny to put this together – but I doubt many folks paid for the premium content. But I guess we’ll find out if we see other movie studios following their lead…
So when I figured out that “Yes Sire” wasn’t appropriate for my 9-year-old nephew, we tried the “Magic Door.” It was a good choice. It was an adventure game with a magical land, where you collect hidden items, solve riddles and help creatures. Tolkien stuff. “Alexa, open the magic door.”
Loved this interview by Bret Kinsella of “voicebot.ai” with Tellables’ Amy Stapleton. Amy is one of the first people to use voice assistants to tell stories, as she uses her company, Tellables, as a publishing platform for conversational stories. Here are some of the cool things that I learned during the podcast:
1. Amy distinguishes how her “conversational storytelling” platform differs from “games” even though someone using it gets rewarded in some ways. Her platform offers storytelling content with an interactive component. But what Tellables does isn’t quite gaming even though some “choose your own adventure” games have some storytelling in them.
2. Amy’s “Tricky Genie” was one of the first stories available on Alexa, enabling her to gain significant rewards through Amazon’s reward program (being a first-mover was important; Amy believes it’s important to build an audience first before trying monetization). “Tricky Genie” is a one-on-one experience with over 100 scenarios available.
3. Amy’s latest offering – “My Box of Chocolates” – uses a “Polly voice” to tell a short story. The Polly voice selected for a particular story has a personality that fits that particular story. The stories – typically 200 words or fewer – can be heard by yourself or in a group. Her goal is to offer stories that make you think. The interactive components at the end of the stories help to get you thinking. The interaction hook is an important way to make the voice experience special. After enabling the skill, say “Alexa, open my box of chocolates.”
4. At the end of the short story, a “party question” is provided. If you’re convened as a group, the party question is a great way to provoke a conversation. So you could hold a book club meeting and listen to “My Box of Chocolates” as a way to mix things up for a change. And it saves folks the embarrassment of saying they haven’t read the book!
5. So in a sense, this type of platform is the flip-side of the danger of screens taking us further & further out of our communities – with voice, there is an opportunity to bring people back as a community.
6. For “My Box of Chocolates,” Amy reaches out to authors to submit short stories. Since hearing stories told by a Polly voice on a device is different from reading short stories, there is a bit of an art to creating content that works on this platform. So Amy winds up doing a little bit of training for authors who are new to this.
7. When you build a skill, Amy recommends that you build it so that you can continuously add new content. Keep people coming back for more.
I babysat my 9-year-old nephew over the weekend and was thinking of giving the popular game – “Yes Sire” – a whirl. But then I learned the game was rated “mature” – so that wasn’t happening. There is an easy way to set up “parental controls” via your Alexa app by the way. So we decided to try the game with a few friends. “Alexa, play Yes Sire.”
I can see why the game is so popular. It was fun – and easy to play. Strategy was involved. You sit as a medieval lord of the realm, presented with an expanding array of choices that become more difficult as you go. Make good choices and stay in power. When you play it a second time, the skill asks if you want to make an in-skill purchase – something that has worked well for “Volley,” the company that created “Yes Sire.”
What if you could combine storytelling in a real book with the benefits that voice offers? That’s starting to happen. For example, check out the intro from this article:
Melissa and Matt Hammersley got the idea for Novel Effect when they were expecting their first child, Eleanor. They wanted to create something that would help them bond as a family and use technology to bring a little more magic into her life. A light bulb went off at their baby shower, when a friend did a theatrical reading of a book that would soon become Eleanor’s. What if technology could simulate that experience and turn story time into an almost cinematic experience?
They brought on a team of experts and began building Novel Effect, which uses voice recognition technology to follow along when someone reads a book out loud, adding music, sound effects, and other features.
More recently, the NY Times has started providing invocations embedded in articles so that you can learn more about the article’s topic through your voice assistant.
Trivia games are one of the natural fits for voice. You can play by yourself or with friends. Here’s a list of some of the more popular ones:
1. Jeopardy – The long-running TV show has a variety of voice spin-offs: regular, teen, sports.
2. Three Questions – There are thousands of questions available but you are fed three at a time. You earn points for correct answers – with the possibility of earning a spot on the leaderboard.
3. Question of the Day – You get thrown one question per day.
4. Drivetime – My personal favorite is a mobile app, not a skill. But it’s voice-enabled. I reviewed it a few months ago.
5. Movie Challenge – You hear movie clips and guess what movie they’re from. You can play solo or in teams. “I’ll be back.”
6. Song Quiz – You hear song clips and guess the artist & song title.
7. Beat the Intro – You hear song clips and guess artist & song title. You can play by yourself or in team mode.
8. Trivial Pursuit Tap – Yep, it’s from Hasbro. Six categories of questions as you compete against friends.
9. Official Harry Potter Quiz – For those who can’t get enough of Harry Potter. Three new questions daily – you play by yourself.
10. Who Wants to Be a Millionaire – Like the TV show. There are 15 new questions daily – you play by yourself.
11. Twenty Questions – Not technically trivia, as you are essentially providing the trivia for Alexa to guess.
Yesterday, Google announced its vision for the future of gaming with its “Stadia” cloud platform. As this Verge article notes, many of the details about how Stadia will actually work are still unknown – it won’t be publicly available until later this year.
But the Stadia controller – the only physical piece of Stadia – was made available and Google Assistant is embedded within it (here’s a 2-minute video of when the controller was announced). The first game controller with voice directly available!
It’s too soon to know how big a role voice will play in Stadia – it will depend on how game developers create in-game experiences utilizing it. But it’s definitely something that distinguishes this controller from all others…
By the way, the Verge reporter loved the controller, the first one that Google has ever made. Watching the Stadia launch was uplifting – all sorts of new possibilities for video & gaming were readily apparent. Hope springs eternal…