With voicebot.ai reporting that Clubhouse has surpassed 10 million members – I am among them – I put together this 12-minute video explaining how Clubhouse works and my ten cents about whether you should try it, along with a few bonus tips if you do indeed give it a “go”…
As part of Amazon’s long-term goal to make talking to Alexa more natural, a new “infer your intent” functionality has been built. Here’s an excerpt from this article in “The Verge”:
Finding new ways to use Amazon’s Alexa has always been a bit of a pain. Amazon boasts that its AI assistant has more than 100,000 skills, but most are garbage and the useful ones are far from easy to discover. Today, though, Amazon announced it’s launched a new way to surface skills: by guessing what users are after when they talk to Alexa about other tasks.
The company refers to this process as “[inferring] customers’ latent goals.” By this, it means working out any questions that are implied by other queries. Amazon gives the example of a customer asking “How long does it take to steep tea?” to which Alexa will answer “five minutes” before asking the follow-up: “Would you like me to set a timer for five minutes?”
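To make the “latent goal” idea a bit more concrete, here is a toy sketch in Python. It is not how Amazon does it (the article describes the behavior, not the implementation); the function name `suggest_follow_up` and the simple duration rule are my own assumptions, used only to show the shape of “answer first, then offer the implied next step.”

```python
import re
from typing import Optional

# Hypothetical rule: a "how long" question whose answer contains a duration
# implies the user may want a timer. A real assistant would presumably use
# learned models rather than a hand-written pattern like this one.
DURATION = re.compile(
    r"\b(\d+|one|two|three|four|five|six|seven|eight|nine|ten)\s+"
    r"(seconds?|minutes?|hours?)\b",
    re.IGNORECASE,
)


def suggest_follow_up(question: str, answer: str) -> Optional[str]:
    """Return a follow-up offer if the exchange implies a timer, else None."""
    if "how long" not in question.lower():
        return None
    match = DURATION.search(answer)
    if match is None:
        return None
    # Reuse the duration from the answer in the follow-up offer.
    return f"Would you like me to set a timer for {match.group(0).lower()}?"


if __name__ == "__main__":
    print(suggest_follow_up("How long does it take to steep tea?", "Five minutes"))
```

With the tea example from the excerpt, the sketch offers the timer; for a question with no duration in the answer, it stays quiet and returns `None`.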
The annual list of predictions from experts for the voice industry from ‘voicebot.ai’ is always one of the more fascinating reads in this space. The predictions are organized by topic – my favorite topic is “voice moves to mobile devices of all sorts.” The topic of “personalization, emotion recognition & context” blows my mind. Check them all out!
And this “Voice Report” from ‘Rain’ lists the top 9 trends that agency is seeing…
Here’s some commentary from the “Rain” agency:
Digital conversations with friends and colleagues have traditionally revolved around text – typing on our keyboards or phones to communicate messages. Although the pandemic has created new demand for video conferencing, screen fatigue has started setting in, leaving space for a new kind of communication platform driven by voice. Several companies have started to populate this new audio ecosystem, trying to leverage voice conversations for personal and professional use.
From Discord to Clubhouse, these kinds of voice-driven platforms are becoming more common, and now we’re seeing mainstream platforms like Twitter recognizing value here as well. As many of us continue to work remotely, audio chat is emerging as a unique way to maintain human connection and rapport.
Here’s the intro from this article from “The Verge”:
Twitter plans to take on Clubhouse, the invite-only social platform where users congregate in voice chat rooms, with a way for people to create “spaces” for voice-based conversations right on Twitter. In theory, these spaces could provide another avenue for users to have conversations on the platform — but without harassment and abuse from trolls or bad actors, thanks to tools that let creators of these spaces better control the conversation.
The company plans to start testing the feature this year, but notably, Twitter will be giving first access to some of the people who are most affected by abuse and harassment on the platform: women and people from marginalized backgrounds, the company says.
In one of these conversation spaces, you’ll be able to see who is a part of the room and who is talking at any given time. The person who makes the space will have moderation controls and can determine who can actually participate, too. Twitter says it will experiment with how these spaces are discovered on the platform, including ways to invite participants via direct messages or right from a public tweet.
Recently, I blogged about this podcast, in which Voicebot.ai’s Bret Kinsella talks with John Kelvie from Bespoken about how “domains” will replace voice apps. I enjoyed John’s blog about this concept so much that I wanted to excerpt again from the blog:
Most of what is written above hinges on just a couple of key observations:
– Users do not remember invocation names
– Multi-turn dialogs sort-of work – in some cases they are useful and appropriate. But for the most part they annoy users and should be avoided.
If you accept these observations, everything else I’ve laid out follows fairly naturally. Of course, someone might come up with (or perhaps unbeknownst to me, already has) how to (a) improve users’ memories (b) remind them of phrases and experiences without annoying the love out of them, and/or (c) miraculously, markedly improve the state of the art of speech recognition. But assuming none of the above occur in the next 12-18 months, I believe most of what I have written is inevitable. At least, it is if we want to have a vibrant ecosystem for third parties.
The past few days I’ve been blogging about this podcast, in which Voicebot.ai’s Bret Kinsella talks with John Kelvie from Bespoken about how “domains” will replace voice apps. I wanted to offer one last excerpt from John’s blog, pulled from the bottom about how companies that are building their own voice assistants might be better served doing something else:
The devices in column one are inevitable and in some cases are already essential. Column two? Many may seem silly but some nonetheless will prove indispensable.
And these are JUST the devices with voice-capabilities embedded – the march of voice continues to be the march of IoT. Voice is our point of control for the ubiquitous computing power that exists around us. If you imagine a world in which the average cell phone owner has just ONE of each of the above items, the coming wave of voice-enabled devices looks like a tsunami. And if you factor in the devices under their control (thermostats, lights, power switches, appliances, etc.), it becomes even more staggering.
And the very good news is third parties have a huge role to play – the big guys need to provide the platforms and the device access, but they cannot do all the fulfillment. The future of the ecosystem is everyone playing nicely together in this new query-centric, domain-centric world, in which first and third-parties work together seamlessly.
For the platforms, it’s the chance to employ, at massive scale, the wisdom of the crowd – the wisdom of every brand, app builder, API and website on earth. What an amazing achievement it will be.
For third parties, it’s the opportunity to meet users, wherever they are, whatever they are doing – properly done, they will be just a short trip of the tongue away.
In this podcast, Voicebot.ai’s Bret Kinsella talks with John Kelvie from Bespoken about how “domains” will replace voice apps. It’s an interesting discussion – albeit a little hard to follow at times – so I recommend reading John’s blog about this concept before you listen to the podcast. Here’s an excerpt from the blog:
If you want to talk to someone, call your Mom. If you want to build a voice experience:
– One-shot is preferred
– Where one-shot is impossible, quick, contextual follow-ups are the next best option
– As a last resort, attempt an extended multi-turn dialog
Is this to say that extended interactions are a complete failure? Absolutely not! But describing them as conversations misses the point. How are they not conversations?
– They are not open-ended
– They lack important context – such as body language, intonation, emphasis, and past interactions
– They have very poor understanding – both in terms of speech and intent recognition
All of these things are likely to improve radically over a five to ten-year time horizon. But 12-18 months? Not so much. Almost certainly not enough to change what is feasible for most implementers.
RAIN has posted its list of predictions for this year – including this one:
On the heels of Beeb, Erica, and Hey Mercedes, 2020 will see brands in many industries seeking more control over their voice assistant footprint – spanning data and the customer experience – in the form of creating “owned” voice assistants in their brand’s image. There will be another set of major brands – from automotive to consumer electronics, financial services to QSR – that introduce their own voice agents, with their own personas and voices, in the year to come.
The Voice Interoperability Initiative will begin to connect these more disparate, specialist assistants with more generalist intelligences like Alexa, so as to make them more useful in more places.
In this podcast, Voicebot.ai’s Bret Kinsella talks with Maarten Lens-FitzGerald (the “Dutch Cowboy”) (at the 38:40 mark) to get his perspective on how voice will fare in 2020. Here are some of the points made:
1. Last January, Maarten predicted that 2019 would be the ‘year of boredom’ in voice. Other than the negative stories about privacy (or lack thereof), Maarten’s prediction proved fairly accurate.
2. Maarten talked about two types of confusion – for users and for organizations.
3. For users, there will be confusion for three reasons: (a) the “walled garden,” where each major voice provider has its own ecosystem that isn’t necessarily compatible with the others; (b) existing voice users going “deeper” with their voice experiences, since getting beyond playing music and checking the weather can sometimes lead to failed experiences; and (c) the laggards to voice, who will tend to be less tech-savvy and will need better instruction.
4. Before the Web arrived in the mid-’90s, walled gardens existed in the online world (e.g., bulletin boards), but the birth of browsers broke them down. Walled gardens still exist for mobile. Walled gardens for voice are not likely to break down anytime soon because the major players have invested billions and don’t have an incentive to collaborate on unifying standards. But users are beginning to break down that wall a little bit (e.g., using Amazon’s Echo Buds with an Apple iPhone). It’s still too early for most users to live with just one major player’s ecosystem.
5. Organizations are confused about the ROI for voice, so it’s a belief-based technology right now. You’ll need someone convincing the boss with creative numbers and good storytelling. It’s too hard to tell yet what voice works best for – is it customer service? Content? We don’t know yet. In a way, voice faces the same type of challenge that augmented reality does.