And you thought the guy in the airport terminal talking on speaker was a problem.
The future might get loud, at least if Tobias Dengel’s vision comes true. Dengel, president at digital services provider WillowTree and author of the 2023 book The Sound of the Future, envisions a reality in which people mostly talk to their devices to get the results they need, from a text message to an on-demand movie to generative AI output, like entire lines of code from a verbal prompt.
Early voice-tech iterations involved one mode of communication: a speech input (“Hey, Siri!”) and a speech output (Siri reads the weather). For the technology to truly find its voice, its developers must offer many modes of communication, according to Dengel, like verbal prompts leading to lightning-fast results through text, imagery, or other presentations.
“That’s our core thesis: That every app is going to be voice-powered; it’s going to be a multimodal experience,” Dengel said.
Dengel told IT Brew more about how he sees and hears the future.
The responses below have been edited for length and clarity.
What’s an example of a multimodal experience?
Tobias Dengel: The one that we have created as users every single day—without realizing it—is chat or texting. Everyone’s speaking into their mobile device because it’s so much faster. But we don’t listen to the response; we read the response. So, even though text-messaging apps weren’t designed as multimodal, we as users have made them multimodal.
Why do you think voice tech hasn’t had the impact of ChatGPT or other emerging tech?
Dengel: It’s just been applied incorrectly because it’s new…The original applications were “Hey, we have voice technology, so let’s replicate human conversation.” That’s actually not going to be the use case that wins. It’s “Yes, we’re going to speak to devices,” but we don’t want to listen to them. We want them to just do things and respond to us. We’re just at the start of that, and apps are starting to do that.
What kind of future do you envision with voice tech? Is it positive? Is it loud?
Dengel: Adoption of voice is kind of a bathtub curve right now; it’s heavily used by people who are 20 and under, and it’s heavily used by people who are 65 and older. And the reason for people 65 and older using it is they never really learned how to type, right? They hated the computer interface, especially on the phone. So, they’re always talking into the device. And, kids are just growing up with this from day one, and their expectation is they should be able to ask the computer, phone, or device anything, and it just kind of does it.
I think we’re all going to be talking to our devices and apps a lot more, and it’s going to make the world much, much more efficient. Just think about all the time you spend…typing. If that could be reduced by two-thirds, by primarily using voice, think of the benefits of that. It frees us up from drudgery and allows us to focus on the more interesting, creative parts of the human experience at the end of the day.