Voice is the next UI
While speech recognition technology has been around since the 1950s, recent advances in computing power and machine learning have made voice interfaces far more practical. The technology to support voice applications is now both relatively inexpensive and powerful.
With these advances, we have seen more voice-driven applications brought to market, including “voice-first” devices like Amazon Echo and Google Home. Unit shipments of these devices have grown exponentially and are predicted to pass 30 million in American homes by the end of this year. One could argue that the companies and products oriented to voice-based use cases will be ahead of the pack in the coming years. Just as the companies that were first to pursue a mobile-first or mobile-only strategy were the winners of the past decade, those that drive voice platforms could be the market leaders of the 2020s.
Automatic speech recognition (ASR) is now at a level that is roughly on par with humans. Just three years ago, average word error rates among the top providers hovered around 25%. Today, the behemoths in the space, Google, Microsoft, and IBM, are all claiming ~5%. Put another way, in a transcript of 100 words, only five might be incorrect. At this level of accuracy, a whole world of novel applications and use cases emerges.
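To make the word error rate figure concrete, here is a minimal sketch of how it is commonly computed: the number of word substitutions, deletions, and insertions needed to turn the machine transcript into a human reference transcript, divided by the number of words in the reference. The function name, the lowercase-and-split-on-whitespace normalization, and the sample sentences are assumptions for illustration; this is not how any particular provider measures its own numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word inserted in hypothesis
                prev[j - 1] + cost,  # match, or one word substituted
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical 20-word reference with one misrecognized word ("quiet" -> "quite").
reference = ("the quick brown fox jumps over the lazy dog near the quiet "
             "river bank while the sun sets slowly today")
hypothesis = ("the quick brown fox jumps over the lazy dog near the quite "
              "river bank while the sun sets slowly today")

print(f"WER: {word_error_rate(reference, hypothesis):.0%}")  # prints "WER: 5%"
```

One wrong word in twenty gives the same 5% rate as five wrong words in a hundred. Because insertions count as errors too, the rate can exceed 100% for a very poor transcript, so "five incorrect words out of 100" is a convenient simplification rather than a strict definition.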
Against the backdrop of advancements in speech technology, the smartphone has reached maturity. The smartphone form factor hasn’t changed much over the last few years, and the functionality across all the major manufacturers is similar. Benedict Evans of venture capital firm Andreessen Horowitz believes smartphones are at the top of their product life cycle (see chart below). If this is true, then the question is: What is going to drive the next growth curve in tech? Is it augmented reality, virtual reality, or voice?
The beauty of voice-based systems is that they leverage an interface that almost all humans can access: language. Even small advances in the technology can theoretically have massive, far-reaching global impact.