
Is voice the next major UI layer for software?

Yes. The technology to support voice applications is now both relatively inexpensive and powerful.


While speech recognition technology has been around since the 1950s, recent advances in computing power and machine learning have made voice interfaces far more practical. The technology to support voice applications is now both relatively inexpensive and powerful.

With these advances, more voice-driven applications have come to market, including “voice-first” devices like Amazon Echo and Google Home. Unit shipments of these devices have grown rapidly and are predicted to pass 30 million in American homes by the end of this year. One could argue that companies and products built around voice-based use cases will be ahead of the pack in the coming years. Just as the companies that were first to pursue a mobile-first or mobile-only strategy were the winners of the past decade, those that drive voice platforms could be the market leaders of the 2020s.

Automatic speech recognition (ASR) is now at a level that is roughly on par with humans. Just three years ago, average word error rates among the top providers hovered around 25%. Today, the behemoths in the space, Google, Microsoft, and IBM, are all claiming ~5%. Put another way, in a transcript of 100 words, only five might be incorrect. At this level of accuracy, a whole world of novel applications and use cases emerges.
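To make the "five words in 100" figure concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer's output, divided by the number of words in the reference. The function name `word_error_rate` and the sample transcripts are illustrative assumptions, not taken from any provider's tooling, and real evaluations usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name). Splits on whitespace only;
    no case or punctuation normalization is applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


# A made-up 100-word reference in which the recognizer gets 5 words wrong.
reference_words = [f"word{i}" for i in range(100)]
hypothesis_words = list(reference_words)
for index in (0, 20, 40, 60, 80):
    hypothesis_words[index] = "mistake"

wer = word_error_rate(" ".join(reference_words), " ".join(hypothesis_words))
print(f"WER: {wer:.2f}")
```

Because insertions count as errors, this measure can exceed 100% when the output contains many extra words, which is one reason the "five wrong words out of 100" reading is a simplification.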

Against the backdrop of these advancements in speech technology, the smartphone has reached maturity. The smartphone form factor hasn’t changed much over the last few years, and functionality is similar across all the major manufacturers. Benedict Evans of venture capital firm Andreessen Horowitz believes smartphones are at the top of their product life cycle (see chart below). If this is true, then the question is: what is going to drive the next growth curve in tech? Is it augmented reality, virtual reality, or voice?

Sonix - Voice is the new UI

The beauty of voice-based systems is that they leverage an interface that almost all humans can access: language. Even small advances in the technology can theoretically have massive, far-reaching global impact.
