Is voice the next major UI layer for software?

Voice is the next UI

While speech recognition technology has been around since the 1950s, recent advances in computing power and machine learning have made voice interfaces far more practical. The technology to support voice applications is now both relatively inexpensive and powerful.

With these advances, we have seen more voice-driven applications brought to market, including “voice-first” devices like Amazon Echo and Google Home. Unit shipments of these devices have grown exponentially and are predicted to pass 30 million in American homes by the end of this year. One could argue that the companies and products oriented to voice-based use cases will be ahead of the pack in the coming years. Just as the companies that were first to pursue a mobile-first or mobile-only strategy were the winners of the past decade, those that drive voice platforms could be the market leaders of the 2020s.

Automatic speech recognition (ASR) is now at a level that is roughly on par with humans. Just three years ago, average word error rates among the top providers hovered around 25%. Today, the behemoths in the space, Google, Microsoft, and IBM, are all claiming ~5%. Put another way, in a transcript of 100 words, only five might be incorrect. At this level of accuracy, a whole world of novel applications and use cases emerges.
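To make the "five words in 100" figure concrete, here is a minimal sketch of how word error rate is conventionally computed: the number of word substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The function name `word_error_rate` and the normalization used (lowercasing and splitting on whitespace) are assumptions for illustration; this article does not say how any particular provider measures its own figures.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: words are compared after lowercasing and splitting on
    whitespace. Real evaluations often normalize punctuation and numbers too.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# A 100-word reference in which the recognizer gets five words wrong.
reference_words = [f"word{i}" for i in range(100)]
hypothesis_words = list(reference_words)
for position in (3, 20, 41, 67, 88):
    hypothesis_words[position] = "oops"

wer = word_error_rate(" ".join(reference_words), " ".join(hypothesis_words))
print(f"WER: {wer:.0%}")  # prints "WER: 5%"
```

Because insertions count as errors, this measure can exceed 100% when the recognizer outputs many extra words, which is one reason published figures are only comparable when they use the same test set and the same normalization.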

Against the backdrop of advancements in speech technology, the smartphone has reached maturity. The smartphone form factor hasn’t changed much over the last few years, and the functionality across all the major manufacturers is similar. Benedict Evans of venture capital firm Andreessen Horowitz believes smartphones are at the top of their product life cycle (see chart below). If this is true, then the question is: What is going to drive the next growth curve in tech? Is it Augmented Reality, Virtual Reality, or is it Voice?

Sonix - Voice is the new UI

The beauty of voice-based systems is that they leverage an interface that almost all humans can access: language. Even small advances in the technology can theoretically have massive, far-reaching global impact.
