What is word error rate?
We love sharing with you more about automated speech transcription.
Word error rate formula
Word error rate, often referred to as WER, is a way to measure the performance of an automatic speech recognition (ASR) system. It is tricky to measure because the "ASR result" can contain a different number of words than the "Voice input," so the two cannot simply be compared word by word.
Here is a simple way to understand how WER is calculated:
WER = (Substitutions + Deletions + Insertions) / Number of words in the voice input
To help clarify further, here are some definitions:
Deletion by ASR system:
Voice input: I surf small waves
ASR result: I surf waves
Insertion by ASR system:
Voice input: I surf waves
ASR result: I surf small waves
Substitution by ASR system:
Voice input: I surf small waves
ASR result: I surf all waves
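To make the calculation concrete, here is a minimal Python sketch that scores the three examples above. It assumes the usual approach of taking the smallest number of word-level substitutions, deletions, and insertions (the edit distance) that turns the voice input into the ASR result, then dividing by the number of words in the voice input. The function name word_error_rate is hypothetical, and splitting on whitespace and lowercasing are simplifying assumptions, not a description of how any particular ASR vendor scores its system.

```python
def word_error_rate(voice_input: str, asr_result: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: words are separated by whitespace and compared
    case-insensitively. Real scoring tools usually apply more
    text normalization than this.
    """
    ref = voice_input.lower().split()
    hyp = asr_result.lower().split()
    if not ref:
        raise ValueError("voice_input must contain at least one word")

    # prev[j] holds the minimum number of edits needed to turn the
    # first i-1 reference words into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


# The three examples from the definitions above.
print(f"{word_error_rate('I surf small waves', 'I surf waves'):.2%}")      # 25.00%
print(f"{word_error_rate('I surf waves', 'I surf small waves'):.2%}")      # 33.33%
print(f"{word_error_rate('I surf small waves', 'I surf all waves'):.2%}")  # 25.00%
```

Each example contains exactly one error. The insertion example scores higher than the other two because its voice input has only three words, so a single error is divided by a smaller number. Because insertions count as errors, WER can exceed 100% when the ASR result contains many extra words.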
Who is winning?
Speech recognition technology has come a long way since the 1950s. Our earlier post, "A short history of speech recognition," covers some of the key events along the way. It also discusses how we've reached (or almost reached, depending on who you talk to) an inflection point in automated speech recognition.
The largest technology companies, including Google, IBM, and Microsoft, are all vying for the accuracy title. Below is a chronology of the claims made in 2017:
Mar 2017: IBM claims 5.5% word error rate
May 2017: Google claims 4.9% word error rate
Aug 2017: Microsoft claims 5.1% word error rate
We'll continue to update this as new claims are made.