Performance and Benchmarks
In head-to-head comparisons, Mod9 outperforms some of the industry’s most widely used ASR solutions on both accuracy (word error rate and transcript precision) and price.
Word Error Rate, Precision
Comparative Performance
Vendor | Service | Word Error Rate (%) | Transcript Precision (%) | Speaker Labels | Price ($/minute) | Automatic |
TranscribeMe | Verbatim Transcription | 4.3 | 97.6 | ✅ | 2.75 | ❌ |
Mod9 | ASR | 7.1 | 95.6 | ✅ | 0.01 | ✅ |
Google | Cloud Speech-to-Text | 12.3 | 93.2 | ❌ | 0.024 | ✅ |
Amazon | AWS Transcribe | 12.9 | 93.3 | ❌ | 0.024 | ✅ |
VoiceBase | High Accuracy Transcription | 15.3 | 91.5 | ✅ | 0.02 | ✅ |
IBM | Watson Speech-to-Text | 15.7 | 90.7 | ✅ | 0.02 | ✅ |
● This “Switchboard” test set has been widely used for evaluating speech recognition research. As of August 2017, the best reported performance is 5.1% WER by Microsoft Research.
● Word Error Rate (WER) is a measure of “verbatim” transcription accuracy: the number of word insertions, deletions, and substitutions, divided by the number of words in the reference transcript. Conversational “disfluencies” (e.g. repeated words) count toward the score.
● Transcript Precision is the fraction of output words that are correct, not penalizing missed words. This may be more intuitive than WER, and gives consistently high scores to human transcription.
● Speaker labels improve transcription quality and should be easily determined from dual-channel audio recordings, as formatted in the original files used for this evaluation. However, many systems only accept single-channel audio or will automatically downmix dual-channel audio; in these cases, the audio must be split into separate files and submitted as two requests, denoted as (duplex).
● Speaker diarization can be applied by Remeeting, and some of the other systems, to automatically identify speakers from audio that has been mixed down to a single channel, denoted as (mono).
● Punctuation and capitalization can be automatically added by Remeeting and some of the other systems. (This benchmark does not score punctuation or capitalization accuracy.)
● Help us to improve these references! We have discovered mistakes in the past and will share our corrections. Please report bugs to research@remeeting.com
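To make the two accuracy metrics above concrete, here is a minimal Python sketch of how they could be computed for a single utterance. It is an illustration under stated assumptions, not the scoring tool used for this benchmark: the function names (`align_counts`, `word_error_rate`, `transcript_precision`) and the sample sentences are hypothetical, and the text is assumed to be already normalized and split on whitespace.

```python
def align_counts(ref, hyp):
    """Align two word lists by minimum edit distance.

    Returns (hits, substitutions, deletions, insertions).
    """
    # table[i][j] = (errors, hits, subs, dels, ins) for ref[:i] vs hyp[:j]
    table = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    table[0][0] = (0, 0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        table[i][0] = (i, 0, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        table[0][j] = (j, 0, 0, 0, j)  # every output word inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            e, h, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (e, h + 1, s, d, n)      # correct word
            else:
                diag = (e + 1, h, s + 1, d, n)  # substitution
            e, h, s, d, n = table[i - 1][j]
            delete = (e + 1, h, s, d + 1, n)    # reference word missed
            e, h, s, d, n = table[i][j - 1]
            insert = (e + 1, h, s, d, n + 1)    # extra output word
            table[i][j] = min(diag, delete, insert, key=lambda c: c[0])
    return table[-1][-1][1:]


def word_error_rate(ref, hyp):
    """(substitutions + deletions + insertions) / reference word count."""
    hits, subs, dels, ins = align_counts(ref, hyp)
    return (subs + dels + ins) / len(ref)


def transcript_precision(ref, hyp):
    """Fraction of output words that are correct; missed words are not penalized."""
    hits, subs, dels, ins = align_counts(ref, hyp)
    return hits / len(hyp)


# Hypothetical example: the reference contains a repeated word (a disfluency).
reference = "so i i think it is fine".split()
hypothesis = "so i think it was fine really".split()

print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
print(f"Precision: {transcript_precision(reference, hypothesis):.1%}")
```

In this example the output drops the repeated “i”, substitutes “was” for “is”, and adds “really”, so WER counts three errors against seven reference words, while precision counts five correct words out of seven output words. The dropped disfluency raises WER but would not lower precision on its own, which is one reason the two scores can diverge.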