Go Ahead, Use Speech Recognition

You can use speech recognition packages such as Dragon NaturallySpeaking™, ScanSoft's RealSpeak™, Sail Technologies Media Mining™, or IBM's ViaVoice® to generate a transcript for AutoCaption.

You can also use them to dictate text directly into AutoCaption.

Each voice recognition package manufacturer makes its own accuracy claims.  In our experience, accuracy above 90% can be obtained if you have an excellent audio system and have diligently trained the system to your voice.
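
To put a number on accuracy for your own setup, you can compare the recognizer's output against a hand-corrected transcript of the same audio.  The sketch below shows one common way to do that, counting word-level substitutions, insertions and deletions (the basis of "word error rate").  It is not how any of the vendors above compute their claims, and the function name `word_accuracy` and the sample sentences are made up for illustration.

```python
def word_accuracy(reference: str, hypothesis: str) -> float:
    """Return 1 - (word-level edit distance / number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # recognizer dropped a word
                curr[j - 1] + 1,     # recognizer inserted an extra word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return 1 - prev[-1] / len(ref)


# Hypothetical example: one wrong word ("to" for "two") out of ten.
corrected = "please send two copies of it to sue by friday"
recognized = "please send to copies of it to sue by friday"
print(round(word_accuracy(corrected, recognized), 2))
```

At 90% accuracy, a 1,000-word transcript still leaves roughly 100 words to correct by hand.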

However, I have yet to see a speech recognition package that will consistently make passable transcripts from speakers it wasn't trained on.

As a rule, speech recognition will do a pretty nice job on your voice if you speak clearly and consistently.  Be aware, however, that most of these systems are designed to "learn" one speaker's speech patterns and aren't very good with multiple speakers, accents, background noise or music.

Transcript "cleanup" will always be necessary.  You will need to resolve conflicts like deciding which sound-alike word to use (e.g., "too," "to," "two" and "2," or "Sue," "sue," or "sew") and adding punctuation.
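
Part of this cleanup can be made less tedious by having a script point out the words most likely to need a second look.  The sketch below is only an illustration and is not part of AutoCaption or any of the speech packages; the tiny word table and the function name `flag_homophones` are assumptions, and a real table would be much larger.

```python
import re

# Hypothetical, deliberately small table of sound-alike words.
HOMOPHONE_SETS = [
    {"to", "too", "two", "2"},
    {"their", "there", "they're"},
]

# Map each word to the sorted members of its group for quick lookup.
LOOKUP = {word: sorted(group) for group in HOMOPHONE_SETS for word in group}


def flag_homophones(transcript: str) -> list:
    """Return (word position, word as written, alternatives) for each suspect word."""
    flagged = []
    words = re.finditer(r"[A-Za-z0-9']+", transcript)
    for position, match in enumerate(words):
        word = match.group()
        group = LOOKUP.get(word.lower())
        if group:
            alternatives = [w for w in group if w != word.lower()]
            flagged.append((position, word, alternatives))
    return flagged


for position, word, alternatives in flag_homophones("Send two copies to Sue"):
    print(position, word, alternatives)
```

A script like this only flags candidates; deciding which spelling is right still takes a person reading the sentence.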

The punctuation cleanup task is particularly time-consuming in extemporaneous events, like talk shows, reality shows, lectures, sermons and interviews.  That's because most people don't naturally speak in complete sentences, and it takes a bit of mucking around to make comprehensible captions.

Voice recognition software vendors sometimes suggest vocalizing punctuation -- a popping sound for a period, a long "ess" for a comma, and so forth.  While you might sound comical, it does get the job done.

Some AutoCaption users report that voice recognition does not save time compared to transcription, but they use it anyway to help protect their captioners from repetitive motion injuries.  These folks have employees captioning all day, every day, and don't want to lose good captioners or see their workers' compensation insurance go through the roof.

More details follow

"In Louisiana, we have a problem with Southern drawl and what we call lazy mouth," reports Capt. John Dunn, who is in charge of the Shreveport police department's voice recognition equipment.  "Because of that, the system [non-emergency call routing] often doesn't recognize what [callers] say," he continues, observing that more often than not calls are routed to the wrong place.

According to CNN and the Associated Press (11/17/03), even the interim Chief of Police, Mike Campbell, said, "I can count on one hand when I have been transferred to where I've wanted to go, and I know the system.  I can imagine how frustrating it must be for a citizen."

Site Copyright 2003 Image Logic Corporation
Artwork ©Bradley Bleeker