You can use speech recognition packages such as Dragon NaturallySpeaking™, ScanSoft's RealSpeak™, Sail Technologies Media Mining™, or IBM's ViaVoice® to generate a transcript for AutoCaption. You can also use them to dictate text directly into AutoCaption.

Each voice recognition package manufacturer makes its own accuracy claims. In our experience, accuracy of 90% or better can be obtained if you have an excellent audio system and have diligently trained the system to your voice.
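
To put an accuracy figure in concrete terms, here is a minimal Python sketch that compares a recognized transcript against a hand-corrected reference and reports word accuracy. It assumes accuracy is measured per word, counting substitutions, insertions, and deletions; vendors may measure it differently. The function names `word_errors` and `word_accuracy` are made up for this illustration and are not part of AutoCaption or any speech recognition package.

```python
def word_errors(reference: str, recognized: str) -> int:
    """Minimum number of word substitutions, insertions, and deletions
    needed to turn the recognized text into the reference text."""
    ref = reference.lower().split()
    hyp = recognized.lower().split()
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j recognized words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # recognizer dropped a word
                curr[j - 1] + 1,     # recognizer inserted an extra word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, recognized: str) -> float:
    """Fraction of reference words the recognizer got right (assumed metric)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference transcript is empty")
    return 1.0 - word_errors(reference, recognized) / ref_len


if __name__ == "__main__":
    reference = "speech recognition will do a good job on your voice"
    recognized = "speech recognition will do a good job on your boys"
    print(f"{word_accuracy(reference, recognized):.0%}")
```

By this measure, 90% accuracy still leaves roughly 100 words to correct in a 1,000-word transcript.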

However, we have yet to see a speech recognition package that consistently produces passable transcripts from speakers it wasn't trained on.

As a rule, speech recognition will do a good job on your voice if you speak clearly and consistently. Be aware, however, that most of these systems are designed to "learn" one speaker's speech patterns and don't cope well with multiple speakers, accents, or background noise or music.

Transcript "cleanup" will always be necessary. You will need to resolve conflicts such as deciding which homophone to use (e.g., "too," "to," "two" and "2," or "Sue" and "sue") and adding punctuation.
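
As a rough illustration of the cleanup step, the Python sketch below flags words in a transcript that belong to a known set of sound-alike words so a person can review them. It is not a feature of AutoCaption or of any recognition package. The `HOMOPHONE_SETS` table and the `flag_homophones` function are hypothetical, and a real list would be much longer.

```python
import re

# Hypothetical, deliberately tiny table of sound-alike word sets.
HOMOPHONE_SETS = [
    {"to", "too", "two", "2"},
    {"their", "there", "they're"},
]


def flag_homophones(transcript: str) -> list[tuple[int, str, list[str]]]:
    """Return (word index, word, alternatives) for each word a human
    should double-check because it sounds like another word."""
    flagged = []
    # Treat letters, digits, and apostrophes as parts of a word.
    words = re.findall(r"[\w']+", transcript)
    for index, word in enumerate(words):
        for group in HOMOPHONE_SETS:
            if word.lower() in group:
                alternatives = sorted(group - {word.lower()})
                flagged.append((index, word, alternatives))
    return flagged


if __name__ == "__main__":
    for index, word, alternatives in flag_homophones("I want to go too"):
        print(f"word {index}: '{word}' (could also be: {', '.join(alternatives)})")
```

The sketch only points out where a choice has to be made. Deciding which spelling is correct still takes a person who understands the sentence.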

The punctuation cleanup task is particularly time-consuming for extemporaneous events such as talk shows, reality shows, lectures, sermons, and interviews. That's because most people don't naturally speak in complete sentences, and it takes some reworking to make comprehensible captions.

Voice recognition software vendors sometimes suggest vocalizing punctuation: a popping sound for a period, a long "ess" for a comma, and so forth. While you might sound comical, it does get the job done.

Some AutoCaption users report that voice recognition does not save time compared to transcription, but they use it anyway to help protect their captioners from repetitive motion injuries. These folks have employees captioning all day, every day and don't want to lose good captioners or see their workers' compensation insurance go through the roof.

"In Louisiana, we have a problem with Southern drawl and what we call lazy mouth," reports Capt. John Dunn, who is in charge of the Shreveport police department's voice recognition equipment. "Because of that, the system [non-emergency call routing] often doesn't recognize what [callers] say," he continues, observing that more often than not calls are routed to the wrong place. According to CNN and the Associated Press (11/17/03), even the interim Chief of Police, Mike Campbell, said, "I can count on one hand when I have been transferred to where I've wanted to go, and I know the system. I can imagine how frustrating it must be for a citizen."

Site Copyright 2003 Image Logic Corporation
Artwork © Bradley Bleeker