Novel software that automatically and accurately synchronizes pre-segmented transcripts with their corresponding videos.
Synchronizing captions with a video presentation raises several problems: speech recognition errors, difficulty aligning recognized words with the spoken dialogue, and difficulty estimating when unrecognized words were spoken.
Researchers at the University of California, Santa Barbara have developed novel software that automatically and accurately synchronizes pre-segmented transcripts with their corresponding videos. The system first creates a new speech recognition system (SRS) language model for each video, which greatly improves accuracy in identifying spoken words. It then uses the time stamps of the recognized words to time-stamp the aligned captions. Finally, unrecognized words are given estimated time stamps along with a reported error bound, giving the content creator a measure of caption accuracy. This software is applicable to video captioning.
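The final step, estimating time stamps for unrecognized words, can be illustrated with a minimal sketch. It is not the researchers' method, which the description does not detail. It assumes simple linear interpolation between the nearest recognized words on either side, and it assumes words are spoken in transcript order, so the true time of an unrecognized word lies between its two neighboring anchors. The function name `estimate_timestamps` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple


def estimate_timestamps(
    times: List[Optional[float]], start: float, end: float
) -> List[Tuple[float, float]]:
    """Return (time, error_bound) in seconds for every transcript word.

    times[i] is the recognizer's time stamp for word i, or None if the
    word was not recognized. start and end are the segment boundaries,
    used as anchors when a run of unrecognized words touches an edge.
    Assumption: linear interpolation between anchors (illustrative only).
    """
    n = len(times)
    result: List[Tuple[float, float]] = [(0.0, 0.0)] * n
    prev_idx, prev_t = -1, start  # last anchor seen so far
    i = 0
    while i < n:
        t = times[i]
        if t is not None:
            # Recognized word: keep its time stamp, no estimation error.
            result[i] = (t, 0.0)
            prev_idx, prev_t = i, t
            i += 1
            continue
        # Find the end of this run of unrecognized words.
        j = i
        while j < n and times[j] is None:
            j += 1
        next_t = times[j] if j < n else end
        gap = j - prev_idx  # number of word intervals between the anchors
        for k in range(i, j):
            est = prev_t + (next_t - prev_t) * (k - prev_idx) / gap
            # The true time lies in [prev_t, next_t], so the estimate is
            # off by at most the distance to the farther anchor.
            bound = max(est - prev_t, next_t - est)
            result[k] = (est, bound)
        i = j
    return result
```

For example, `estimate_timestamps([1.0, None, None, 4.0], 0.0, 5.0)` places the two unrecognized words evenly between the anchors at 1.0 s and 4.0 s. A wide bound flags captions whose timing a content creator may want to check by hand.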
This software is available for licensing.
Captioning, Autocap, indmedia