Out of vocabulary pattern learning
Abstract:
A method for adapting a speech recognition system to out-of-vocabulary words, comprising: decoding, by a hybrid speech recognition system, speech that includes out-of-vocabulary terms, thereby generating graphemic transcriptions of the speech with a mixture of recognized in-vocabulary words and unrecognized sub-words, while keeping track of the decoded segments of the speech; determining, in the transcriptions, sequences of sub-words as candidate out-of-vocabulary words based on a first condition with respect to the lengths of the sequences of sub-words and a second condition with respect to the number of repetitions of the sequences; audibly presenting to a user the candidate out-of-vocabulary words from the corresponding segments of the speech according to the track; receiving from the user indications of valid words corresponding to the audible presentations of the sequences of sub-words in the candidate out-of-vocabulary words; and training a speech recognition system to additionally recognize the candidate out-of-vocabulary words, thereby adapting the speech recognition to recognize out-of-vocabulary words; wherein the method is performed on at least one computerized apparatus configured to perform the method. An apparatus for performing the same is also provided.
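The candidate-determination step can be illustrated with a minimal sketch. The abstract does not specify the form of the two conditions, so this sketch assumes both are minimum thresholds: a run of consecutive sub-words qualifies if it contains at least `min_subwords` sub-words and the same run occurs at least `min_repeats` times. The `Token` structure, the function names, and the default threshold values are hypothetical and chosen only for illustration; the start and end times stand in for the "track" of decoded segments.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Token:
    """One decoded unit (hypothetical structure, not from the source)."""
    text: str
    is_subword: bool   # True for an unrecognized sub-word
    start: float       # segment start time in seconds (the "track")
    end: float         # segment end time in seconds


def subword_runs(tokens: List[Token]) -> List[List[Token]]:
    """Split a transcription into maximal runs of consecutive sub-words."""
    runs: List[List[Token]] = []
    current: List[Token] = []
    for tok in tokens:
        if tok.is_subword:
            current.append(tok)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)
    return runs


def find_oov_candidates(
    tokens: List[Token],
    min_subwords: int = 2,   # assumed form of the first (length) condition
    min_repeats: int = 2,    # assumed form of the second (repetition) condition
) -> Dict[Tuple[str, ...], List[Tuple[float, float]]]:
    """Map each candidate sub-word sequence to the speech segments where it occurs."""
    segments: Dict[Tuple[str, ...], List[Tuple[float, float]]] = defaultdict(list)
    for run in subword_runs(tokens):
        if len(run) >= min_subwords:
            key = tuple(t.text for t in run)
            segments[key].append((run[0].start, run[-1].end))
    return {seq: segs for seq, segs in segments.items() if len(segs) >= min_repeats}
```

The returned segment times are what a later step would use to play the corresponding audio back to the user for confirmation; that playback and the retraining step are outside the scope of this sketch.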