-
Publication number: DE19721198C2
Publication date: 2002-01-10
Application number: DE19721198
Filing date: 1997-05-21
Applicant: IBM
Inventor: KANEVSKY DIMITRI, ROUKOS SALIM ESTEPHAN, SEDIVY JAN
Abstract: A statistical language model for inflected languages with very large vocabularies is generated by splitting words into stems, prefixes and endings, and deriving trigrams for the stems, endings and prefixes. The statistical dependence of endings and prefixes on each stem is also obtained, and the resulting language model is a weighted sum of these scores.
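The weighted-sum idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the affix lists, the toy words, the weights and the function names `split_word`, `train` and `score` are all assumptions made for illustration, and for brevity the trigram is taken over stems only, with prefixes and endings conditioned on the current stem.

```python
from collections import Counter

PREFIXES = ("ne",)                 # hypothetical prefix list
ENDINGS = ("ami", "em", "a", "y")  # hypothetical ending list, longest first

def split_word(word):
    """Split a word into (prefix, stem, ending); '' marks an absent part."""
    prefix = ""
    for p in PREFIXES:
        if word.startswith(p) and len(word) > len(p) + 2:
            prefix, word = p, word[len(p):]
            break
    ending = ""
    for e in ENDINGS:
        if word.endswith(e) and len(word) > len(e) + 1:
            word, ending = word[:-len(e)], e
            break
    return prefix, word, ending

def train(words):
    """Collect stem trigram counts and per-stem affix counts."""
    parts = [split_word(w) for w in words]
    stems = [s for _, s, _ in parts]
    tri = Counter(zip(stems, stems[1:], stems[2:]))
    ctx = Counter((a, b) for a, b, _ in tri.elements())
    stem_count = Counter(stems)
    ends = Counter((s, e) for _, s, e in parts)
    pres = Counter((s, p) for p, s, _ in parts)
    return tri, ctx, stem_count, ends, pres

def score(model, history, word, weights=(0.6, 0.2, 0.2)):
    """Weighted sum of stem-trigram, ending-given-stem and prefix-given-stem scores."""
    tri, ctx, stem_count, ends, pres = model
    p, s, e = split_word(word)
    h = tuple(split_word(w)[1] for w in history[-2:])
    p_stem = tri[h + (s,)] / ctx[h] if ctx[h] else 0.0
    n = stem_count[s]
    p_end = ends[(s, e)] / n if n else 0.0
    p_pre = pres[(s, p)] / n if n else 0.0
    w_stem, w_end, w_pre = weights  # assumed weights, not from the patent
    return w_stem * p_stem + w_end * p_end + w_pre * p_pre

model = train(["zena", "vidy", "zenami", "zena", "vidy", "zeny"])
print(score(model, ["zena", "vidy"], "zenami"))
```

Relative-frequency estimates are used here without smoothing, so unseen stems score zero; a practical model would need a back-off or smoothing scheme.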
-
Publication number: DE19721198A1
Publication date: 1997-12-11
Application number: DE19721198
Filing date: 1997-05-21
Applicant: IBM
Inventor: KANEVSKY DIMITRI, ROUKOS SALIM ESTEPHAN, SEDIVY JAN
Abstract: Roots and endings are identified in a large number of words and extracted, and the language text data is converted into a text consisting of roots and endings as special dictionary entries. A vocabulary is formed from the converted text. For each root, a set of probabilities is generated for the endings, the size of the set being determined by the number of endings. The results of extracting the identified roots and endings are used to generate the language model for arbitrary combinations of roots and endings, selecting those which yield the highest recognition score.
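The text-conversion step described here can be sketched as follows in Python. This is only an illustration under stated assumptions: the ending list, the "+" marker for ending tokens, the toy text and the names `split_root`, `convert` and `ending_probs` are hypothetical and do not come from the patent.

```python
from collections import Counter, defaultdict

ENDINGS = ("ami", "em", "a", "y")  # hypothetical ending list, longest first

def split_root(word):
    """Return (root, ending); ending is '' when no listed ending applies."""
    for e in ENDINGS:
        if word.endswith(e) and len(word) > len(e) + 1:
            return word[:-len(e)], e
    return word, ""

def convert(text):
    """Rewrite text as a stream of roots and '+ending' tokens."""
    tokens = []
    for word in text.split():
        root, ending = split_root(word)
        tokens.append(root)
        if ending:
            tokens.append("+" + ending)
    return tokens

def ending_probs(text):
    """For each root, estimate a probability for every ending seen with it."""
    counts = defaultdict(Counter)
    for word in text.split():
        root, ending = split_root(word)
        counts[root][ending] += 1
    return {
        root: {e: n / sum(c.values()) for e, n in c.items()}
        for root, c in counts.items()
    }

text = "zena vidy zenami zena"
tokens = convert(text)
vocabulary = sorted(set(tokens))  # vocabulary formed from the converted text
probs = ending_probs(text)
print(vocabulary)
print(probs["zen"])
```

Each root ends up with as many probabilities as it has observed endings, which mirrors the statement that the size of the set depends on the number of endings.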
-
Publication number: GB2331392B
Publication date: 2001-09-26
Application number: GB9820182
Filing date: 1998-09-17
Applicant: IBM
Abstract: A fast vocabulary independent method for spotting words in speech utilizes a preprocessing step and a coarse-to-detailed search strategy for spotting a word/phone sequence in speech. The preprocessing includes a Viterbi-beam phone level decoding using a tree-based phone language model. The coarse search matches phone-ngrams to identify regions of speech as putative word hits, and the detailed search performs an acoustic match at the putative hits with a model of the given word included in the vocabulary of the recognizer.
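The coarse search stage can be pictured with a short Python sketch. It assumes the phone-level decoding has already produced a flat list of phone labels, and it replaces the detailed acoustic match with nothing at all: it only proposes candidate regions. The function names, the sliding-window scheme and the `min_fraction` threshold are assumptions for illustration, not details taken from the patent.

```python
def ngrams(seq, n=3):
    """All contiguous n-grams of a sequence, as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]

def coarse_search(decoded, query, n=3, min_fraction=0.5):
    """Return (start, end) index ranges in `decoded` that are putative hits.

    A window is a putative hit when at least `min_fraction` of the query's
    phone n-grams also occur inside that window.
    """
    wanted = set(ngrams(query, n))
    if not wanted:
        return []
    window = len(query)
    hits = []
    for start in range(len(decoded) - window + 1):
        found = wanted & set(ngrams(decoded[start:start + window], n))
        if len(found) / len(wanted) >= min_fraction:
            hits.append((start, start + window))
    return hits

# Hypothetical phone string standing in for the output of the decoding step.
decoded = "sil k ae t s ae t sil d ao g sil".split()
print(coarse_search(decoded, "k ae t".split()))
print(coarse_search(decoded, "d ao g z".split()))
```

Overlapping windows are returned separately in this sketch; a detailed search would then rescore each putative hit against an acoustic model of the word.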
-
Publication number: GB2331392A
Publication date: 1999-05-19
Application number: GB9820182
Filing date: 1998-09-17
Applicant: IBM
Abstract: A fast vocabulary independent method for spotting words in speech utilizes a preprocessing step and a coarse-to-detailed search strategy for spotting a word/phone sequence in speech. The preprocessing includes a Viterbi-beam phone level decoding (14) using a tree-based phone language model. The coarse search (20) matches phone-ngrams to identify regions of speech as putative word hits, and the detailed search (22) performs an acoustic match at the putative hits with a model of the given word included in the vocabulary of the recognizer.
-