-
1.
Publication No.: CA2203132A1
Publication date: 1997-05-05
Application No.: CA2203132
Filing date: 1995-11-04
Applicant: IBM
Inventor: BANDARA UPALI , KUNZMANN SIEGFRIED , LEWIS BURN L , MOHR KARLHEINZ
IPC: G10L15/183 , G10L15/197 , G10L9/00 , G10L9/18
Abstract: Disclosed are a method and an apparatus for adapting, in particular reducing, the size of a language model comprising word n-grams in a speech recognition system. The invention provides a mechanism to discard those n-grams for which the acoustic part of the system requires less support from the language model in order to recognize correctly. The proposed method is suitable for identifying, at build time of the system, those trigrams in a language model that can be discarded. Also provided is an automatic classification scheme for words that allows the language model to be compressed while retaining accuracy. Moreover, it allows efficient use of sparsely available text corpora, because even singleton trigrams are used when they are helpful. No additional software tools need to be developed, because the main tool, the fast match scoring, is a module readily available in known recognizers. The method is further improved by classifying words according to the common text in which they occur, insofar as they are acoustically distinct from each other. The invention makes it possible to offer speech recognition on low-cost personal computers (PCs), and even on portable computers such as laptops.
-
2.
Publication No.: CA2203132C
Publication date: 2004-11-16
Application No.: CA2203132
Filing date: 1995-11-04
Applicant: IBM
Inventor: KUNZMANN SIEGFRIED , BANDARA UPALI , LEWIS BURN L , MOHR KARLHEINZ
IPC: G10L15/183 , G10L15/197 , G10L9/00 , G10L9/18
Abstract: Disclosed are a method and an apparatus for adapting, in particular reducing, the size of a language model comprising word n-grams in a speech recognition system. The invention provides a mechanism to discard those n-grams for which the acoustic part of the system requires less support from the language model in order to recognize correctly. The proposed method is suitable for identifying, at build time of the system, those trigrams in a language model that can be discarded. Also provided is an automatic classification scheme for words that allows the language model to be compressed while retaining accuracy. Moreover, it allows efficient use of sparsely available text corpora, because even singleton trigrams are used when they are helpful. No additional software tools need to be developed, because the main tool, the fast match scoring, is a module readily available in known recognizers. The method is further improved by classifying words according to the common text in which they occur, insofar as they are acoustically distinct from each other. The invention makes it possible to offer speech recognition on low-cost personal computers (PCs), and even on portable computers such as laptops.
-
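The pruning idea in the abstracts above can be illustrated with a minimal sketch. It is not the patented procedure: the abstracts do not say how the fast match score is turned into a keep-or-discard decision, so the function name `prune_trigrams`, the `acoustic_rank` table (a stand-in for a recognizer's fast match output, where rank 1 means the acoustics alone already pick the word), and the `rank_threshold` parameter are all hypothetical.

```python
from typing import Dict, Tuple

Trigram = Tuple[str, str, str]


def prune_trigrams(counts: Dict[Trigram, int],
                   acoustic_rank: Dict[str, int],
                   rank_threshold: int = 1) -> Dict[Trigram, int]:
    """Keep trigrams whose predicted (third) word needs language-model support.

    acoustic_rank is a hypothetical stand-in for fast match scoring: the
    position of the correct word in the acoustically ranked candidate list.
    A rank at or below rank_threshold is assumed to mean the acoustic part
    recognizes the word without much help, so the trigram is discarded.
    """
    kept: Dict[Trigram, int] = {}
    for trigram, count in counts.items():
        word = trigram[2]
        # Assumption: words without a rank are treated as acoustically hard.
        rank = acoustic_rank.get(word, rank_threshold + 1)
        if rank > rank_threshold:
            kept[trigram] = count
    return kept


# Toy data, invented for illustration only.
counts = {
    ("the", "speech", "recognizer"): 5,
    ("to", "two", "too"): 1,  # singleton, but acoustically confusable
    ("a", "language", "model"): 3,
}
acoustic_rank = {"recognizer": 1, "too": 4, "model": 1}

pruned = prune_trigrams(counts, acoustic_rank)
```

Note that the decision here depends only on the acoustic rank, not on the trigram count, so a singleton trigram survives when its word is acoustically ambiguous, in line with the abstract's remark that singleton trigrams are used when they are helpful.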