Publication No.: AU2003268328A1
Publication Date: 2005-02-14
Application No.: AU2003268328
Application Date: 2003-08-29
Applicant: IBM
Inventor: POTAMIANOS GERASIMOS, CONNELL JONATHAN H, HAAS NORMAN, MARCHERET ETIENNE, NETI CHALAPATHY VENKATA
IPC: G10L15/24
Abstract: Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.
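The two steps in the abstract, (i) selecting a data model based on a visual condition and (ii) decoding a portion of the utterance with the selected model, might be sketched as follows. This is a minimal illustration only, not the method claimed in the application: the `Segment` container, the `visual_quality` score in [0, 1], the 0.5 threshold, and the decoder callables are all hypothetical names introduced here, since the abstract does not say how the visual condition is measured or how the models are implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple

AUDIO_ONLY = "audio_only"
AUDIO_VISUAL = "audio_visual"


@dataclass
class Segment:
    """One portion of an input utterance (hypothetical container)."""
    audio: Sequence[float]
    visual: Optional[Sequence[float]]  # None when no usable visual features exist
    visual_quality: float              # assumed score in [0, 1]; higher is better


# A decoder maps a segment to recognized text (stand-in for a real recognizer).
Decoder = Callable[[Segment], str]


def select_model(segment: Segment, threshold: float = 0.5) -> str:
    """Step (i): choose a model from the visual condition.

    Assumption: the visual environment counts as degraded when visual
    features are missing or the quality score falls below the threshold.
    """
    if segment.visual is None or segment.visual_quality < threshold:
        return AUDIO_ONLY
    return AUDIO_VISUAL


def decode_utterance(
    segments: Sequence[Segment],
    decoders: Dict[str, Decoder],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Step (ii): decode each segment with the model selected for it."""
    results: List[Tuple[str, str]] = []
    for segment in segments:
        model_name = select_model(segment, threshold)
        results.append((model_name, decoders[model_name](segment)))
    return results
```

Because selection happens per segment in this sketch, a single utterance can switch to the audio-only model while the visual data is degraded and switch back once it recovers, which matches the abstract's wording of decoding "at least a portion" of the utterance with the selected model.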