Systems and methods for extracting meaning from multimodal inputs using finite-state devices

Invention Grant

US08355916B2 Systems and methods for extracting meaning from multimodal inputs using finite-state devices (In force)

Patent Title: Systems and methods for extracting meaning from multimodal inputs using finite-state devices
Application No.: US13485574

Application Date: 2012-05-31
Publication No.: US08355916B2

Publication Date: 2013-01-15
Inventor: Srinivas Bangalore, Michael J. Johnston
Applicant: Srinivas Bangalore, Michael J. Johnston
Applicant Address: US GA Atlanta
Assignee: AT&T Intellectual Property II, L.P.
Current Assignee: AT&T Intellectual Property II, L.P.
Current Assignee Address: US GA Atlanta
Main IPC: G10L17/00
IPC: G10L17/00

Abstract:

Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more of the other modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
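
The gesture-to-speech compensation described in the abstract can be illustrated with a small finite-state sketch. This is not the patented implementation. It assumes a toy multimodal grammar written as a transducer over (word, gesture) pairs, a gesture lattice written as a plain acceptor, and hypothetical gesture symbols "Gp" (person) and "Go" (organization). The names GRAMMAR_ARCS, speech_model_from_gestures, and accepts are illustrative only. Composing the grammar with the gesture lattice and keeping the word side yields a word acceptor that a speech recognizer could, under these assumptions, use as its language model.

```python
from collections import defaultdict

EPS = None  # epsilon: the word on this arc consumes no gesture symbol

# Hypothetical multimodal grammar as a finite-state transducer.
# Each arc is (source state, spoken word, gesture symbol or EPS, target state).
GRAMMAR_ARCS = [
    (0, "email", EPS, 1),
    (0, "call", EPS, 1),
    (1, "this", EPS, 2),
    (2, "person", "Gp", 3),
    (2, "organization", "Go", 3),
]
GRAMMAR_START, GRAMMAR_FINALS = 0, {3}


def speech_model_from_gestures(gesture_arcs, g_start, g_finals):
    """Compose the grammar with a gesture lattice and keep only the word side.

    gesture_arcs is a list of (source, gesture symbol, target) arcs.
    Returns (start, arcs, finals) for an acceptor over words whose states
    are (grammar state, lattice state) pairs.
    """
    by_src = defaultdict(list)
    for src, sym, dst in gesture_arcs:
        by_src[src].append((sym, dst))

    start = (GRAMMAR_START, g_start)
    arcs, finals = defaultdict(list), set()
    seen, stack = {start}, [start]
    while stack:
        state = stack.pop()
        q, g = state
        if q in GRAMMAR_FINALS and g in g_finals:
            finals.add(state)
        for src, word, gesture, dst in GRAMMAR_ARCS:
            if src != q:
                continue
            if gesture is EPS:
                # The word needs no gesture, so the lattice state is unchanged.
                targets = [(dst, g)]
            else:
                # The word is licensed only if the lattice offers this gesture.
                targets = [(dst, g2) for sym, g2 in by_src[g] if sym == gesture]
            for target in targets:
                arcs[state].append((word, target))
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
    return start, arcs, finals


def accepts(model, words):
    """Return True if the word sequence is allowed by the derived model."""
    start, arcs, finals = model
    current = {start}
    for w in words:
        current = {t for s in current for word, t in arcs[s] if word == w}
    return bool(current & finals)


# A lattice in which the gesture recognizer saw a single "person" gesture.
lattice = [(0, "Gp", 1)]
model = speech_model_from_gestures(lattice, 0, {1})
print(accepts(model, ["email", "this", "person"]))        # True
print(accepts(model, ["email", "this", "organization"]))  # False
```

In this sketch the recognized gesture removes "organization" from the word sequences the speech side will accept, which is the sense in which one mode compensates for another. A real system would carry weights on the arcs and use lattices with many competing paths.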

Public/Granted literature

US20120303370A1 SYSTEMS AND METHODS FOR EXTRACTING MEANING FROM MULTIMODAL INPUTS USING FINITE-STATE DEVICES Publication date: 2012-11-29

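IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L17/00	Speaker identification or verification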