Model restructuring for client and server based automatic speech recognition

Invention Grant

US08635067B2 Model restructuring for client and server based automatic speech recognition (status: lapsed)

Patent Title: Model restructuring for client and server based automatic speech recognition
Application No.: US12964433

Application Date: 2010-12-09
Publication No.: US08635067B2

Publication Date: 2014-01-21
Inventor: Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
Applicant: Pierre Dognin, Vaibhava Goel, John R. Hershey, Peder A. Olsen
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Otterstedt, Ellenbogen & Kammer, LLP
Agent: Anne V. Dougherty
Main IPC: G10L15/14
IPC: G10L15/14

Abstract:

Access is obtained to a large reference acoustic model for automatic speech recognition. The large reference acoustic model has L states modeled by L mixture models, and the large reference acoustic model has N components. A desired number of components Nc, less than N, to be used in a restructured acoustic model derived from the reference acoustic model, is identified. The desired number of components Nc is selected based on a computing environment in which the restructured acoustic model is to be deployed. The restructured acoustic model also has L states. For each given one of the L mixture models in the reference acoustic model, a merge sequence is built which records, for a given cost function, sequential mergers of pairs of the components associated with the given one of the mixture models. A portion of the Nc components is assigned to each of the L states in the restructured acoustic model. The restructured acoustic model is built by, for each given one of the L states in the restructured acoustic model, applying the merge sequence to a corresponding one of the L mixture models in the reference acoustic model until the portion of the Nc components assigned to the given one of the L states is achieved.
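
The procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes one-dimensional Gaussian components stored as (weight, mean, variance) tuples, moment-matched merging, a weighted-KL-style merge cost, and a simple proportional rule for splitting Nc across states. The abstract leaves the cost function and the assignment rule open, so these choices, and all function names (`merge_pair`, `merge_cost`, `build_merge_sequence`, `allocate`, `apply_merge_sequence`, `restructure`), are hypothetical.

```python
import math
from typing import List, Tuple

Component = Tuple[float, float, float]   # (weight, mean, variance); 1-D is an assumption
Merge = Tuple[int, int, int]             # (id_a, id_b, id_of_merged_component)


def merge_pair(a: Component, b: Component) -> Component:
    """Moment-matched merge: keeps total weight, mean and variance of the pair."""
    wa, ma, va = a
    wb, mb, vb = b
    w = wa + wb
    m = (wa * ma + wb * mb) / w
    v = (wa * (va + (ma - m) ** 2) + wb * (vb + (mb - m) ** 2)) / w
    return (w, m, v)


def merge_cost(a: Component, b: Component) -> float:
    """Assumed cost function: weighted-KL-style penalty for replacing a and b by their merge."""
    w, _, v = merge_pair(a, b)
    return 0.5 * (w * math.log(v) - a[0] * math.log(a[2]) - b[0] * math.log(b[2]))


def build_merge_sequence(mixture: List[Component]) -> List[Merge]:
    """Greedily merge the cheapest pair until one component remains, recording each merger."""
    comps = dict(enumerate(mixture))
    next_id = len(mixture)
    sequence: List[Merge] = []
    while len(comps) > 1:
        ids = sorted(comps)
        i, j = min(
            ((p, q) for k, p in enumerate(ids) for q in ids[k + 1:]),
            key=lambda pq: merge_cost(comps[pq[0]], comps[pq[1]]),
        )
        comps[next_id] = merge_pair(comps.pop(i), comps.pop(j))
        sequence.append((i, j, next_id))
        next_id += 1
    return sequence


def allocate(sizes: List[int], n_c: int) -> List[int]:
    """Assumed assignment rule: one component per state, the rest roughly proportional to size."""
    if not len(sizes) <= n_c <= sum(sizes):
        raise ValueError("need L <= Nc <= N")
    alloc = [1] * len(sizes)
    for _ in range(n_c - len(sizes)):
        open_states = [k for k in range(len(sizes)) if alloc[k] < sizes[k]]
        best = max(open_states, key=lambda k: sizes[k] / alloc[k])
        alloc[best] += 1
    return alloc


def apply_merge_sequence(mixture: List[Component], sequence: List[Merge],
                         target: int) -> List[Component]:
    """Replay recorded mergers until the mixture has `target` components."""
    comps = dict(enumerate(mixture))
    for i, j, new_id in sequence:
        if len(comps) <= target:
            break
        comps[new_id] = merge_pair(comps.pop(i), comps.pop(j))
    return list(comps.values())


def restructure(reference: List[List[Component]], n_c: int) -> List[List[Component]]:
    """Build an L-state model with n_c components in total from the reference model."""
    sequences = [build_merge_sequence(m) for m in reference]
    alloc = allocate([len(m) for m in reference], n_c)
    return [apply_merge_sequence(m, s, a) for m, s, a in zip(reference, sequences, alloc)]
```

Because each merge sequence is computed once per state and only replayed afterwards, the same reference model could in principle be cut down to different values of Nc (for example, a small client model and a larger server model) without recomputing any merge costs.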

Published/granted documents

US20120150536A1 Model restructuring for client and server based automatic speech recognition. Publication date: 2012-06-14

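IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	.Speech classification or search
G10L15/14	..using statistical models, e.g. Hidden Markov Models [HMMs] (G10L15/18 takes precedence)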