Generic framework for large-margin MCE training in speech recognition

Invention Grant

US08423364B2 Generic framework for large-margin MCE training in speech recognition (In force)



Patent Title: Generic framework for large-margin MCE training in speech recognition
Application No.: US11708440

Application Date: 2007-02-20
Publication No.: US08423364B2

Publication Date: 2013-04-16
Inventor: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He
Applicant: Dong Yu, Alejandro Acero, Li Deng, Xiaodong He
Applicant Address: US WA Redmond
Assignee: Microsoft Corporation
Current Assignee: Microsoft Corporation
Current Assignee Address: US WA Redmond
Agency: Westman, Champlin & Kelly, P.A.
Main IPC: G10L15/14
IPC: G10L15/14; G10L15/00; G10L15/06


Abstract:

A method and apparatus for training an acoustic model are disclosed. A training corpus is accessed and converted into an initial acoustic model. Scores are calculated for a correct class and competitive classes, respectively, for each token given the initial acoustic model. Also, a sample-adaptive window bandwidth is calculated for each training token. From the calculated scores and the sample-adaptive window bandwidth values, loss values are calculated based on a loss function. The loss function, which may be derived from a Bayesian risk minimization viewpoint, can include a margin value that moves a decision boundary such that token-to-boundary distances for correct tokens that are near the decision boundary are maximized. The margin can either be a fixed margin or can vary monotonically as a function of algorithm iterations. The acoustic model is updated based on the calculated loss values. This process can be repeated until an empirical convergence criterion is met.
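
The loss computation outlined in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The log-sum-exp misclassification measure and the sigmoid loss are the forms commonly used in MCE training and are assumed here; the margin is assumed to enter as an additive shift of the sigmoid argument; and the per-token slope merely stands in for the sample-adaptive window bandwidth, whose actual computation the abstract does not specify. All function and parameter names (misclassification_measure, large_margin_loss, margin_schedule, eta, slope, step, cap) are hypothetical.

```python
import math


def misclassification_measure(correct_score, competitor_scores, eta=1.0):
    """Assumed MCE-style measure: d = -g_correct + soft-max of competitor scores.

    d < 0 means the token is classified correctly; |d| grows with the
    distance from the decision boundary.
    """
    top = max(competitor_scores)
    # Numerically stable (1/eta) * log(mean(exp(eta * g_competitor))).
    mean_exp = sum(math.exp(eta * (s - top)) for s in competitor_scores)
    mean_exp /= len(competitor_scores)
    soft_max = top + math.log(mean_exp) / eta
    return -correct_score + soft_max


def _sigmoid(x):
    """Sigmoid that avoids overflow for large negative arguments."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def large_margin_loss(d, slope, margin):
    """Sigmoid loss shifted by a margin (assumed form).

    `slope` is a per-token value standing in for the sample-adaptive
    window bandwidth; how it is derived is not given in the abstract.
    """
    return _sigmoid(slope * (d + margin))


def margin_schedule(iteration, start=0.0, step=0.1, cap=1.0):
    """Hypothetical margin that grows monotonically with the iteration index."""
    return min(start + step * iteration, cap)


if __name__ == "__main__":
    # One token: correct-class score 2.0, two competitor scores.
    d = misclassification_measure(2.0, [1.0, 0.5])
    for it in range(0, 12, 4):
        m = margin_schedule(it)
        print(it, m, large_margin_loss(d, slope=1.0, margin=m))
```

Under these assumptions, a positive margin raises the loss of tokens that are classified correctly but lie close to the decision boundary, so a loss-driven model update tends to push them farther from it. A fixed margin corresponds to calling large_margin_loss with a constant margin instead of margin_schedule.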


Public/Granted literature

US20080201139A1 Generic framework for large-margin MCE training in speech recognition Public/Granted day: 2008-08-21


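IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/08	. Speech classification or search
G10L15/14	.. using statistical models, e.g. hidden Markov models [HMM] (G10L15/18 takes precedence)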