Online maximum-likelihood mean and variance normalization for speech recognition

Invention Grant

US08996368B2 Online maximum-likelihood mean and variance normalization for speech recognition (Active)

Patent Title: Online maximum-likelihood mean and variance normalization for speech recognition
Application No.: US13518405

Application Date: 2010-02-22
Publication No.: US08996368B2

Publication Date: 2015-03-31
Inventor: Daniel Willett
Applicant: Daniel Willett
Applicant Address: US MA Burlington
Assignee: Nuance Communications, Inc.
Current Assignee: Nuance Communications, Inc.
Current Assignee Address: US MA Burlington
Agency: Banner & Witcoff, Ltd.
International Application: PCT/US2010/024890 (WO), filed 2010-02-22
International Publication: WO2011/102842 (WO), published 2011-08-25
Main IPC: G10L15/00
IPC: G10L15/00; G10L15/02; G10L15/20; G10L15/08; G10L15/34

Abstract:

A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after some first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.
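
The causal structure described in the abstract can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the patented method: the transform here is a plain per-dimension mean and variance normalization estimated from running moments of the preceding frames, whereas the abstract describes a transform estimated with the help of partial decoding results. The decoding search itself is omitted. The names `online_mvn`, `min_frames`, and `var_floor` are hypothetical.

```python
import math


def online_mvn(frames, min_frames=3, var_floor=1e-8):
    """Causally normalize a sequence of feature vectors.

    Each frame at index t >= min_frames is shifted and scaled using the
    mean and variance of frames 0..t-1 only (the preceding frames).
    Earlier frames are passed through unchanged.

    Assumption: moment matching stands in for the maximum-likelihood
    estimate; no partial decoding results are used in this sketch.
    """
    if not frames:
        return []

    dim = len(frames[0])
    sums = [0.0] * dim      # running sum of each dimension
    sq_sums = [0.0] * dim   # running sum of squares of each dimension
    threshold = max(min_frames, 1)  # need at least one preceding frame
    adjusted = []

    for t, x in enumerate(frames):
        if t >= threshold:
            # Estimate the transform from the t preceding frames.
            means = [s / t for s in sums]
            variances = [
                max(q / t - m * m, var_floor)
                for q, m in zip(sq_sums, means)
            ]
            y = [
                (xi - m) / math.sqrt(v)
                for xi, m, v in zip(x, means, variances)
            ]
        else:
            # Too few preceding frames: leave the vector unadjusted.
            y = list(x)
        adjusted.append(y)

        # Update the statistics only after the current frame was adjusted,
        # so the transform never depends on the frame it is applied to.
        for d, xi in enumerate(x):
            sums[d] += xi
            sq_sums[d] += xi * xi

    return adjusted


if __name__ == "__main__":
    print(online_mvn([[1.0], [3.0], [5.0]], min_frames=2))
```

In a full recognizer, each adjusted vector would be handed to the current frame of the decoding search before the next vector is processed; that coupling is left out here.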

Published/Granted Documents

US20120259632A1 Online Maximum-Likelihood Mean and Variance Normalization for Speech Recognition (published 2012-10-11)

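IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)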