Invention Grant
US09378735B1 Estimating speaker-specific affine transforms for neural network based speech recognition systems
In force
- Patent Title: Estimating speaker-specific affine transforms for neural network based speech recognition systems
- Application No.: US14135474
- Application Date: 2013-12-19
- Publication No.: US09378735B1
- Publication Date: 2016-06-28
- Inventor: Sri Venkata Surya Siva Rama Krishna Garimella, Bjorn Hoffmeister, Nikko Strom
- Applicant: Amazon Technologies, Inc.
- Applicant Address: US WA Seattle
- Assignee: Amazon Technologies, Inc.
- Current Assignee: Amazon Technologies, Inc.
- Current Assignee Address: US WA Seattle
- Agency: Knobbe, Martens, Olson & Bear, LLP
- Main IPC: G10L15/16
- IPC: G10L15/16; G10L15/06; G10L15/20; G10L13/08

Abstract:
Features are disclosed for estimating affine transforms in Log Filter-Bank Energy Space (“LFBE” space) in order to adapt artificial neural network-based acoustic models to a new speaker or environment. Neural network-based acoustic models may be trained using concatenated LFBEs as input features. The affine transform may be estimated by minimizing the least squares error between corresponding linear and bias transform parts for the resultant neural network feature vector and some standard speaker-specific feature vector obtained for a GMM-based acoustic model using constrained Maximum Likelihood Linear Regression (“cMLLR”) techniques. Alternatively, the affine transform may be estimated by minimizing the least squares error between the resultant transformed neural network feature and some standard speaker-specific feature obtained for a GMM-based acoustic model.
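
The second alternative in the abstract, a least-squares fit between the transformed neural network feature and a speaker-specific target feature, can be illustrated with a minimal NumPy sketch. This is not the patented procedure. It assumes paired frames are already available and that the target features (for example, cMLLR-adapted features from a GMM system) have been mapped into a space directly comparable with the transformed LFBE features. The function names `estimate_affine_transform` and `apply_affine_transform` and the array shapes are hypothetical, chosen only for illustration.

```python
import numpy as np


def estimate_affine_transform(lfbe, target):
    """Fit (A, b) minimizing sum_t ||A @ lfbe[t] + b - target[t]||^2.

    Assumptions (not from the patent text):
      lfbe   -- array of shape (num_frames, dim_in), input LFBE frames
      target -- array of shape (num_frames, dim_out), speaker-specific
                target features already comparable with A @ x + b
    """
    num_frames = lfbe.shape[0]
    # Append a constant 1 to each frame so the bias is estimated
    # jointly with the linear part in a single least-squares solve.
    augmented = np.hstack([lfbe, np.ones((num_frames, 1))])
    # Solve min_W ||augmented @ W - target||^2; W has shape (dim_in + 1, dim_out).
    solution, *_ = np.linalg.lstsq(augmented, target, rcond=None)
    A = solution[:-1].T  # linear part, shape (dim_out, dim_in)
    b = solution[-1]     # bias part, shape (dim_out,)
    return A, b


def apply_affine_transform(lfbe, A, b):
    """Apply the estimated transform to every frame: x -> A @ x + b."""
    return lfbe @ A.T + b
```

In this sketch, the adapted frames returned by `apply_affine_transform` would be concatenated and fed to the neural network acoustic model in place of the unadapted LFBEs. The first alternative in the abstract, which matches the linear and bias parts separately, is not shown here.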