Invention Grant
US09378735B1 Estimating speaker-specific affine transforms for neural network based speech recognition systems (in force)

Estimating speaker-specific affine transforms for neural network based speech recognition systems
Abstract:
Features are disclosed for estimating affine transforms in Log Filter-Bank Energy Space (“LFBE” space) in order to adapt artificial neural network-based acoustic models to a new speaker or environment. Neural network-based acoustic models may be trained using concatenated LFBEs as input features. The affine transform may be estimated by minimizing the least squares error between corresponding linear and bias transform parts for the resultant neural network feature vector and some standard speaker-specific feature vector obtained for a GMM-based acoustic model using constrained Maximum Likelihood Linear Regression (“cMLLR”) techniques. Alternatively, the affine transform may be estimated by minimizing the least squares error between the resultant transformed neural network feature and some standard speaker-specific feature obtained for a GMM-based acoustic model.
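The second estimation approach in the abstract, fitting an affine transform by least squares so that transformed input features match a set of target speaker-specific features, can be illustrated with a minimal sketch. This is not the patented procedure: the function name `estimate_affine_lstsq`, the feature dimension, and the synthetic data are assumptions made for illustration, and the sketch omits how the target features would actually be derived from a cMLLR-adapted GMM-based acoustic model, as well as the concatenation of LFBE frames.

```python
import numpy as np


def estimate_affine_lstsq(x, y):
    """Fit A and b minimizing sum_t ||A x_t + b - y_t||^2.

    x: (n_frames, dim) input features (stand-in for LFBE vectors).
    y: (n_frames, dim) target features (stand-in for speaker-specific targets).
    Returns A with shape (dim, dim) and b with shape (dim,).
    """
    n_frames = x.shape[0]
    # Append a constant column so the bias is estimated jointly with A.
    x_aug = np.hstack([x, np.ones((n_frames, 1))])
    # Solve x_aug @ w ~= y in the least squares sense; w has shape (dim + 1, dim).
    w, *_ = np.linalg.lstsq(x_aug, y, rcond=None)
    a = w[:-1].T  # linear part
    b = w[-1]     # bias part
    return a, b


# Synthetic check (hypothetical data): recover a known affine transform.
rng = np.random.default_rng(0)
dim, n_frames = 4, 200
x = rng.normal(size=(n_frames, dim))
a_true = np.eye(dim) + 0.1 * rng.normal(size=(dim, dim))
b_true = rng.normal(size=dim)
y = x @ a_true.T + b_true

a_est, b_est = estimate_affine_lstsq(x, y)
x_adapted = x @ a_est.T + b_est  # adapted features that would be fed to the network
```

Because the synthetic targets are noise-free, the fit recovers the generating transform up to numerical precision; with real adaptation data the result would only be the best affine approximation in the least squares sense.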