Large margin training for attention-based end-to-end speech recognition
Abstract:
A method of attention-based end-to-end (E2E) automatic speech recognition (ASR) training includes performing cross-entropy training of a model based on one or more input features of a speech signal, performing beam searching over the cross-entropy-trained model to generate an n-best list of output hypotheses, and determining a one-best hypothesis from the generated n-best list. The method further includes determining a character-based gradient and a word-based gradient, based on the cross-entropy-trained model and a loss function in which a distance between a reference sequence and the determined one-best hypothesis is maximized, and backpropagating the determined character-based gradient and the determined word-based gradient through the model to update it.
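The large-margin step described above can be illustrated with a minimal sketch. The function names (`edit_distance`, `large_margin_loss`) and the hinge-style formulation are illustrative assumptions, not the patent's claimed implementation: the margin is taken to be the edit distance between the reference sequence and the one-best hypothesis, applied at the character level or the word level depending on how the sequences are tokenized.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (characters or words)."""
    m, n = len(ref), len(hyp)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i          # deletions
    for j in range(n + 1):
        dp[0][j] = j          # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n]


def large_margin_loss(ref_score, hyp_score, ref, hyp, scale=1.0):
    """Hinge-style large-margin loss (illustrative formulation, not the
    patent's exact loss): penalize the model unless the reference's score
    exceeds the one-best hypothesis's score by at least a margin
    proportional to their edit distance."""
    margin = scale * edit_distance(ref, hyp)
    return max(0.0, margin - (ref_score - hyp_score))


# Character-based margin: sequences are strings (lists of characters).
char_loss = large_margin_loss(4.0, 4.5, "cat", "cut")
# Word-based margin: the same functions applied to word lists.
word_loss = large_margin_loss(4.0, 4.5, "the cat sat".split(), "the cut sat".split())
```

In a full system these scalar losses would be computed from the model's sequence log-probabilities and differentiated to obtain the character-based and word-based gradients that are backpropagated to update the model.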