Multi-task training architecture and strategy for attention-based speech recognition system

Invention Grant

US11257481B2 Multi-task training architecture and strategy for attention-based speech recognition system (Active)

Patent Title: Multi-task training architecture and strategy for attention-based speech recognition system
Application No.: US16169512

Application Date: 2018-10-24
Publication No.: US11257481B2

Publication Date: 2022-02-22
Inventor: Jia Cui, Chao Weng, Guangsen Wang, Jun Wang, Chengzhu Yu, Dan Su, Dong Yu
Applicant: TENCENT AMERICA LLC
Applicant Address: US CA Palo Alto
Assignee: TENCENT AMERICA LLC
Current Assignee: TENCENT AMERICA LLC
Current Assignee Address: US CA Palo Alto
Agency: Sughrue Mion, PLLC
Main IPC: G10L15/06
IPC: G10L15/06; G10L25/03; G10L25/54; G10L15/10

Abstract:

Methods and apparatuses are provided for sequence-to-sequence (Seq2Seq) speech recognition training performed by at least one processor. The method includes acquiring a training set comprising a plurality of pairs of input data and corresponding target data, encoding the input data into a sequence of hidden states, performing connectionist temporal classification (CTC) model training based on the sequence of hidden states, performing attention model training based on the sequence of hidden states, and decoding the sequence of hidden states to generate target labels by independently performing the CTC model training and the attention model training.
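
The training flow in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the one-layer tanh encoder, the dot-product attention, the reuse of label index 0 as both the CTC blank and the decoder start symbol, and all function and parameter names (`encode`, `ctc_loss`, `attention_loss`, `enc_w`, `ctc_w`, `label_emb`, `out_proj`) are assumptions made for illustration. It shows only that a CTC loss and an attention loss can each be computed separately from the same sequence of hidden states.

```python
import math

def logsumexp(xs):
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_softmax(row):
    z = logsumexp(row)
    return [x - z for x in row]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode(frames, enc_w):
    # Hypothetical shared encoder: one tanh layer applied to each input frame.
    return [[math.tanh(dot(row, f)) for row in enc_w] for f in frames]

def ctc_loss(hidden, target, ctc_w, blank=0):
    # Per-frame label log-probabilities from the shared hidden states.
    log_probs = [log_softmax([dot(row, h) for row in ctc_w]) for h in hidden]
    # Interleave blanks with the target labels: [b, y1, b, y2, ..., b].
    ext = [blank]
    for lab in target:
        ext += [lab, blank]
    size = len(ext)
    # CTC forward algorithm in log space.
    alpha = [-math.inf] * size
    for s in range(min(2, size)):
        alpha[s] = log_probs[0][ext[s]]
    for t in range(1, len(log_probs)):
        new = [-math.inf] * size
        for s in range(size):
            terms = [alpha[s]]
            if s >= 1:
                terms.append(alpha[s - 1])
            # Skipping a blank is allowed only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                terms.append(alpha[s - 2])
            new[s] = logsumexp(terms) + log_probs[t][ext[s]]
        alpha = new
    # Valid paths end on the last label or on the trailing blank.
    return -logsumexp(alpha[-2:])

def attention_loss(hidden, target, label_emb, out_proj, start=0):
    # Teacher-forced decoder: the previous label's embedding is the query.
    loss, prev, dim = 0.0, start, len(hidden[0])
    for lab in target:
        query = label_emb[prev]
        weights = [math.exp(a) for a in log_softmax([dot(query, h) for h in hidden])]
        context = [sum(w * h[d] for w, h in zip(weights, hidden)) for d in range(dim)]
        logits = [dot(row, context) for row in out_proj]
        loss -= log_softmax(logits)[lab]  # cross-entropy for this output step
        prev = lab
    return loss

# Toy example: two 2-dimensional frames, vocabulary {0: blank/start, 1: label}.
frames = [[1.0, 0.0], [0.0, 1.0]]
enc_w = [[0.5, -0.5], [0.25, 0.75]]
ctc_w = [[0.1, 0.2], [0.3, -0.1]]
label_emb = [[0.2, 0.1], [-0.3, 0.4]]
out_proj = [[0.5, 0.0], [0.0, 0.5]]

hidden = encode(frames, enc_w)  # shared sequence of hidden states
print("CTC loss:", ctc_loss(hidden, [1], ctc_w))
print("Attention loss:", attention_loss(hidden, [1], label_emb, out_proj))
```

The two losses are returned separately rather than merged into a single objective, since the abstract only states that the two trainings are performed independently; how they are scheduled or weighted is outside what this sketch assumes.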

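IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	.Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)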