Deep learning models for speech recognition

Invention Grant

US11562733B2 Deep learning models for speech recognition (Active)

Patent Title: Deep learning models for speech recognition
Application No.: US16542243

Application Date: 2019-08-15
Publication No.: US11562733B2

Publication Date: 2023-01-24
Inventor: Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Gregory Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubhabrata Sengupta, Adam Coates, Andrew Ng
Applicant: BAIDU USA LLC
Applicant Address: US CA Sunnyvale
Assignee: BAIDU USA LLC
Current Assignee: BAIDU USA LLC
Current Assignee Address: US CA Sunnyvale
Agency: North Weber & Baugh LLP
Main IPC: G10L15/06
IPC: G10L15/06; G10L15/26; G10L15/16; G06N3/04; G06N3/08

Abstract:

Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary, nor even the concept of a “phoneme,” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained. Embodiments of the system can also handle challenging noisy environments better than widely used, state-of-the-art commercial speech systems.

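IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/06	. Creation of reference templates; training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L15/14 takes precedence)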