Streaming real-time automatic speech recognition service

Invention Grant

US10777186B1 Streaming real-time automatic speech recognition service (Active)

Patent Title: Streaming real-time automatic speech recognition service
Application No.: US16190047

Application Date: 2018-11-13
Publication No.: US10777186B1

Publication Date: 2020-09-15
Inventor: Stefano Stefani, Pramod Gurunath, Ashish Singh, Katrin Kirchoff, Deepikaa Suresh, Varun Sembium Varadarajan, Vasanth Philomin, Vikram Sathyanarayana Anbazhagan, Pu Paul Zhao, Vijit Gupta, Ruoyu Huang
Applicant: Amazon Technologies, Inc.
Applicant Address: US WA Seattle
Assignee: Amazon Technologies, Inc.
Current Assignee: Amazon Technologies, Inc.
Current Assignee Address: US WA Seattle
Agency: Nicholson De Vos Webster & Elliott LLP
Main IPC: G10L15/00
IPC: G10L15/00; G10L15/06; G10L25/78; G10L15/30; G10L15/04; G10L15/26; G10L15/183

Abstract:

Techniques for streaming real-time automated speech recognition (ASR) are described. A user can stream audio data to a frontend service of the ASR service. The frontend service can establish a bi-directional connection to an audio decoder host to perform ASR on the data stream. The audio decoder host may include a streaming ASR engine which can analyze chunks of the audio data stream using an acoustic model to divide the audio data into words, and a language model to identify sentences made of the words spoken in the audio file. The acoustic model can be trained using short audio sentence data (e.g., on the order of 30 seconds to a few minutes), enabling the transcription service to accurately transcribe short chunks of audio data. The results are then punctuated and normalized. The resulting transcript is then streamed back to the user over the bi-directional connection.
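
The flow described in the abstract (chunked audio in, acoustic model, language model, punctuation and normalization, transcript streamed back) can be illustrated with a minimal Python sketch. This is not the patented implementation: every name here (chunk_stream, transcribe_stream, punctuate_and_normalize, fake_acoustic_model, fake_language_model) is hypothetical, the two models are toy stand-ins that treat the "audio" bytes as ASCII text, and the bi-directional network connection is reduced to a generator that yields results as chunks arrive.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical type aliases: the abstract does not define these interfaces.
AcousticModel = Callable[[bytes], List[str]]            # audio chunk -> words
LanguageModel = Callable[[List[str]], List[List[str]]]  # words -> sentences


def chunk_stream(audio: bytes, chunk_size: int) -> Iterator[bytes]:
    """Yield fixed-size chunks of raw audio; the last chunk may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


def punctuate_and_normalize(words: List[str]) -> str:
    """Toy post-processing: capitalize the first letter and add a full stop."""
    if not words:
        return ""
    text = " ".join(words)
    return text[0].upper() + text[1:] + "."


def transcribe_stream(
    chunks: Iterable[bytes],
    acoustic_model: AcousticModel,
    language_model: LanguageModel,
) -> Iterator[str]:
    """Process chunks one at a time and yield transcript text as it is ready."""
    for chunk in chunks:
        words = acoustic_model(chunk)
        for sentence in language_model(words):
            text = punctuate_and_normalize(sentence)
            if text:
                yield text


def fake_acoustic_model(chunk: bytes) -> List[str]:
    """Stand-in only: pretends the audio bytes are ASCII text and splits them."""
    return chunk.decode("ascii").lower().split()


def fake_language_model(words: List[str]) -> List[List[str]]:
    """Stand-in only: treats all words from one chunk as a single sentence."""
    return [words] if words else []


if __name__ == "__main__":
    incoming = [b"hello world", b"how are you"]
    for line in transcribe_stream(incoming, fake_acoustic_model, fake_language_model):
        print(line)
```

Because transcribe_stream is a generator, each partial transcript is available to the caller as soon as its chunk has been processed, which mirrors the streamed-back results in the abstract. A real service would replace the stand-in models with trained ones and would need to handle words that span chunk boundaries, which this sketch does not attempt.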

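IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)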