Multistream acoustic models with dilations

Invention Grant

US11862146B2 Multistream acoustic models with dilations (Active)

Patent Title: Multistream acoustic models with dilations
Application No.: US16920081

Application Date: 2020-07-02
Publication No.: US11862146B2

Publication Date: 2024-01-02
Inventor: Kyu Jeong Han, Tao Ma, Daniel Povey
Applicant: ASAPP, INC.
Applicant Address: US NY New York
Assignee: ASAPP, INC.
Current Assignee: ASAPP, INC.
Current Assignee Address: US NY New York
Agency: GTC Law Group PC & Affiliates
Main IPC: G10L25/24
IPC: G10L25/24; G06N3/045; G10L15/16; G10L15/22; G10L15/06; G06N3/08; G06N3/048

Abstract:

Audio signals of speech may be processed using an acoustic model. An acoustic model may be implemented with multiple streams of processing where different streams perform processing using different dilation rates. For example, a first stream may process features of the audio signal with one or more convolutional neural network layers having a first dilation rate, and a second stream may process features of the audio signal with one or more convolutional neural network layers having a second dilation rate. Each stream may compute a stream vector, and the stream vectors may be combined to a vector of speech unit scores, where the vector of speech unit scores provides information about the acoustic content of the audio signal. The vector of speech unit scores may be used for any appropriate application of speech, such as automatic speech recognition.
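The data flow described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the patented implementation: the dimensions, the single convolutional layer per stream, the ReLU activation, zero padding, concatenation as the combining step, the softmax normalization, and the names `dilated_conv1d` and `multistream_scores` are all assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random


def dilated_conv1d(frames, weights, dilation):
    """One dilated convolution over time, zero-padded so output length equals input length.

    frames:  list of T feature vectors.
    weights: weights[k][o][i] for tap k, output channel o, input channel i.
    """
    num_taps = len(weights)
    out_dim = len(weights[0])
    in_dim = len(frames[0])
    center = num_taps // 2
    outputs = []
    for t in range(len(frames)):
        vec = [0.0] * out_dim
        for k in range(num_taps):
            # Taps are spaced `dilation` frames apart around the current frame.
            src = t + (k - center) * dilation
            if 0 <= src < len(frames):
                for o in range(out_dim):
                    vec[o] += sum(weights[k][o][i] * frames[src][i] for i in range(in_dim))
        outputs.append([max(0.0, v) for v in vec])  # ReLU (assumed activation)
    return outputs


def softmax(logits):
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_tensor(rng, *shape):
    """Nested lists of the given shape filled with small random values."""
    if len(shape) == 1:
        return [rng.uniform(-0.5, 0.5) for _ in range(shape[0])]
    return [random_tensor(rng, *shape[1:]) for _ in range(shape[0])]


def multistream_scores(frames, stream_weights, dilations, output_weights):
    """Run each stream at its own dilation rate, then combine the stream vectors per frame."""
    stream_outputs = [
        dilated_conv1d(frames, w, d) for w, d in zip(stream_weights, dilations)
    ]
    results = []
    for t in range(len(frames)):
        # Combine by concatenating the stream vectors (an assumed combining step).
        combined = [v for stream in stream_outputs for v in stream[t]]
        logits = [sum(w * c for w, c in zip(row, combined)) for row in output_weights]
        results.append(softmax(logits))  # one vector of speech unit scores per frame
    return results


# Hypothetical sizes: 10 frames of 4-dim features, 2 streams, 5 speech units.
rng = random.Random(0)
FEATURE_DIM, STREAM_DIM, NUM_UNITS, NUM_TAPS = 4, 3, 5, 3
DILATIONS = [1, 2]  # first and second dilation rates
frames = random_tensor(rng, 10, FEATURE_DIM)
stream_weights = [random_tensor(rng, NUM_TAPS, STREAM_DIM, FEATURE_DIM) for _ in DILATIONS]
output_weights = random_tensor(rng, NUM_UNITS, STREAM_DIM * len(DILATIONS))
scores = multistream_scores(frames, stream_weights, DILATIONS, output_weights)
```

With a 3-tap kernel, the stream with dilation 1 sees frames t-1, t, t+1, while the stream with dilation 2 sees frames t-2, t, t+2, so each stream covers a different temporal context of the same features before the stream vectors are combined.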

Published/Granted Documents

US20210005182A1 MULTISTREAM ACOUSTIC MODELS WITH DILATIONS Publication/grant date: 2021-01-07

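IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L25/00	Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G3/34)
G10L25/03	. characterised by the type of extracted parameters
G10L25/24	. . the extracted parameters being the cepstrum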