Invention Grant
- Patent Title: Multistream acoustic models with dilations
- Application No.: US16920081
- Application Date: 2020-07-02
- Publication No.: US11862146B2
- Publication Date: 2024-01-02
- Inventor: Kyu Jeong Han, Tao Ma, Daniel Povey
- Applicant: ASAPP, INC.
- Applicant Address: US NY New York
- Assignee: ASAPP, INC.
- Current Assignee: ASAPP, INC.
- Current Assignee Address: US NY New York
- Agency: GTC Law Group PC & Affiliates
- Main IPC: G10L25/24
- IPC: G10L25/24; G06N3/045; G10L15/16; G10L15/22; G10L15/06; G06N3/08; G06N3/048

Abstract:
Audio signals of speech may be processed using an acoustic model. An acoustic model may be implemented with multiple streams of processing where different streams perform processing using different dilation rates. For example, a first stream may process features of the audio signal with one or more convolutional neural network layers having a first dilation rate, and a second stream may process features of the audio signal with one or more convolutional neural network layers having a second dilation rate. Each stream may compute a stream vector, and the stream vectors may be combined to a vector of speech unit scores, where the vector of speech unit scores provides information about the acoustic content of the audio signal. The vector of speech unit scores may be used for any appropriate application of speech, such as automatic speech recognition.
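The data flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`dilated_conv1d`, `run_stream`, `speech_unit_scores`), the kernel shared across feature dimensions, the ReLU nonlinearity, the zero padding, the combination by concatenation followed by a linear projection, and the softmax normalization are all assumptions made for illustration. The sketch only shows two streams that differ in dilation rate, each producing a per-frame stream vector, with the stream vectors then combined into a vector of speech unit scores.

```python
import math
import random


def dilated_conv1d(frames, kernel, dilation):
    """Dilated 1-D convolution over time, followed by ReLU.

    frames: list of T frames, each a list of D floats.
    kernel: list of K taps, shared across feature dimensions (an assumption).
    Positions outside the signal are treated as zero, so the output has T frames.
    """
    num_frames, dim = len(frames), len(frames[0])
    center = len(kernel) // 2
    out = []
    for t in range(num_frames):
        row = [0.0] * dim
        for k, tap in enumerate(kernel):
            src = t + (k - center) * dilation  # taps are spaced `dilation` frames apart
            if 0 <= src < num_frames:
                for d in range(dim):
                    row[d] += tap * frames[src][d]
        out.append([max(0.0, v) for v in row])
    return out


def run_stream(frames, kernels, dilation):
    """One stream: a stack of convolutional layers that all use the same dilation rate."""
    for kernel in kernels:
        frames = dilated_conv1d(frames, kernel, dilation)
    return frames  # one stream vector per frame


def speech_unit_scores(frames, streams, weights):
    """Combine the stream vectors of each frame into a vector of speech unit scores.

    streams: list of (kernels, dilation) pairs, one per stream.
    weights: num_units rows, each of length D * len(streams) (hypothetical projection).
    """
    stream_outputs = [run_stream(frames, kernels, rate) for kernels, rate in streams]
    scores = []
    for t in range(len(frames)):
        combined = [v for output in stream_outputs for v in output[t]]  # concatenate
        logits = [sum(w * x for w, x in zip(row, combined)) for row in weights]
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        scores.append([e / total for e in exps])  # softmax over speech units
    return scores


# Toy usage: 20 frames of 4-dimensional features, two streams with dilation rates 1 and 3.
random.seed(0)
T, D, NUM_UNITS = 20, 4, 5
features = [[random.gauss(0.0, 1.0) for _ in range(D)] for _ in range(T)]
smoothing = [0.25, 0.5, 0.25]
streams = [([smoothing, smoothing], 1), ([smoothing, smoothing], 3)]
weights = [[random.gauss(0.0, 0.1) for _ in range(D * len(streams))] for _ in range(NUM_UNITS)]
scores = speech_unit_scores(features, streams, weights)
```

With a 3-tap kernel, a layer with dilation rate 1 sees 3 consecutive frames, while a layer with dilation rate 3 spans 7 frames using the same number of taps, so the two streams view the audio at different temporal resolutions.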
Public/Granted literature
- US20210005182A1 MULTISTREAM ACOUSTIC MODELS WITH DILATIONS (Public/Granted day: 2021-01-07)