Method, apparatus, device and computer readable storage medium for recognizing and decoding voice based on streaming attention model

Invention Grant

US11355113B2 Method, apparatus, device and computer readable storage medium for recognizing and decoding voice based on streaming attention model (Status: In force)

Patent Title: Method, apparatus, device and computer readable storage medium for recognizing and decoding voice based on streaming attention model
Application No.: US16813271

Application Date: 2020-03-09
Publication No.: US11355113B2

Publication Date: 2022-06-07
Inventor: Junyao Shao, Sheng Qian, Lei Jia
Applicant: Baidu Online Network Technology (Beijing) Co., Ltd.
Applicant Address: CN Beijing
Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Current Assignee: Baidu Online Network Technology (Beijing) Co., Ltd.
Current Assignee Address: CN Beijing
Agency: Nixon Peabody LLP
Priority: CN201910646762.1 20190717
Main IPC: G10L15/22
IPC: G10L15/22; G10L15/197; G10L15/02; G10L15/32

Abstract:

A method, apparatus, device, and computer readable storage medium for recognizing and decoding a voice based on a streaming attention model are provided. The method may include generating a plurality of acoustic paths for decoding the voice using the streaming attention model, and then merging those of the plurality of acoustic paths that have identical last syllables to obtain a plurality of merged acoustic paths. The method may further include selecting a preset number of acoustic paths from the plurality of merged acoustic paths as retained candidate acoustic paths. Embodiments of the present disclosure rest on the concept that the acoustic score of a current voice fragment is affected only by the immediately preceding voice fragment and is independent of earlier voice history; accordingly, candidate acoustic paths with identical last syllables are merged.
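
The merge-then-select step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `AcousticPath` type, the function names, the use of log-probability scores where higher is better, and the rule of keeping only the best-scoring path within each group of paths sharing a last syllable are all assumptions made for illustration, since the abstract does not specify how merged paths are scored.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AcousticPath:
    """Hypothetical container for one candidate acoustic path."""
    syllables: Tuple[str, ...]  # decoded syllable sequence (assumed non-empty)
    score: float                # assumed log-probability; higher is better


def merge_by_last_syllable(paths: List[AcousticPath]) -> List[AcousticPath]:
    """Merge paths whose last syllables are identical.

    Assumption: "merging" keeps the highest-scoring path of each group.
    """
    best: Dict[str, AcousticPath] = {}
    for path in paths:
        key = path.syllables[-1]
        if key not in best or path.score > best[key].score:
            best[key] = path
    return list(best.values())


def prune(paths: List[AcousticPath], preset_number: int) -> List[AcousticPath]:
    """Retain the preset number of best-scoring merged paths."""
    return sorted(paths, key=lambda p: p.score, reverse=True)[:preset_number]


# Example with made-up syllables and scores.
candidates = [
    AcousticPath(("ni", "hao"), -1.0),
    AcousticPath(("li", "hao"), -2.0),  # same last syllable as the path above
    AcousticPath(("ni", "men"), -1.5),
    AcousticPath(("ta", "de"), -3.0),
]
retained = prune(merge_by_last_syllable(candidates), preset_number=2)
```

Under these assumptions, the two paths ending in "hao" collapse into one before the preset-number cutoff is applied, so near-duplicate hypotheses do not occupy more than one retained slot.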

Public/Granted literature

US20210020175A1 Method, apparatus, device and computer readable storage medium for recognizing and decoding voice based on streaming attention model. Publication date: 2021-01-21

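IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/22	. Procedures used during a speech recognition process, e.g. man-machine dialogue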