Adaptive audio enhancement for multichannel speech recognition

Invention Grant

US11756534B2 Adaptive audio enhancement for multichannel speech recognition (Active)

Patent Title: Adaptive audio enhancement for multichannel speech recognition
Application No.: US17649058

Application Date: 2022-01-26
Publication No.: US11756534B2

Publication Date: 2023-09-12
Inventor: Bo Li, Ron Weiss, Michiel A. U. Bacchiani, Tara N. Sainath, Kevin William Wilson
Applicant: Google LLC
Applicant Address: US CA Mountain View
Assignee: Google LLC
Current Assignee: Google LLC
Current Assignee Address: US CA Mountain View
Agency: Honigman LLP
Agent: Brett A. Krueger; Grant J. Griffith
Main IPC: G10L15/00
IPC: G10L15/00; G10L15/16; G10L15/20; G10L21/0224; G10L15/26; G10L21/0216

Abstract:

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
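The data flow in the abstract (two channels in, one set of filter parameters per channel, one combined channel out) can be illustrated with a minimal NumPy sketch. Everything named here is an assumption for illustration only: `predict_filters` is a hypothetical stand-in for the filter-prediction step (a fixed random projection, not a trained network), the filters are assumed to be short FIR filters, and the combination is assumed to be a filter-and-sum operation, which the abstract does not spell out. The acoustic-model neural network and the transcription step are not modeled.

```python
import numpy as np


def predict_filters(ch1, ch2, taps=8, seed=0):
    """Hypothetical stand-in for filter prediction.

    Produces one set of FIR taps per channel, each computed from BOTH
    channels, as the abstract describes. A fixed random projection is
    used here in place of a trained model (assumption).
    """
    rng = np.random.default_rng(seed)
    features = np.concatenate([ch1, ch2])
    weights = rng.standard_normal((2 * taps, features.size)) * 0.01
    params = np.tanh(weights @ features)
    return params[:taps], params[taps:]


def filter_and_sum(ch1, ch2, h1, h2):
    """Filter each channel with its own FIR filter, then sum the results.

    The causal convolution is truncated to the input length so the
    combined channel has the same number of samples as each input.
    Summation as the combining step is an assumption.
    """
    y1 = np.convolve(ch1, h1, mode="full")[: ch1.size]
    y2 = np.convolve(ch2, h2, mode="full")[: ch2.size]
    return y1 + y2


# Synthetic two-channel input: 160 samples per channel (illustrative only).
rng = np.random.default_rng(1)
ch1 = rng.standard_normal(160)
ch2 = rng.standard_normal(160)

h1, h2 = predict_filters(ch1, ch2)
combined = filter_and_sum(ch1, ch2, h1, h2)
print(combined.shape)
```

Because the taps are recomputed from the incoming audio, the filters in this sketch change with each input rather than being fixed in advance, which is the sense in which the enhancement is adaptive. The single combined channel would then be the input to the acoustic model.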

Public/Granted literature

US20220148582A1 ADAPTIVE AUDIO ENHANCEMENT FOR MULTICHANNEL SPEECH RECOGNITION Public/Granted day: 2022-05-12

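IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)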