ADAPTIVE AUDIO ENHANCEMENT FOR MULTICHANNEL SPEECH RECOGNITION
    Document type: Invention application
    Status: Under examination (published)

    Publication No.: WO2017164954A1

    Publication date: 2017-09-28

    Application No.: PCT/US2016/068800

    Filing date: 2016-12-28

    Applicant: GOOGLE INC.

    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for neural network adaptive beamforming for multichannel speech recognition are disclosed. In one aspect, a method includes the actions of receiving a first channel of audio data corresponding to an utterance and a second channel of audio data corresponding to the utterance. The actions further include generating a first set of filter parameters for a first filter based on the first channel of audio data and the second channel of audio data and a second set of filter parameters for a second filter based on the first channel of audio data and the second channel of audio data. The actions further include generating a single combined channel of audio data. The actions further include inputting the audio data to a neural network. The actions further include providing a transcription for the utterance.
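The data flow in the abstract (two channels in, one filter per channel, one combined channel out) can be illustrated with a minimal sketch. It assumes the combination is a filter-and-sum operation, which the abstract does not state explicitly. The function names `predict_filters`, `fir_filter`, and `filter_and_sum` are hypothetical, and `predict_filters` is only an energy-based placeholder for the step that generates filter parameters from both channels; it is not the method claimed in the application. The neural network and transcription steps are omitted.

```python
from typing import List, Tuple


def predict_filters(ch1: List[float], ch2: List[float],
                    num_taps: int = 4) -> Tuple[List[float], List[float]]:
    """Hypothetical placeholder for the filter-parameter generation step.

    Both filters depend on both channels, as in the abstract, but the rule
    used here (weighting by relative channel energy) is an assumption made
    purely for illustration.
    """
    e1 = sum(x * x for x in ch1)
    e2 = sum(x * x for x in ch2)
    total = e1 + e2
    w1, w2 = (0.5, 0.5) if total == 0.0 else (e1 / total, e2 / total)
    # Each filter is a scaled unit impulse: a gain with no delay.
    h1 = [w1] + [0.0] * (num_taps - 1)
    h2 = [w2] + [0.0] * (num_taps - 1)
    return h1, h2


def fir_filter(signal: List[float], taps: List[float]) -> List[float]:
    """Causal FIR filtering, truncated to the input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, tap in enumerate(taps):
            if n - k >= 0:
                acc += tap * signal[n - k]
        out.append(acc)
    return out


def filter_and_sum(ch1: List[float], ch2: List[float],
                   h1: List[float], h2: List[float]) -> List[float]:
    """Filter each channel with its own filter, then sum into one channel."""
    y1 = fir_filter(ch1, h1)
    y2 = fir_filter(ch2, h2)
    return [a + b for a, b in zip(y1, y2)]


if __name__ == "__main__":
    first = [1.0, 0.0, 0.0, 0.0]
    second = [1.0, 0.0, 0.0, 0.0]
    h_first, h_second = predict_filters(first, second)
    print(filter_and_sum(first, second, h_first, h_second))
```

In a full system along the lines of the abstract, the single combined channel would then be passed to a neural network acoustic model that yields the transcription.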

