Multi-stream target-speech detection and channel fusion

Invention Grant

US11158333B2 Multi-stream target-speech detection and channel fusion (Active)

Patent Title: Multi-stream target-speech detection and channel fusion
Application No.: US16706519

Application Date: 2019-12-06
Publication No.: US11158333B2

Publication Date: 2021-10-26
Inventor: Francesco Nesta, Saeed Mosayyebpour Kaskari
Applicant: SYNAPTICS INCORPORATED
Applicant Address: US CA San Jose
Assignee: SYNAPTICS INCORPORATED
Current Assignee: SYNAPTICS INCORPORATED
Current Assignee Address: US CA San Jose
Agency: Paradice & Li LLP
Main IPC: G10L21/0364
IPC: G10L21/0364; G10L25/60; G10L15/22; G10L25/84; H04R1/40; H04R3/00; H04S3/00; H04L29/06

Abstract:

Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal, target-speech detection logic, and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams; a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines, each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams; and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
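
The weighting and fusion step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how the weights are derived from the detection probabilities, so the sketch simply normalizes the probabilities to sum to 1 and falls back to uniform weights when no stream shows the target. The function names `weights_from_probabilities` and `fuse_streams` are hypothetical, and the streams are modeled as plain lists of samples.

```python
from typing import List, Sequence


def weights_from_probabilities(probs: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Map per-stream target-speech probabilities to fusion weights.

    Assumption: weights are the probabilities normalized to sum to 1.
    The abstract does not specify this mapping.
    """
    if len(probs) == 0:
        raise ValueError("at least one stream is required")
    if any(p < 0.0 or p > 1.0 for p in probs):
        raise ValueError("probabilities must lie in [0, 1]")
    total = sum(probs)
    if total < eps:
        # Assumption: no detector fired, so weight all streams equally.
        return [1.0 / len(probs)] * len(probs)
    return [p / total for p in probs]


def fuse_streams(streams: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Apply one weight per enhanced stream and sum them sample by sample."""
    if len(streams) != len(weights):
        raise ValueError("need exactly one weight per stream")
    if len(streams) == 0:
        raise ValueError("at least one stream is required")
    n = len(streams[0])
    if any(len(s) != n for s in streams):
        raise ValueError("all streams must have the same length")
    return [sum(w * s[t] for w, s in zip(weights, streams)) for t in range(n)]


if __name__ == "__main__":
    # Two hypothetical enhanced streams and their detector probabilities.
    enhanced = [[1.0, 2.0], [3.0, 4.0]]
    detector_probs = [0.5, 0.5]
    w = weights_from_probabilities(detector_probs)
    print(fuse_streams(enhanced, w))
```

In this sketch a stream whose detector reports a higher probability contributes proportionally more to the output; a real system would likely compute the weights per frame rather than once per signal.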

Public/Granted literature

US20200184985A1 MULTI-STREAM TARGET-SPEECH DETECTION AND CHANNEL FUSION Public/Granted day: 2020-06-11

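IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L21/00	Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L19/00 takes precedence)
G10L21/02	.Speech enhancement, e.g. noise reduction or echo cancellation (reducing echo effects in line transmission systems H04B3/20; echo suppression in hands-free telephones H04M9/08)
G10L21/0316	..by changing the amplitude
G10L21/0364	...for improving intelligibility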