Semi-supervised system for multichannel source enhancement through configurable unsupervised adaptive transformations and supervised deep neural network

Invention Grant

US10347271B2 Semi-supervised system for multichannel source enhancement through configurable unsupervised adaptive transformations and supervised deep neural network (Active)


Patent Title: Semi-supervised system for multichannel source enhancement through configurable unsupervised adaptive transformations and supervised deep neural network
Application No.: US15368452

Application Date: 2016-12-02
Publication No.: US10347271B2

Publication Date: 2019-07-09
Inventor: Francesco Nesta, Xiangyuan Zhao, Trausti Thormundsson
Applicant: SYNAPTICS INCORPORATED
Applicant Address: US CA San Jose
Assignee: SYNAPTICS INCORPORATED
Current Assignee: SYNAPTICS INCORPORATED
Current Assignee Address: US CA San Jose
Agency: Haynes and Boone, LLP
Main IPC: G10L25/78
IPC: G10L25/78; H04R3/00; G10L21/0216; G10L21/0208; G10L25/30; G10L21/0272

Abstract:

Various techniques are provided to perform enhanced automatic speech recognition. For example, a subband analysis may be performed that transforms time-domain signals of multiple audio channels into subband signals. An adaptive configurable transformation may also be performed to produce single-channel or multichannel features whose values are correlated to an Ideal Binary Mask (IBM). An unsupervised Gaussian Mixture Model (GMM) may be fitted to the distribution of the features to produce posterior probabilities, and the posteriors may be combined to produce deep neural network (DNN) feature vectors. A DNN may be provided that predicts oracle spectral gains from the input feature vectors. Spectral processing may be performed to produce an estimate of the target source time-frequency magnitudes from the mixtures and the output of the DNN. Subband synthesis may be performed to transform the signals back to the time domain.
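
The analysis, masking, and synthesis stages named in the abstract can be illustrated with a minimal single-channel sketch. It is not the patented implementation: the frame length, hop size, square-root Hann window, test tones, and all function names are assumptions made for illustration, and the mask is computed directly from the known target and noise (an oracle), standing in for the gains that the DNN would predict from the GMM-derived feature vectors.

```python
import numpy as np

FRAME, HOP = 256, 128  # assumed frame length and hop (50% overlap)
# Square-root periodic Hann window, applied at analysis and synthesis so that
# overlapping frames sum back to the original signal.
WINDOW = np.sqrt(np.hanning(FRAME + 1)[:-1])

def subband_analysis(x):
    """Transform a time-domain signal into subband (STFT) frames."""
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack([x[i * HOP:i * HOP + FRAME] * WINDOW
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def subband_synthesis(X):
    """Transform subband frames back to the time domain by overlap-add."""
    frames = np.fft.irfft(X, n=FRAME, axis=1) * WINDOW
    out = np.zeros(HOP * (len(frames) - 1) + FRAME)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME] += frame
    return out

def ideal_binary_mask(target_tf, noise_tf):
    """1 where the target dominates a time-frequency bin, 0 elsewhere."""
    return (np.abs(target_tf) > np.abs(noise_tf)).astype(float)

def enhance(mixture, gains):
    """Spectral processing: apply per-bin gains to the mixture, then resynthesize."""
    return subband_synthesis(gains * subband_analysis(mixture))

# Hypothetical test signals: a 440 Hz target tone plus a 3 kHz interferer.
fs = 16000
t = np.arange(4096) / fs
target = np.sin(2 * np.pi * 440 * t)
noise = 0.5 * np.sin(2 * np.pi * 3000 * t)
mixture = target + noise

mask = ideal_binary_mask(subband_analysis(target), subband_analysis(noise))
estimate = enhance(mixture, mask)

inner = slice(FRAME, -FRAME)  # skip edge frames that lack full overlap
print("mixture error :", np.mean((mixture[inner] - target[inner]) ** 2))
print("enhanced error:", np.mean((estimate[inner] - target[inner]) ** 2))
```

In a deployed system the target and noise are not available separately, which is why the abstract describes features correlated with the IBM and a DNN that predicts the gains; the oracle mask here only shows what those predicted gains are meant to approximate.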

Public/Granted literature

US20170162194A1 SEMI-SUPERVISED SYSTEM FOR MULTICHANNEL SOURCE ENHANCEMENT THROUGH CONFIGURABLE ADAPTIVE TRANSFORMATIONS AND DEEP NEURAL NETWORK
Public/Granted day: 2017-06-08

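IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L25/00	Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00-G10L 21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G3/34)
G10L25/78	. Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems, H04M9/10)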