Invention Grant
- Patent Title: Semi-supervised system for multichannel source enhancement through configurable unsupervised adaptive transformations and supervised deep neural network
- Application No.: US15368452
- Application Date: 2016-12-02
- Publication No.: US10347271B2
- Publication Date: 2019-07-09
- Inventor: Francesco Nesta, Xiangyuan Zhao, Trausti Thormundsson
- Applicant: SYNAPTICS INCORPORATED
- Applicant Address: US CA San Jose
- Assignee: SYNAPTICS INCORPORATED
- Current Assignee: SYNAPTICS INCORPORATED
- Current Assignee Address: US CA San Jose
- Agency: Haynes and Boone, LLP
- Main IPC: G10L25/78
- IPC: G10L25/78; H04R3/00; G10L21/0216; G10L21/0208; G10L25/30; G10L21/0272

Abstract:
Various techniques are provided to perform enhanced automatic speech recognition. For example, a subband analysis may be performed that transforms time-domain signals of multiple audio channels into subband signals. An adaptive configurable transformation may also be performed to produce single-channel or multichannel features whose values are correlated with an Ideal Binary Mask (IBM). An unsupervised Gaussian Mixture Model (GMM) may also be fitted to the distribution of the features to produce posterior probabilities, and the posteriors may be combined to produce deep neural network (DNN) feature vectors. A DNN may be provided that predicts oracle spectral gains from the input feature vectors. Spectral processing may be performed to produce an estimate of the target source time-frequency magnitudes from the mixtures and the output of the DNN. Subband synthesis may be performed to transform signals back to the time domain.
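
The analysis, spectral processing, and synthesis stages named in the abstract can be illustrated with a minimal single-channel sketch. It is not the patented implementation. Several things here are assumptions made only for illustration: the STFT filterbank with a square-root Hann window, the frame and hop sizes, the definition of the oracle gain as a clipped target-to-mixture magnitude ratio, and all function names. The GMM and DNN stages are omitted, and the oracle gain computed from the known target stands in for the DNN output.

```python
import numpy as np

FRAME, HOP = 256, 128  # assumed frame length and hop (50% overlap)
# Square root of a periodic Hann window; its squares sum to 1 at 50% overlap.
WINDOW = np.sqrt(np.hanning(FRAME + 1)[:-1])

def analysis(x):
    """Subband analysis: time-domain signal -> (frames, bins) complex STFT."""
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack([x[i * HOP:i * HOP + FRAME] * WINDOW
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def synthesis(X):
    """Subband synthesis: weighted overlap-add back to the time domain."""
    frames = np.fft.irfft(X, n=FRAME, axis=1) * WINDOW
    out = np.zeros(HOP * (len(frames) - 1) + FRAME)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME] += frame
    return out

def ideal_binary_mask(S, N):
    """1 where target energy exceeds interference energy, else 0."""
    return (np.abs(S) ** 2 > np.abs(N) ** 2).astype(float)

def oracle_gain(S, X, eps=1e-12):
    """Assumed definition: target/mixture magnitude ratio clipped to [0, 1]."""
    return np.clip(np.abs(S) / (np.abs(X) + eps), 0.0, 1.0)

def enhance(X, gain):
    """Spectral processing: scale mixture magnitudes, keep the mixture phase."""
    return gain * X

# Synthetic example: a 440 Hz tone (target) in white noise (interference).
rng = np.random.default_rng(0)
t = np.arange(4096) / 16000.0
target = np.sin(2 * np.pi * 440.0 * t)
noise = 0.5 * rng.standard_normal(t.size)
mixture = target + noise

S, N, X = analysis(target), analysis(noise), analysis(mixture)
mask = ideal_binary_mask(S, N)   # the quantity the features should correlate with
gain = oracle_gain(S, X)         # stand-in for the gains a DNN would predict
estimate = synthesis(enhance(X, gain))
```

In a real system the target and interference spectra are unknown at run time, so the gain would come from the trained network rather than from `oracle_gain`; the oracle version is only useful as a training target or as an upper bound when evaluating the pipeline.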