-
Publication number: US11694710B2
Publication date: 2023-07-04
Application number: US17484208
Filing date: 2021-09-24
Applicant: Synaptics Incorporated
Inventor: Francesco Nesta, Saeed Mosayyebpour Kaskari
CPC classification number: G10L21/0364, G10L15/22, G10L25/60, G10L25/84, H04R1/406, H04R3/005, H04S3/008, H04L65/60, H04S2400/01
Abstract: Audio processing systems and methods include an audio sensor array configured to receive a multichannel audio input and generate a corresponding multichannel audio signal, target-speech detection logic, and an automatic speech recognition engine or VoIP application. An audio processing device includes a target speech enhancement engine configured to analyze a multichannel audio input signal and generate a plurality of enhanced target streams, a multi-stream target-speech detection generator comprising a plurality of target-speech detector engines each configured to determine a probability of detecting a specific target-speech of interest in the stream, wherein the multi-stream target-speech detection generator is configured to determine a plurality of weights associated with the enhanced target streams, and a fusion subsystem configured to apply the plurality of weights to the enhanced target streams to generate an enhancement output signal.
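The fusion step in this abstract can be illustrated with a minimal sketch. The abstract says only that weights are determined and applied to the enhanced target streams; the rule used here (weights proportional to each stream's detection probability, with an equal-weight fallback) and the names `fusion_weights` and `fuse_streams` are assumptions made for illustration, not the patented method.

```python
from typing import List, Sequence


def fusion_weights(detection_probs: Sequence[float]) -> List[float]:
    """Turn per-stream detection probabilities into weights that sum to 1.

    Assumption: weights are proportional to the detection probability.
    """
    total = sum(detection_probs)
    if total <= 0.0:
        # No stream shows the target: fall back to equal weights.
        return [1.0 / len(detection_probs)] * len(detection_probs)
    return [p / total for p in detection_probs]


def fuse_streams(streams: Sequence[Sequence[float]],
                 weights: Sequence[float]) -> List[float]:
    """Weighted sum of equal-length enhanced streams, sample by sample."""
    if len(streams) != len(weights):
        raise ValueError("need exactly one weight per stream")
    length = len(streams[0])
    if any(len(s) != length for s in streams):
        raise ValueError("all streams must have the same length")
    return [sum(w * s[n] for w, s in zip(weights, streams))
            for n in range(length)]


if __name__ == "__main__":
    # Three hypothetical enhanced streams with made-up detection probabilities.
    streams = [[0.2, 0.4, 0.1], [0.0, 0.1, 0.0], [0.5, 0.5, 0.5]]
    weights = fusion_weights([0.8, 0.1, 0.1])
    print(fuse_streams(streams, weights))
```

A stream whose detector reports a higher probability contributes more to the fused output, which is one plausible reading of "apply the plurality of weights to the enhanced target streams".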
-
Publication number: US11373667B2
Publication date: 2022-06-28
Application number: US15957829
Filing date: 2018-04-19
Applicant: SYNAPTICS INCORPORATED
Inventor: Saeed Mosayyebpour Kaskari, Francesco Nesta, Trausti Thormundsson, Thomas Aaron Gulliver
IPC: H04B15/00, G10L21/0232, G10L21/038, G10L21/0208, G10L25/18, G10L21/00
Abstract: Systems and methods for processing an audio signal include an audio input operable to receive an input signal comprising a time-domain, single-channel audio signal, a subband analysis block operable to transform the input signal to a frequency domain input signal comprising a plurality of k-spaced under-sampled subband signals, a reverberation reduction block operable to reduce reverberation effect, including late reverberation, in the plurality of k-spaced under-sampled subband signals, a noise reduction block operable to reduce background noise from the plurality of k-spaced under-sampled subband signals, and a subband synthesis block operable to transform the subband signals to the time-domain, thereby producing an enhanced output signal.
-
Publication number: US11264017B2
Publication date: 2022-03-01
Application number: US16900790
Filing date: 2020-06-12
Applicant: SYNAPTICS INCORPORATED
Inventor: Alireza Masnadi-Shirazi, Francesco Nesta
Abstract: Systems and methods include a plurality of audio input components configured to generate a plurality of audio input signals, and a logic device configured to receive the plurality of audio input signals, determine whether the plurality of audio signals comprise target audio associated with an audio source, estimate a relative location of the audio source with respect to the plurality of audio input components based on the plurality of audio signals and a determination of whether the plurality of audio signals comprise the target audio, and process the plurality of audio signals to generate an audio output signal by enhancing the target audio based on the estimated relative location. The logic device is further configured to use relative transfer-based covariance to construct a directional covariance matrix aligned across frequency bands and find a direction that minimizes beam power subject to distortionless criteria.
-
Publication number: US20210219053A1
Publication date: 2021-07-15
Application number: US16740297
Filing date: 2020-01-10
Applicant: SYNAPTICS INCORPORATED
Inventor: Alireza Masnadi-Shirazi, Francesco Nesta
Abstract: Embodiments described herein provide a combined multi-source time difference of arrival (TDOA) tracking and voice activity detection (VAD) mechanism that is applicable for generic array geometries, e.g., a microphone array that lies on a plane. The combined multi-source TDOA tracking and VAD mechanism scans the azimuth and elevation angles of the microphone array in microphone pairs, based on which a planar locus of physically admissible TDOAs can be formed in the multi-dimensional TDOA space of multiple microphone pairs. In this way, the multi-dimensional TDOA tracking reduces the number of calculations usually involved in traditional TDOA by performing the TDOA search for each dimension separately.
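The "planar locus of physically admissible TDOAs" can be illustrated with a short sketch. It assumes a far-field (plane-wave) source, a speed of sound of 343 m/s, and a hypothetical three-microphone array lying in the z = 0 plane; the function names and the scan grid are illustrative only and are not taken from the application.

```python
import math
from typing import List, Sequence, Tuple

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature

Point = Tuple[float, float, float]


def direction_vector(azimuth: float, elevation: float) -> Point:
    """Unit vector pointing toward a far-field source (angles in radians)."""
    return (
        math.cos(elevation) * math.cos(azimuth),
        math.cos(elevation) * math.sin(azimuth),
        math.sin(elevation),
    )


def pair_tdoa(mic_a: Point, mic_b: Point,
              azimuth: float, elevation: float) -> float:
    """Far-field TDOA in seconds: arrival time at mic_a minus arrival at mic_b."""
    u = direction_vector(azimuth, elevation)
    baseline = [b - a for a, b in zip(mic_a, mic_b)]
    return sum(d * c for d, c in zip(baseline, u)) / SPEED_OF_SOUND


def scan_tdoa_locus(mics: Sequence[Point],
                    pairs: Sequence[Tuple[int, int]],
                    n_az: int = 36, n_el: int = 9) -> List[Tuple[float, ...]]:
    """Scan azimuth and elevation; return one TDOA tuple (one entry per pair)."""
    locus = []
    for i in range(n_az):
        az = 2.0 * math.pi * i / n_az
        for j in range(n_el):
            el = (math.pi / 2.0) * j / (n_el - 1)
            locus.append(tuple(pair_tdoa(mics[a], mics[b], az, el)
                               for a, b in pairs))
    return locus


if __name__ == "__main__":
    # Hypothetical planar array: three microphones 10 cm apart in the z = 0 plane.
    mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0)]
    pairs = [(0, 1), (0, 2), (1, 2)]
    for point in scan_tdoa_locus(mics, pairs, n_az=4, n_el=3):
        print(point)
```

For a planar array, every pair's TDOA is a linear function of the same two direction components, so the scanned points fall on a two-dimensional plane inside the three-dimensional TDOA space. In this example the third coordinate always equals the second minus the first.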
-
Publication number: US10762427B2
Publication date: 2020-09-01
Application number: US15909930
Filing date: 2018-03-01
Applicant: SYNAPTICS INCORPORATED
Inventor: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
Abstract: Classification training systems and methods include a neural network for classification of input data, a training dataset providing segmented labeled training data, and a classification training module operable to train the neural network using the training data. A forward pass processing module is operable to generate neural network outputs for the training data using weights and biases for the neural network, and a backward pass processing module is operable to update the weights and biases in a backward pass, including obtaining Region of Target (ROT) information from the training data, generating a forward-backward masking based on the ROT information, the forward-backward masking placing at least one restriction on a neural network output path, computing modified forward and backward variables based on the neural network outputs and the forward-backward masking, and updating the weights and biases.
-
Publication number: US20200219530A1
Publication date: 2020-07-09
Application number: US16735575
Filing date: 2020-01-06
Applicant: SYNAPTICS INCORPORATED
Inventor: Francesco Nesta, Alireza Masnadi-Shirazi
IPC: G10L25/84, H04R1/40, H04R3/00, H04R5/027, G10L25/18, G10L15/16, G10L25/21, G10L15/22, G06N3/08
Abstract: Systems and methods include a first voice activity detector operable to detect speech in a frame of a multichannel audio input signal and output a speech determination, a constrained minimum variance adaptive filter operable to receive the multichannel audio input signal and the speech determination and minimize a signal variance at the output of the filter, thereby producing an equalized target speech signal, a mask estimator operable to receive the equalized target speech signal and the speech determination and generate a spectral-temporal mask to discriminate a target speech from noise and interference speech, and a second voice activity detector operable to detect voice in a frame of the speech discriminated signal. The systems and methods also include an audio input sensor array including a plurality of microphones, each microphone generating a channel of the multichannel audio input signal, and a sub-band analysis module operable to decompose each of the channels into a plurality of frequency sub-bands.
-
Publication number: US20180268798A1
Publication date: 2018-09-20
Application number: US15922848
Filing date: 2018-03-15
Applicant: SYNAPTICS INCORPORATED
CPC classification number: G10K11/16, G10K11/1785, G10K11/17885, G10K2210/1081, G10K2210/3025, G10K2210/3028, G10K2210/3035, G10K2210/3056, G10L21/0232, G10L25/78, G10L2021/02165, H04M1/6058, H04R1/08, H04R1/1083, H04R3/005, H04R3/04, H04R2201/107, H04R2410/05, H04R2430/03
Abstract: Systems and methods for enhancing a headset user's own voice include an outside microphone, an inside microphone, audio input components operable to receive a plurality of time-domain microphone signals, including an outside microphone signal from the outside microphone and an inside microphone signal from the inside microphone, a subband decomposition module operable to transform the time-domain microphone signals to frequency domain subband signals, a voice activity detector operable to detect speech presence and absence in the subband signals, a speech extraction module operable to predict a clean speech signal in each of the inside microphone signal and the outside microphone signal, and cancel audio sources other than a headset user's own voice by combining the predicted clean speech signal from the inside microphone signal and the predicted clean speech signal from the outside microphone signal, and a postfiltering module operable to reduce residual noise.
-
Publication number: US11763832B2
Publication date: 2023-09-19
Application number: US16865111
Filing date: 2020-05-01
Applicant: Synaptics Incorporated, THE TRUSTEES OF INDIANA UNIVERSITY
Inventor: Francesco Nesta, Minje Kim, Sanna Wager
IPC: G10L21/00, G10L15/16, G10L21/0264, G10L21/0216, G06N3/08
CPC classification number: G10L21/0264, G06N3/08, G10L21/0216
Abstract: Systems and methods for generating an enhanced audio signal comprise a trained neural network configured to receive an input audio signal and generate an enhanced target signal, the trained neural network comprising a pre-processing neural network configured to receive a segment of the input audio signal and output an audio classification, the pre-processing neural network including at least one hidden layer comprising an embedding vector, and a noise reduction neural network configured to receive the segment of the input audio signal and the embedding vector, and generate the enhanced target signal. The pre-processing neural network may comprise a target signal pre-processing neural network configured to output a target signal classification and comprising at least one hidden layer comprising a target embedding vector. The pre-processing neural network may comprise a noise pre-processing neural network configured to output a noise classification and comprising at least one hidden layer comprising a noise embedding vector.
-
Publication number: US20210314701A1
Publication date: 2021-10-07
Application number: US17349589
Filing date: 2021-06-16
Applicant: Synaptics Incorporated
Inventor: Alireza Masnadi-Shirazi, Francesco Nesta
Abstract: Embodiments described herein provide a combined multi-source time difference of arrival (TDOA) tracking and voice activity detection (VAD) mechanism that is applicable for generic array geometries, e.g., a microphone array that lies on a plane. The combined multi-source TDOA tracking and VAD mechanism scans the azimuth and elevation angles of the microphone array in microphone pairs, based on which a planar locus of physically admissible TDOAs can be formed in the multi-dimensional TDOA space of multiple microphone pairs. In this way, the multi-dimensional TDOA tracking reduces the number of calculations usually involved in traditional TDOA by performing the TDOA search for each dimension separately.
-
Publication number: US10762417B2
Publication date: 2020-09-01
Application number: US15894872
Filing date: 2018-02-12
Applicant: SYNAPTICS INCORPORATED
Inventor: Saeed Mosayyebpour Kaskari, Trausti Thormundsson, Francesco Nesta
Abstract: A classification system and method for training a neural network includes receiving a stream of segmented, labeled training data having a sequence of frames, computing a stream of input features data for the sequence of frames, and generating neural network outputs for the sequence of frames in a forward pass through the training data and in accordance with weights and biases. The weights and biases are updated in a backward pass through the training data, including determining Region of Target (ROT) information from the segmented, labeled training data, computing modified forward and backward variables based on the neural network outputs and the ROT information, deriving a signal error for each frame within the sequence of frames based on the modified forward and backward variables, and updating the weights and biases based on the derived signal error. An adaptive learning module is provided to improve a convergence rate of the neural network.