Source separation for automatic speech recognition (ASR)
Abstract:
A method for speech enhancement, the method may include receiving or generating sound samples that represent sound signals that were received during a given time period by an array of microphones; frequency transforming the sound samples to provide frequency-transformed samples; clustering the frequency-transformed samples to speakers to provide speaker related clusters, wherein the clustering is based on (i) spatial cues related to the received sound signals and (ii) acoustic cues related to the speakers; determining a relative transfer function for each speaker of the speakers to provide speakers related relative transfer functions; applying a multiple multiple output (MIMO) beamforming operation on the speakers related relative transfer functions to provide beamformed signals; and inverse-frequency transforming the beamformed signals to provide speech signals.
Public/Granted literature
Information query
Patent Agency Ranking
0/0