METHOD AND SYSTEM FOR MULTIPLE TIME RESOLUTION AUDIO PROCESSING

    Publication No.: US20250061911A1

    Publication Date: 2025-02-20

    Application No.: US18450784

    Filing Date: 2023-08-16

    Abstract: Aspects of the present disclosure provide a method for voice control that includes transforming, using a short-time Fourier transform (STFT) applied to data in each window aligned across each input channel of a multichannel audio stream, the multichannel audio stream into a complex-valued frequency-domain representation. For a current window, the method further includes: updating a first complex-valued covariance matrix corresponding to a slowly-adapting beamformer and forming a single-channel denoised estimate for each frequency band in the STFT; calculating a voice activity detection (VAD) estimate for each frequency band in the STFT by comparing a magnitude of the single-channel denoised estimate to a magnitude of each input channel of the multichannel audio stream; and selectively updating or refraining from updating, responsive to the VAD estimate respectively indicating a presence or an absence of speech, a second complex-valued covariance matrix corresponding to a quickly-adapting beamformer.
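
    The per-window steps recited in the abstract can be illustrated with a minimal NumPy sketch. It is not the claimed implementation: the abstract does not specify the beamformer type, the covariance update rule, or the VAD comparison, so the sketch assumes MVDR weights with a caller-supplied steering vector, exponential smoothing for both covariance matrices, and a simple magnitude-ratio threshold. All function names and the parameters alpha_slow, alpha_fast, and vad_ratio are hypothetical.

```python
import numpy as np

def stft_frame(block, window):
    """One STFT window: block is (channels, frame_len), aligned across channels."""
    return np.fft.rfft(block * window, axis=-1)  # (channels, bands), complex

def ema_update(cov, x, alpha):
    """Exponentially smoothed covariance per band (assumed update rule).
    cov: (bands, C, C), x: (bands, C)."""
    outer = x[:, :, None] * x[:, None, :].conj()  # x x^H for every band
    return (1.0 - alpha) * cov + alpha * outer

def mvdr_weights(cov, steer, eps=1e-6):
    """MVDR weights per band (assumed beamformer type). steer: (bands, C)."""
    n_ch = cov.shape[-1]
    reg = cov + eps * np.eye(n_ch)                        # diagonal loading
    num = np.linalg.solve(reg, steer[..., None])[..., 0]  # R^{-1} d
    den = np.einsum("bc,bc->b", steer.conj(), num)        # d^H R^{-1} d
    return num / den[:, None]

def process_frame(block, window, slow_cov, fast_cov, steer,
                  alpha_slow=0.01, alpha_fast=0.3, vad_ratio=0.5):
    """Process the current window; returns (y, vad, slow_cov, fast_cov)."""
    x = stft_frame(block, window).T                       # (bands, C)

    # 1) Update the first covariance (slowly-adapting beamformer) and form
    #    the single-channel denoised estimate for each frequency band.
    slow_cov = ema_update(slow_cov, x, alpha_slow)
    w = mvdr_weights(slow_cov, steer)
    y = np.einsum("bc,bc->b", w.conj(), x)                # y = w^H x

    # 2) Per-band VAD: compare |y| against the magnitude of each input
    #    channel (assumed rule: speech if |y| exceeds a fraction of all).
    vad = np.all(np.abs(y)[:, None] > vad_ratio * np.abs(x), axis=1)

    # 3) Update the second covariance (quickly-adapting beamformer) only in
    #    bands where the VAD indicates speech; otherwise leave it unchanged.
    updated = ema_update(fast_cov, x, alpha_fast)
    fast_cov = np.where(vad[:, None, None], updated, fast_cov)
    return y, vad, slow_cov, fast_cov
```

    In this sketch the two smoothing factors give the two time resolutions: a small alpha_slow makes the first covariance track the sound field slowly, while a larger alpha_fast lets the second covariance react quickly but only in bands gated by the VAD.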
