Abstract:
A signal processor comprising a plurality of microphone-terminals configured to receive a respective plurality of microphone-signals. A plurality of beamforming-modules, each respective beamforming-module configured to receive and process input-signalling representative of some or all of the plurality of microphone-signals to provide a respective speech-reference-signal, a respective noise-reference-signal, and a beamformer output signal based on focusing a beam into a respective angular direction. A beam-selection-module comprising a plurality of speech-leakage-estimation-modules, each respective speech-leakage-estimation-module configured to receive the speech-reference-signal and the noise-reference-signal from a respective one of the plurality of beamforming-modules; and provide a respective speech-leakage-estimation-signal based on a similarity measure of the received speech-reference-signal with respect to the received noise-reference-signal. The beam-selection-module further comprises a beam-selection-controller configured to provide a control-signal based on the speech-leakage-estimation-signals.
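The beam-selection step can be illustrated with a minimal sketch. It assumes that the similarity measure is the absolute normalized correlation at lag zero and that the controller picks the beam with the lowest leakage estimate; the abstract specifies neither. The function names `normalized_correlation` and `select_beam` are hypothetical.

```python
import math


def normalized_correlation(speech_ref, noise_ref):
    """Absolute normalized correlation at lag zero, in the range [0, 1].

    Assumption: this stands in for the unspecified similarity measure.
    """
    dot = sum(s * n for s, n in zip(speech_ref, noise_ref))
    energy_s = sum(s * s for s in speech_ref)
    energy_n = sum(n * n for n in noise_ref)
    if energy_s == 0.0 or energy_n == 0.0:
        return 0.0
    return abs(dot) / math.sqrt(energy_s * energy_n)


def select_beam(beam_references):
    """Pick a beam from per-beam (speech_ref, noise_ref) pairs.

    Returns (index, leakage_estimates). Assumption: a low similarity means
    little speech has leaked into the noise reference, so the beam with the
    smallest estimate is selected.
    """
    leakage = [normalized_correlation(s, n) for s, n in beam_references]
    best = min(range(len(leakage)), key=leakage.__getitem__)
    return best, leakage
```

For two beams where the first noise reference is a copy of the speech reference and the second is nearly uncorrelated with it, `select_beam` returns index 1.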
Abstract:
The use of a data link between two or more smart devices for voice communication allows for the enhancement of voice quality in a collaborative way through the exchange of well-defined meta-data between the smart devices. The meta-data may be exchanged on a separate IP data link or as part of the exchanged voice data packets.
Abstract:
A signal processor comprising: a modelling block, configured to receive a frequency-domain-input-signal and a fundamental-frequency-signal representative of a fundamental frequency of the frequency-domain-input-signal; and configured to provide a pitch-model-signal based on a periodic function, the pitch-model-signal spanning a plurality of discrete frequency bins, each discrete frequency bin having a respective discrete frequency bin index, wherein within each discrete frequency bin the pitch-model-signal is defined by: the periodic function; the fundamental frequency; the frequency-domain-input-signal; and the respective discrete frequency bin index. The signal processor further comprises a manipulation block, configured to provide an output-signal based on the frequency-domain-input-signal and the pitch-model-signal.
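One way to picture the pitch-model-signal is as a comb evaluated per bin. The sketch below assumes a raised cosine as the periodic function, a uniform bin spacing, and a gain-based manipulation step; none of these choices come from the abstract. The names `pitch_model`, `emphasize_harmonics`, `bin_spacing_hz` and `floor` are hypothetical.

```python
import math


def pitch_model(magnitudes, f0_hz, bin_spacing_hz):
    """Per-bin pitch model built from a raised cosine (assumed periodic function).

    The comb equals 1 at multiples of f0_hz and 0 halfway between them, and
    is scaled by the input magnitude in the same bin.
    """
    model = []
    for k, mag in enumerate(magnitudes):
        phase = 2.0 * math.pi * k * bin_spacing_hz / f0_hz
        comb = 0.5 * (1.0 + math.cos(phase))
        model.append(mag * comb)
    return model


def emphasize_harmonics(magnitudes, model, floor=0.1):
    """Assumed manipulation step: apply a per-bin gain between floor and 1."""
    out = []
    for mag, m in zip(magnitudes, model):
        if mag > 0.0:
            gain = floor + (1.0 - floor) * (m / mag)
        else:
            gain = floor
        out.append(mag * gain)
    return out
```

With a 100 Hz fundamental and 25 Hz bins, bin 4 sits on the first harmonic and keeps its magnitude, while bin 2 lies midway between harmonics and is attenuated to the floor.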
Abstract:
A signal processor comprising: an input terminal, configured to receive an input-signal; a voicing-terminal, configured to receive a voicing-signal representative of a voiced speech component of the input-signal; an output terminal; a delay block, configured to receive the input-signal and provide a filter-input-signal as a delayed representation of the input-signal; a filter block, configured to: receive the filter-input-signal; and provide a noise-estimate-signal by filtering the filter-input-signal; a combiner block, configured to: receive a combiner-input-signal representative of the input-signal; receive the noise-estimate-signal; and combine the combiner-input-signal with the noise-estimate-signal to provide an output-signal to the output terminal; and a filter-control-block, configured to: receive the voicing-signal; receive signalling representative of the input-signal; and set filter coefficients of the filter block in accordance with the voicing-signal and the input-signal.
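The delay, filter, combiner and filter-control blocks resemble an adaptive predictor. The sketch below assumes a normalized LMS update, a combiner that subtracts the noise estimate from the input, and a control rule that freezes the coefficients while the voicing flag is set. The abstract does not state any of these, and the function name and parameters are hypothetical.

```python
def voicing_gated_canceller(x, voicing, delay=2, taps=4, mu=0.5, eps=1e-8):
    """Remove predictable noise from x; voicing[n] is True where speech is voiced."""
    w = [0.0] * taps  # filter coefficients
    out = []
    for n in range(len(x)):
        # Delay block: the filter sees x delayed by `delay` samples.
        u = [x[n - delay - i] if n - delay - i >= 0 else 0.0 for i in range(taps)]
        # Filter block: noise estimate from the delayed input.
        noise_est = sum(wi * ui for wi, ui in zip(w, u))
        # Combiner block (assumed subtraction).
        e = x[n] - noise_est
        out.append(e)
        # Filter-control block (assumed rule): adapt only when unvoiced.
        if not voicing[n]:
            norm = sum(ui * ui for ui in u) + eps
            w = [wi + mu * e * ui / norm for wi, ui in zip(w, u)]
    return out
```

On a purely periodic input with no voiced frames, the filter learns to predict the input from its delayed copy and the output decays towards zero. With the voicing flag set throughout, the coefficients stay at zero and the input passes through unchanged.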
Abstract:
A signal processor for performing signal enhancement, the signal processor comprising: an input-terminal, configured to receive an input-signaling; an output-terminal; an interference-cancellation-block configured to receive the input-signaling and to provide an interference-estimate-signaling and an interference-cancelled-signal based on the input-signaling. The signal processor further comprises a feature-block configured to provide a combination-feature-signal based on the interference-cancelled-signal and the interference-estimate-signaling; and a neural-network-block configured to apply model parameters to the combination-feature-signal to provide a neural-network-output-signal to the output-terminal.
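The feature-block and neural-network-block can be sketched as follows. This assumes that the combination feature is a concatenation of per-bin log-magnitudes and that the network is a single dense layer with a sigmoid output; the abstract fixes neither the features nor the architecture. All names, and any weights passed in, are hypothetical.

```python
import math


def combination_feature(cancelled_frame, interference_frame, eps=1e-12):
    """Concatenate log-magnitudes of both signals (assumed feature definition)."""
    return ([math.log(abs(c) + eps) for c in cancelled_frame]
            + [math.log(abs(i) + eps) for i in interference_frame])


def dense_sigmoid(features, weights, biases):
    """One fully connected layer with sigmoid outputs.

    A stand-in for the neural-network-block; `weights` holds one row of
    model parameters per output unit.
    """
    out = []
    for row, b in zip(weights, biases):
        z = sum(w * f for w, f in zip(row, features)) + b
        out.append(1.0 / (1.0 + math.exp(-z)))
    return out
```

The sigmoid outputs lie between 0 and 1, so they could be read as per-bin gains, though that interpretation is also an assumption.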
Abstract:
A speech-signal-processing-circuit configured to receive a time-frequency-domain-reference-speech-signal and a time-frequency-domain-degraded-speech-signal. The time-frequency-domain-reference-speech-signal comprises: an upper-band-reference-component with frequencies that are greater than a frequency-threshold-value; and a lower-band-reference-component with frequencies that are less than the frequency-threshold-value. The time-frequency-domain-degraded-speech-signal comprises: an upper-band-degraded-component with frequencies that are greater than the frequency-threshold-value; and a lower-band-degraded-component with frequencies that are less than the frequency-threshold-value. The speech-signal-processing-circuit comprises: a disturbance calculator configured to determine one or more SBR-features based on the time-frequency-domain-reference-speech-signal and the time-frequency-domain-degraded-speech-signal by: (i) for each of a plurality of frames: determining a reference-ratio based on the ratio of the upper-band-reference-component to the lower-band-reference-component; determining a degraded-ratio based on the ratio of the upper-band-degraded-component to the lower-band-degraded-component; and determining a spectral-balance-ratio based on the ratio of the reference-ratio to the degraded-ratio; and (ii) determining the one or more SBR-features based on the spectral-balance-ratio for the plurality of frames.
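The per-frame spectral-balance-ratio can be written out as a short sketch. It assumes that each frame is a list of per-bin power values, that each band component is the summed power of its bins, and that the SBR-features are the mean and maximum over frames; the abstract leaves these details open. The names `threshold_bin` and `sbr_features` are hypothetical.

```python
def spectral_balance_ratios(ref_frames, deg_frames, threshold_bin, eps=1e-12):
    """Return one spectral-balance-ratio per frame.

    Bins at or above threshold_bin form the upper band; bins below it form
    the lower band (assumed split). eps guards against division by zero.
    """
    ratios = []
    for ref, deg in zip(ref_frames, deg_frames):
        ref_ratio = (sum(ref[threshold_bin:]) + eps) / (sum(ref[:threshold_bin]) + eps)
        deg_ratio = (sum(deg[threshold_bin:]) + eps) / (sum(deg[:threshold_bin]) + eps)
        ratios.append(ref_ratio / deg_ratio)
    return ratios


def sbr_features(ratios):
    """Assumed feature set: mean and maximum of the per-frame ratios."""
    return {"mean": sum(ratios) / len(ratios), "max": max(ratios)}
```

A ratio above 1 means the degraded frame has lost upper-band energy relative to the reference; a ratio below 1 means it has gained some.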
Abstract:
A signal processor comprising: a signal-manipulation-block configured to: receive a cepstrum-input-signal, wherein the cepstrum-input-signal is in the cepstrum domain and comprises a plurality of bins; receive a pitch-bin-identifier that is indicative of a pitch-bin in the cepstrum-input-signal; and generate a cepstrum-output-signal based on the cepstrum-input-signal by: scaling the pitch-bin relative to one or more of the other bins of the cepstrum-input-signal; or determining an output-pitch-bin-value based on the pitch-bin, and setting one or more of the other bins of the cepstrum-input-signal to a predefined value; or determining an output-other-bin-value based on one or more of the other bins of the cepstrum-input-signal, and setting the pitch-bin to a predefined value.
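The three alternatives for generating the cepstrum-output-signal can each be sketched in a few lines. The sketch assumes that the cepstrum is a plain list of real values, that the output-pitch-bin-value and output-other-bin-values are copied unchanged from the input, and that the predefined value defaults to zero. The function names are hypothetical.

```python
def scale_pitch_bin(cepstrum, pitch_bin, gain):
    """Alternative 1: scale the pitch-bin relative to the other bins."""
    out = list(cepstrum)
    out[pitch_bin] *= gain
    return out


def keep_pitch_only(cepstrum, pitch_bin, predefined=0.0):
    """Alternative 2: keep the pitch-bin and set all other bins to a predefined value."""
    out = [predefined] * len(cepstrum)
    out[pitch_bin] = cepstrum[pitch_bin]
    return out


def remove_pitch(cepstrum, pitch_bin, predefined=0.0):
    """Alternative 3: keep the other bins and set the pitch-bin to a predefined value."""
    out = list(cepstrum)
    out[pitch_bin] = predefined
    return out
```

All three functions return a new list and leave the input unmodified. The second alternative is applied here to every other bin, although the abstract only requires one or more of them.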