Abstract:
Embodiments of the present invention relate to a voice detector that receives an input signal divided into sub-signals, each representing a frequency sub-band. The voice detector calculates, for each sub-band, a signal-to-noise ratio (SNR) value based on the corresponding sub-signal and a background signal for that sub-band. The voice detector also calculates a power SNR value for each sub-band, where at least one of the power SNR values is calculated based on a non-linear function. The voice detector forms a single value based on the calculated power SNR values and compares the single value with a given threshold value to make a voice activity decision, which is presented on an output port.
Abstract:
The invention relates to the technical field of audio encoding and/or decoding technologies, and thus concerns an overall encoding procedure and an associated decoding procedure. The encoding procedure involves at least two signal encoding processes (S1-S3) operating on signal representations of a set of audio input channels, as well as residual encoding (S7-S8). It also involves a dedicated process (S4-S6) to estimate and encode energies of the audio input channels. Each encoding process is associated with a corresponding decoding process. In the overall decoding procedure, the decoded signals from each encoding process are preferably combined such that the output channels are close to the input channels in terms of energy and/or quality. Normally, the combination step also adapts to the possible loss of one or more signal representations, in part or in whole, such that the energy and quality are optimized with the signals at hand in the decoder. In this way, the overall quality of the output channels is improved.
Abstract:
The present invention relates to a voice detector (30; 51; 61) being responsive to an input signal being divided into sub-signals representing a frequency sub-band, comprising: means to calculate (20), for each sub-band, an SNR value (snr[n]) based on a corresponding sub-signal for each sub-band and a background signal for each sub-band. The voice detector (30; 51; 61) further comprises: means to calculate (31n, 21) a power SNR value for each sub-band, wherein at least one of said power SNR values is calculated based on a non-linear function, means to form (22) a single value (snr_sum) based on the calculated power SNR values, and means to compare (23) said single value (snr_sum) and a given threshold value (vad_thr) to make a voice activity decision (vad_prim) presented on an output port. The invention also relates to a voice activity detector, a node and a method for selectively suppressing sub-bands in a voice detector.
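The decision chain described in this abstract (per-sub-band SNR, power SNR through a non-linear function, summation into snr_sum, comparison with vad_thr) can be sketched as follows. This is a minimal illustration, not the patented implementation. The names snr_sum, vad_thr and vad_prim come from the abstract. The helper names, the energy-ratio form of the SNR, and the specific non-linearity (squaring, with sub-bands below a hypothetical significance threshold sign_thr contributing nothing) are assumptions chosen only to show one way of suppressing low-SNR sub-bands.

```python
def sub_band_snr(sub_band_energy, background_energy, eps=1e-12):
    """Per-sub-band SNR as an energy ratio (assumed form; eps avoids division by zero)."""
    return [s / max(b, eps) for s, b in zip(sub_band_energy, background_energy)]


def power_snr(snr, sign_thr=2.0):
    """Assumed non-linear function: square the SNR, but let sub-bands whose
    SNR is below the hypothetical threshold sign_thr contribute nothing."""
    return snr * snr if snr >= sign_thr else 0.0


def vad_decision(sub_band_energy, background_energy, vad_thr):
    """Return (vad_prim, snr_sum) for one frame of sub-band energies."""
    snr = sub_band_snr(sub_band_energy, background_energy)
    snr_sum = sum(power_snr(v) for v in snr)   # single value formed from power SNRs
    vad_prim = snr_sum > vad_thr               # primary voice activity decision
    return vad_prim, snr_sum
```

With sub-band energies [4.0, 1.0, 9.0] over a flat background of 1.0, only the first and third sub-bands pass the assumed significance threshold, so the middle sub-band is suppressed before the sum is compared with vad_thr.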
Abstract:
The invention relates to a method, a decoder and program code for controlling a concealment method for a lost audio frame. A first audio frame and a second audio frame of a received audio signal are decoded to obtain modified discrete cosine transform (MDCT) coefficients. Values of a first spectral shape based on the MDCT coefficients decoded from the decoded first audio frame, and values of a second spectral shape based on MDCT coefficients decoded from the decoded second audio frame, are determined, the spectral shapes each comprising a number of sub-bands. The spectral shape values and the frame energies of the first audio frame and the second audio frame are transformed into representations of FFT-based spectral analyses. A transient condition is detected based on the FFT representations. In response to detecting the transient condition, the concealment method is modified by selectively adjusting a spectrum magnitude of a substitution frame spectrum.
Abstract:
A method for estimating background noise in an audio signal segment comprising a plurality of sub-bands, the method comprising: calculating a possible new sub-band noise estimate and updating a current sub-band noise estimate with the new sub-band noise estimate if the new value is lower than the current value; and when the energy level of the audio signal segment is less than a threshold higher (202:2) than a long-term minimum energy level It_min, but no pause is detected (204:1) in the audio signal segment: - determining (203) whether the audio signal segment comprises music; and - reducing (206) the current sub-band noise estimate if it is determined that the audio signal segment (203:2) comprises music and the current sub-band noise estimate exceeds a minimum value (205:1).
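The update logic in this abstract can be sketched as below, purely as an illustration under stated assumptions. The function and parameter names are hypothetical. The abstract does not say how the reduction is performed or how "a threshold higher than" the long-term minimum is formed, so the multiplicative reduction_factor and the additive energy_margin are assumptions, and pause and music detection are taken as ready-made boolean inputs.

```python
def update_noise_estimate(current, candidate, segment_energy,
                          long_term_min_energy, energy_margin,
                          pause_detected, is_music,
                          min_noise=1.0, reduction_factor=0.98):
    """Return updated per-sub-band noise estimates for one audio segment."""
    # Step 1: accept the candidate estimate only where it is lower than the current one.
    updated = [min(cur, new) for cur, new in zip(current, candidate)]

    # Step 2: segment energy is within the margin above the long-term minimum,
    # yet no pause was detected; if the segment is music, reduce each estimate
    # that still exceeds the minimum value.
    near_minimum = segment_energy < long_term_min_energy + energy_margin
    if near_minimum and not pause_detected and is_music:
        updated = [v * reduction_factor if v > min_noise else v for v in updated]
    return updated
```

For example, with current estimates [10.0, 0.5] and candidates [8.0, 2.0], the first step yields [8.0, 0.5]; in the music case only the first sub-band is then scaled down, because the second is already at or below min_noise.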