Abstract:
Provided is an encoder apparatus that suppresses degradation of encoding quality while reducing the processing load in an encoding system that combines, in a hierarchical structure, an encoding process suited to voice signals and an encoding process suited to music signals. In the apparatus: an ultimate selection candidate limiting unit (109) uses the spectrum of an input signal and a residual spectrum to designate a given number of pre-selected suppression factors to a CELP component suppressing unit (104); the CELP component suppressing unit (104) uses the designated suppression factors to generate a suppressed spectrum; a CELP residual signal spectrum calculating unit (105), to which the suppressed spectrum is input, calculates a residual spectrum; a conversion encoding unit (110) uses the residual spectrum to perform a second encoding process; and a distortion evaluating unit (112) determines one of the designated suppression factors by using the spectrum of a second decoded signal, generated by decoding a second code obtained by the second encoding process, together with the suppressed spectrum and the spectrum of the input signal.
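The selection loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the function names, the one-pulse "codec" standing in for the second encoding process, and the squared-error distortion measure are all hypothetical. The candidate list is taken as already narrowed down by the limiting unit.

```python
def one_pulse_codec(residual):
    """Toy stand-in (assumption) for the second encoder/decoder pair:
    keep only the largest-magnitude coefficient of the residual spectrum."""
    if not residual:
        return []
    keep = max(range(len(residual)), key=lambda k: abs(residual[k]))
    return [r if k == keep else 0.0 for k, r in enumerate(residual)]


def select_suppression_factor(input_spec, celp_spec, candidates):
    """Return (factor, distortion) for the candidate with the lowest distortion.

    For each candidate factor g:
      suppressed = g * CELP spectrum
      residual   = input spectrum - suppressed spectrum
      decoded    = second-layer decode of the residual
      distortion = squared error between the input spectrum and
                   (suppressed + decoded)   # assumed distortion measure
    Returns None if the candidate list is empty.
    """
    best = None
    for g in candidates:
        suppressed = [g * c for c in celp_spec]
        residual = [x - s for x, s in zip(input_spec, suppressed)]
        decoded = one_pulse_codec(residual)
        distortion = sum(
            (x - s - d) ** 2
            for x, s, d in zip(input_spec, suppressed, decoded)
        )
        if best is None or distortion < best[1]:
            best = (g, distortion)
    return best
```

Because only the designated candidates are run through the second encoder, the cost of the loop scales with the number of pre-selected factors rather than with the full set of possible factors.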
Abstract:
Disclosed is a scalable decoding apparatus that outputs a mixed signal in which a core-layer decoded signal and an extended-layer decoded signal are mixed when reconstructing an output speech signal. The scalable decoding apparatus comprises a mixing section that mixes the core-layer decoded signal and the extended-layer decoded signal while changing their mixing ratio over time, thereby obtaining the mixed signal. A setting section variably sets the degree of change of the mixing ratio over time.
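A minimal sketch of such time-varying mixing, assuming a linear per-sample ramp of the mixing ratio. The function name, the parameter names, and the linear ramp itself are illustrative assumptions; the abstract does not specify how the ratio evolves.

```python
def mix_layers(core, extended, start_ratio, step):
    """Mix two equal-length decoded signals sample by sample.

    `ratio` is the weight given to the extended-layer signal. It starts at
    `start_ratio` and grows by `step` per sample, capped at 1.0. `step`
    plays the role of the variably set "degree of change over time"
    (assumed here to be a simple linear increment).
    Returns the mixed samples and the final ratio, so the next frame
    can continue from where this one stopped.
    """
    if len(core) != len(extended):
        raise ValueError("signals must have equal length")
    mixed = []
    ratio = start_ratio
    for c, e in zip(core, extended):
        mixed.append((1.0 - ratio) * c + ratio * e)
        ratio = min(1.0, ratio + step)
    return mixed, ratio
```

A small `step` gives a slow transition between the layers; a large one switches almost immediately.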
Abstract:
Provided is an audio decoding device capable of suppressing the amount of information needed for lost-frame compensation and improving encoding efficiency. In this device, a decoded sound source generation unit (203) generates a lost-frame decoded sound source signal; a pitch pulse information decoding unit (204) decodes pitch pulse position information and pitch pulse amplitude information; a pitch pulse waveform learning unit (205) learns, in advance, a pitch pulse waveform from a frame preceding the lost frame; a convolution unit (206) adjusts the amplitude of the learned pitch pulse waveform according to the pitch pulse amplitude information and places (convolves) the amplitude-adjusted waveform on the time axis according to the pitch pulse position information; and a sound source signal correction unit (207) adds the placed pitch pulse waveform to the lost-frame decoded sound source signal, or substitutes it for the corresponding portion of that signal.
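The scale, place, and add-or-replace steps can be sketched as below. This is an illustration only: the function and parameter names are hypothetical, and the choice to align the largest-magnitude sample of the learned waveform with the decoded position is an assumption not stated in the abstract.

```python
def place_pitch_pulse(excitation, learned_pulse, position, amplitude,
                      replace=False):
    """Return a corrected copy of the lost-frame excitation.

    The learned waveform is scaled so that its largest-magnitude sample
    equals `amplitude`, then positioned so that this sample lands on
    `position` (assumed alignment). Samples falling outside the frame
    are dropped. With replace=False the pulse is added to the excitation;
    with replace=True it overwrites the samples it covers.
    """
    out = list(excitation)
    if not learned_pulse:
        return out
    peak = max(range(len(learned_pulse)), key=lambda k: abs(learned_pulse[k]))
    if learned_pulse[peak] == 0.0:
        return out  # nothing usable was learned
    scale = amplitude / learned_pulse[peak]
    start = position - peak
    for k, p in enumerate(learned_pulse):
        n = start + k
        if 0 <= n < len(out):
            out[n] = scale * p if replace else out[n] + scale * p
    return out
```

Only a position and an amplitude need to be transmitted per pulse, since the waveform shape is recovered from past frames at the decoder.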
Abstract:
Disclosed are an audio signal decoding device and a balance adjustment method that reduce fluctuation in the localization of a decoded signal and maintain stereo perception. An interchannel correlation computation unit (224) computes the correlation between a left-channel decoded stereo signal and a right-channel decoded stereo signal. If the interchannel correlation is low, a peak detection unit (225) uses a peak component of the decoded monaural signal of the current frame and a peak component of either the left or the right channel of the preceding frame to detect a peak component with high temporal correlation. From among the frequencies of the detected peak components, the peak detection unit (225) pairs and outputs a peak frequency of frame n - 1 and a peak frequency of frame n. A peak balance coefficient computation unit (226) computes, from the peak frequency of frame n - 1, a balance parameter that is used to convert a peak frequency component of the monaural signal to stereo.
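Two of the quantities mentioned above can be sketched as follows. Both formulas are assumptions chosen for illustration: the abstract does not define the correlation measure or the balance parameter, so a normalized cross-correlation and an amplitude-ratio weighting (for a monaural signal taken as the average of left and right) are used here. All names are hypothetical.

```python
import math


def interchannel_correlation(left, right):
    """Normalized cross-correlation of two channels, in [-1, 1].
    Returns 0.0 when either channel has no energy."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den > 0.0 else 0.0


def peak_balance(prev_left_peak, prev_right_peak):
    """Balance weights (w_left, w_right) derived from the previous frame's
    peak component in each channel.

    Assumes mono = (left + right) / 2, so that w_left * mono and
    w_right * mono restore the previous frame's left/right amplitude ratio.
    Falls back to equal weights when both peaks are zero.
    """
    total = abs(prev_left_peak) + abs(prev_right_peak)
    if total == 0.0:
        return 1.0, 1.0
    return 2.0 * abs(prev_left_peak) / total, 2.0 * abs(prev_right_peak) / total
```

A decoder following the abstract would apply the peak-based weights only when the correlation falls below some threshold; that threshold is not given in the source.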
Abstract:
Disclosed are an audio encoding device and an audio decoding device that reduce the degradation in subjective quality of a decoded signal caused by a power mismatch in the signal generated by the concealment process when a frame is lost. When a frame is lost, past encoding parameters are used to obtain a concealed LPC and concealed sound source parameters for the current frame. Normal CELP decoding is performed using the obtained concealed sound source parameters. The obtained concealed LPC and concealed sound source signal are then corrected using a concealment parameter. The power of the corrected concealed sound source signal is adjusted to match a reference sound source power. The filter gain of the synthesis filter is adjusted so that the power of the decoded sound signal matches the power of the decoded sound signal in the error-free state. Moreover, a synthesis filter gain adjusting coefficient is calculated using an estimated normalized residual power so that the filter gain of the synthesis filter formed from the concealed LPC equals the filter gain in the error-free state.
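The step of matching the concealed sound source power to a reference power amounts to a single gain, sketched below. The function name and the use of mean-square power over the frame are assumptions; the abstract does not say how power is measured or how the reference is transmitted.

```python
import math


def match_power(excitation, reference_power):
    """Scale `excitation` so that its mean-square power equals
    `reference_power` (mean-square power is an assumed definition).

    An empty or all-zero excitation is returned unchanged, because no
    finite gain can bring it to the reference power.
    """
    if not excitation:
        return []
    power = sum(x * x for x in excitation) / len(excitation)
    if power == 0.0:
        return list(excitation)
    gain = math.sqrt(reference_power / power)
    return [gain * x for x in excitation]
```

For example, an excitation with mean-square power 1.0 and a reference power of 4.0 is scaled by a gain of 2.0.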