Abstract:
A Vector-Sum Excited Linear Predictive Coding (VSELP) speech coder (200) provides improved quality and reduced complexity over a typical speech coder. VSELP uses a codebook (201) which has a predefined structure such that the computations required for the codebook search process can be significantly reduced. This VSELP speech coder uses a single or multisegment vector quantizer of the reflection coefficients based on a Fixed-Point-Lattice-Technique (FLAT). Additionally, this speech coder uses a pre-quantizer to reduce the vector codebook search complexity and a high-resolution scalar quantizer to reduce the amount of memory needed to store the reflection coefficient vector codebooks. The result is a high-quality speech coder with reduced computation and storage requirements.
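The abstract does not spell out the codebook structure. In the published VSELP literature, each codevector is a sum of a small set of basis vectors weighted by +1 or -1. Assuming that construction, the sketch below illustrates why the search gets cheaper: the correlations with the target and the basis-vector cross products are computed once, and every candidate is then scored from those precomputed values alone. The function names, the exhaustive loop over sign patterns, and the correlation-squared-over-energy criterion are illustrative assumptions, not the coder described in the patent.

```python
from itertools import product


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def search_vector_sum(target, basis):
    """Score every +/-1 combination of the basis vectors against the target.

    Hypothetical helper: returns (score, signs, gain) for the first sign
    pattern that maximizes correlation**2 / energy.
    """
    # Computed once: M correlations and an M x M table of cross products.
    r = [dot(target, v) for v in basis]
    g = [[dot(u, v) for v in basis] for u in basis]

    best = None
    for signs in product((-1, 1), repeat=len(basis)):
        # Both quantities follow from the precomputed values alone,
        # so no full-length codevector is formed inside the loop.
        corr = sum(s * x for s, x in zip(signs, r))
        energy = sum(
            si * sj * g[i][j]
            for i, si in enumerate(signs)
            for j, sj in enumerate(signs)
        )
        if energy <= 0:
            continue
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, signs, corr / energy)
    return best


def reconstruct(signs, gain, basis):
    """Rebuild the scaled codevector selected by the search."""
    n = len(basis[0])
    return [gain * sum(s * v[k] for s, v in zip(signs, basis)) for k in range(n)]
```

With M basis vectors this covers 2^M codevectors while storing only M of them. A practical coder would also exploit the fact that negating every sign only flips the gain, which halves the search; the sketch omits that for clarity.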
Abstract:
In a statistically based speech recognition system, one of the key issues is the selection of the Hidden Markov Model that best matches a given sequence of feature observations. The problem is usually addressed by calculating the maximum likelihood (ML) state sequence by means of a Viterbi or other decoder. Noise or inadequate training can produce an ML sequence associated with a Hidden Markov Model other than the correct model. The method of the present invention provides improved robustness by combining the standard ML state sequence score (416) with an additional path score (418) derived from the dynamics of the ML score as a function of time. These two scores, when combined, form a hybrid metric (420) that, when used with the decoder, optimizes selection of the correct Hidden Markov Model (422).
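The abstract does not define how the path score is derived or how the two scores are combined. The sketch below only shows the general shape of such a hybrid metric under loudly stated assumptions: the path score is taken to be the negative variance of the frame-to-frame increments of the cumulative ML log-likelihood, and the combination is a weighted sum. Both choices, and all names used, are hypothetical.

```python
from statistics import pvariance


def path_score(cumulative_loglik):
    """Assumed path score: penalize erratic growth of the ML score over time."""
    increments = [b - a for a, b in zip(cumulative_loglik, cumulative_loglik[1:])]
    return -pvariance(increments)


def hybrid_metric(cumulative_loglik, weight=1.0):
    """Assumed combination: final ML score plus a weighted path score."""
    return cumulative_loglik[-1] + weight * path_score(cumulative_loglik)


def select_model(trajectories, weight=1.0):
    """Pick the model name whose score trajectory maximizes the hybrid metric."""
    return max(trajectories, key=lambda name: hybrid_metric(trajectories[name], weight))


# Cumulative log-likelihood per frame for two candidate models (made-up numbers).
trajectories = {
    "model_a": [-1.0, -2.0, -3.0, -4.0],   # steady accumulation
    "model_b": [-0.5, -0.5, -3.5, -3.9],   # slightly better final score, erratic path
}
```

With `weight=0.0` the selection reduces to the plain ML decision and prefers `model_b`; with the default weight the erratic trajectory is penalized and `model_a` is selected.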
Abstract:
Analysis by synthesis calculates a difference by subtracting (130) synthesized speech from input speech. The synthesized speech is formed by exciting long and short term filters (124, 126) with excitation vectors from a codebook store (114) which is searched by codebook generation (120). A weighting filter (132) is applied to the difference signal and the weighted difference is used to calculate an energy measure (134) which is used to control the codebook search (140). The weighting filter is an Rth-order filter controlled with calculated coefficients. The method for calculating coefficients models the frequency response of L Pth-order filters by a single Rth-order filter, where the order R