Abstract:
A communication device capable of screening speech recognizer input includes a microprocessor (110) connected to communication interface circuitry (115), memory (120), audio circuitry (130), an optional keypad (140), a display (150), and a vibrator/buzzer (160). Audio circuitry (130) is connected to microphone (133) and speaker (135). Microprocessor (110) includes a speech/noise classifier and speech recognition technology. Microprocessor (110) analyzes a speech signal to determine speech waveform parameters within a speech acquisition window. Microprocessor (110) compares the speech waveform parameters to determine whether an error exists in the signal format of the speech signal. Microprocessor (110) informs the user when an error exists in the signal format and instructs the user how to correct the signal format to eliminate the error.
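The screening step described above can be illustrated with a minimal sketch. The abstract does not say which waveform parameters are compared, so the frame energy, peak level, and edge-of-window checks below, along with every threshold and the function name `screen_utterance`, are assumptions chosen for illustration only.

```python
def screen_utterance(samples, frame_len=80, speech_thresh=0.02,
                     clip_level=0.99, edge_frames=2):
    """Return (error, advice) pairs for one speech acquisition window.

    samples: floats in [-1.0, 1.0]. All thresholds are illustrative
    assumptions, not values taken from the abstract.
    """
    # Split the window into non-overlapping frames and compute RMS energy.
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    energies = [(sum(x * x for x in f) / frame_len) ** 0.5 for f in frames]
    # Crude speech/noise classifier: a frame is speech if its RMS is high.
    is_speech = [e > speech_thresh for e in energies]

    problems = []
    if not any(is_speech):
        problems.append(("no_speech",
                         "Please speak louder or closer to the microphone."))
        return problems
    if max(abs(x) for x in samples) >= clip_level:
        problems.append(("too_loud", "Please speak more softly."))
    if any(is_speech[:edge_frames]):
        problems.append(("started_early",
                         "Please wait for the prompt before speaking."))
    if any(is_speech[-edge_frames:]):
        problems.append(("cut_off",
                         "Please finish speaking before the window closes."))
    return problems
```

An empty result would mean the signal format looks acceptable and can be passed to the recognizer; otherwise each advice string could be shown on the display (150) or played through the speaker (135).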
Abstract:
In a statistically based speech recognition system, one of the key issues is the selection of the Hidden Markov Model that best matches a given sequence of feature observations. The problem is usually addressed by calculating the maximum likelihood (ML) state sequence by means of a Viterbi or other decoder. Noise or inadequate training can produce an ML sequence associated with a Hidden Markov Model other than the correct model. The method of the present invention provides improved robustness by combining the standard ML state sequence score (416) with an additional path score (418) derived from the dynamics of the ML score as a function of time. These two scores, when combined, form a hybrid metric (420) that, when used with the decoder, optimizes selection of the correct Hidden Markov Model (422).
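A rough sketch of how such a hybrid metric might be formed is given below. The abstract does not define the path score, so this example assumes one plausible choice: the negative variance of the frame-to-frame increments of the best-path log-likelihood, which penalizes erratic score trajectories. The names `path_score`, `pick_model`, and `weight` are hypothetical.

```python
def path_score(trace):
    """Assumed dynamics measure: negative variance of the per-frame
    increments of the cumulative best-path log-likelihood."""
    incs = [b - a for a, b in zip(trace, trace[1:])]
    mean = sum(incs) / len(incs)
    return -sum((d - mean) ** 2 for d in incs) / len(incs)


def pick_model(traces, weight=1.0):
    """Choose the model with the highest hybrid metric.

    traces: dict mapping a model name to its cumulative ML score per
    frame (at least two frames), as a Viterbi decoder might report it.
    weight: assumed mixing factor between the two scores.
    """
    def hybrid(name):
        trace = traces[name]
        ml_score = trace[-1]  # standard ML state sequence score
        return ml_score + weight * path_score(trace)

    return max(traces, key=hybrid)
```

With `weight=0.0` the selection reduces to the ordinary ML decision, which makes it easy to compare the two behaviors on the same decoder output.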
Abstract:
A method of reconstructing speech input at a communication device comprises receiving, at the communication device, encoded data that includes encoded spectral data and encoded energy data of the speech input, the encoded spectral data being encoded as a series of mel-frequency cepstral coefficients. The method further comprises decoding, at the communication device, the encoded spectral data and encoded energy data to determine the spectral data and energy data, wherein decoding comprises: performing an inverse discrete cosine transform on the mel-frequency cepstral coefficients at harmonic mel-frequencies corresponding to a pitch period of the speech input to determine log-spectral magnitudes of the speech input at the harmonic mel-frequencies, and exponentiating the log-spectral magnitudes to determine the spectral magnitudes of the speech input. The method also comprises combining the spectral data and energy data to reconstruct the speech input at the communication device. A communication device for use in a distributed speech recognition system is also disclosed.
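The decoding step can be sketched as follows. This is a simplified illustration under several assumptions that the abstract does not state: the coefficients come from an unnormalized DCT-II over `n_bands` log mel-filterbank outputs, the mel axis is mapped linearly onto the DCT index range, and the common `2595 * log10(1 + f / 700)` mel formula applies. The function names and default parameters are hypothetical.

```python
import math


def hz_to_mel(f):
    """Common mel-scale formula (assumed; the abstract names none)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def harmonic_magnitudes(mfcc, pitch_period, sample_rate=8000, n_bands=23):
    """Evaluate the inverse DCT at each pitch harmonic and exponentiate.

    mfcc: cepstral coefficients C0..C(K-1).
    pitch_period: pitch period in samples.
    """
    f0 = sample_rate / pitch_period
    nyquist_mel = hz_to_mel(sample_rate / 2)
    mags = []
    h = 1
    while h * f0 < sample_rate / 2:
        # Position of this harmonic on the mel axis, normalized to [0, 1).
        u = hz_to_mel(h * f0) / nyquist_mel
        # Inverse DCT evaluated at a fractional band position.
        log_mag = mfcc[0] / n_bands + (2.0 / n_bands) * sum(
            c * math.cos(math.pi * k * u)
            for k, c in enumerate(mfcc) if k >= 1)
        mags.append(math.exp(log_mag))
        h += 1
    return mags
```

The returned magnitudes would then be combined with the decoded energy data (for example, by scaling a sinusoidal synthesis) to reconstruct the speech; that stage is omitted here.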
Abstract:
A method for equalizing a speech signal generated within a pressurized air delivery system, the method including the steps of: generating an inhalation noise model (1152) based on inhalation noise; receiving an input signal (802) that includes a speech signal; and equalizing the speech signal (1156) based on the noise model.
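One way to picture equalization driven by a noise model is sketched below. The abstract does not say what form the inhalation noise model takes, so this example assumes an all-pole (LPC) model fitted to a recording of inhalation noise, whose inverse filter is then applied to the speech. The functions `lpc` and `equalize` and the choice of model order are illustrative assumptions.

```python
def lpc(signal, order):
    """Fit A(z) = 1 + a1*z^-1 + ... + ap*z^-p to the signal with the
    autocorrelation method and the Levinson-Durbin recursion."""
    n = len(signal)
    r = [sum(signal[i] * signal[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        return a  # silent input: fall back to the identity filter
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def equalize(speech, noise_model):
    """Apply the inverse (FIR) filter A(z) of the noise model to speech."""
    return [sum(c * speech[n - j]
                for j, c in enumerate(noise_model) if n - j >= 0)
            for n in range(len(speech))]
```

The idea behind this choice is that broadband inhalation noise takes on the spectral coloring of the mask and air path, so flattening that coloring may also undo some of the distortion imposed on the speech.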
Abstract:
A method for characterizing inhalation noise within a pressurized air delivery system, the method including the steps of: generating an inhalation noise model (912, 1012) based on inhalation noise; receiving an input signal (802) that includes inhalation noise comprising at least one inhalation noise burst; comparing (810) the input signal to the noise model to obtain a similarity measure; comparing the similarity measure to at least one threshold (832, 834) to detect the at least one inhalation noise burst; and characterizing (1354, 1356) the at least one detected inhalation noise burst.
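The detection and characterization steps can be sketched as below. The abstract mentions a similarity measure and at least one threshold but does not define either, so this example takes a per-frame similarity sequence as given and assumes the two thresholds act as a hysteresis pair (one to start a burst, a lower one to end it). The function names, threshold values, and the choice of burst duration as the characteristic are all hypothetical.

```python
def detect_bursts(similarity, start_thresh=0.8, end_thresh=0.6):
    """Return (start_frame, end_frame) pairs, end exclusive, for frames
    whose similarity to the noise model indicates an inhalation burst."""
    bursts = []
    start = None
    for i, s in enumerate(similarity):
        if start is None and s >= start_thresh:
            start = i                   # burst begins
        elif start is not None and s < end_thresh:
            bursts.append((start, i))   # burst ends
            start = None
    if start is not None:               # burst still open at end of input
        bursts.append((start, len(similarity)))
    return bursts


def characterize(bursts, frame_ms=10):
    """Summarize detected bursts by count and duration in milliseconds."""
    return {
        "count": len(bursts),
        "durations_ms": [(end - start) * frame_ms for start, end in bursts],
    }
```

For example, `detect_bursts([0.1, 0.9, 0.7, 0.65, 0.3, 0.2, 0.85, 0.9, 0.5])` yields two bursts, and `characterize` reports their durations; a real system might derive further measures such as respiration rate from the burst start times.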