Abstract:
Described herein is an arrangement and method for processing speech information in a speech recognition system (300). The present invention is best employed in a system where the speech information is depicted as words, each word representing a sequence of frames (510), and where the recognition system has means (120) for comparing present input speech to a word template, the word template being stored in template memory and derived from one or more previous input words. The invention describes combining contiguous acoustically similar frames (512) derived from the previous input word or words into representative frames to form a corresponding reduced word template, storing the reduced word template in template memory in an efficient manner, and comparing frames of the present input speech to the representative frames of the reduced word template according to the number of frames combined in the representative frames of the reduced word template. In doing so, a measure of similarity between the present input speech and the word template is generated.
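The combining step can be illustrated with a minimal sketch. The abstract does not state how acoustic similarity is measured or how a representative frame is computed, so the squared Euclidean distance, the running-mean representative, the greedy left-to-right merge, and the names `sq_dist`, `reduce_template`, and `threshold` are all assumptions made for illustration only.

```python
from typing import List, Tuple

Frame = List[float]


def sq_dist(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two feature frames (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reduce_template(frames: List[Frame], threshold: float) -> List[Tuple[Frame, int]]:
    """Greedily merge contiguous similar frames.

    Returns a list of (representative_frame, count) pairs, where count is the
    number of original frames folded into that representative frame.
    """
    reduced: List[Tuple[Frame, int]] = []
    for frame in frames:
        if reduced:
            mean, count = reduced[-1]
            if sq_dist(mean, frame) <= threshold:
                # Fold the frame into the running mean of the current group.
                new_mean = [(m * count + x) / (count + 1) for m, x in zip(mean, frame)]
                reduced[-1] = (new_mean, count + 1)
                continue
        # Start a new group with this frame as its first member.
        reduced.append((list(frame), 1))
    return reduced


if __name__ == "__main__":
    word = [[1.0, 1.0], [1.0, 1.0], [5.0, 5.0]]
    print(reduce_template(word, threshold=0.5))
```

Storing each representative frame together with its count is one plausible reading of "in an efficient manner": the template shrinks while the durational information needed for the later comparison is retained.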
Abstract:
An improved noise suppression system (400) performs speech quality enhancement upon a speech-plus-noise signal available at the input (205) to generate a clean speech signal at the output (265) by spectral gain modification. The noise suppression system includes a background noise estimator (420) which generates and stores an estimate of the background noise power spectral density based upon pre-processed speech (215), as determined by the detected minima of the post-processed speech energy level. This post-processed speech (255) may be obtained directly from the output of the noise suppression system, or may be simulated by multiplying the pre-processed speech energy (225) by the channel gain values of the modification signal (245). The channel gain controller (240) produces these individual channel gain values for application to both the channel gain modifier (250) and the background noise estimator (420). Each individual channel gain value is selected as a function of (a) the channel number, (b) the current channel SNR estimate, and (c) the overall average background noise level. The technique of using the post-processed signal to generate the background noise estimate (325) provides a more accurate measurement of the background noise energy, since it is based upon a much cleaner speech signal. As a result, the present invention performs acoustic noise suppression in high ambient noise backgrounds with significantly less voice quality degradation.
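The simulated post-processed path described above can be sketched as follows. This is only an illustration under stated assumptions: the gain rule in `channel_gain` depends on the channel SNR alone (the dependence on channel number and overall noise level is omitted), gains are treated as energy ratios, and the valley tracker, the constants `alpha`, `rise`, and `margin`, and all function names are hypothetical rather than taken from the abstract.

```python
import math
from typing import List, Tuple


def snr_db(energy: float, noise: float) -> float:
    """Channel SNR estimate in dB from channel energy and noise estimate."""
    return 10.0 * math.log10(max(energy, 1e-12) / max(noise, 1e-12))


def channel_gain(snr: float, floor_db: float = -12.0, knee_db: float = 10.0) -> float:
    """Hypothetical gain rule: unity at or above knee_db, down to floor_db below it."""
    gain_db = min(0.0, max(floor_db, snr - knee_db))
    return 10.0 ** (gain_db / 10.0)  # expressed as an energy ratio (assumption)


def process_frame(
    energies: List[float],
    noise_est: List[float],
    valley: float,
    alpha: float = 0.9,
    rise: float = 1.05,
    margin: float = 2.0,
) -> Tuple[List[float], List[float], float]:
    """Return (gains, updated noise estimate, updated valley level) for one frame."""
    gains = [channel_gain(snr_db(e, n)) for e, n in zip(energies, noise_est)]
    # Simulated post-processed energy: pre-processed energy times channel gain.
    post_total = sum(e * g for e, g in zip(energies, gains))
    # Valley tracker: drops immediately to a new minimum, otherwise creeps upward.
    valley = post_total if post_total < valley else valley * rise
    if post_total <= valley * margin:
        # Frame sits at a minimum of post-processed energy, so treat it as noise
        # and smooth the pre-processed channel energies into the noise estimate.
        noise_est = [alpha * n + (1.0 - alpha) * e for n, e in zip(noise_est, energies)]
    return gains, noise_est, valley


if __name__ == "__main__":
    noise = [1.0, 1.0]
    gains, noise, valley = process_frame([2.0, 2.0], noise, float("inf"))
    print(gains, noise, valley)
```

Because the update decision is made on the gain-modified energy, speech that leaks through at low SNR is attenuated before the minimum is detected, which is the property the abstract credits for the more accurate noise measurement.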
Abstract:
An arrangement and method for processing speech information in a speech recognition system. The present invention is best employed in a system where the speech information is depicted as words, each word representing a sequence of frames, and where the recognition system has means for comparing present input speech to a word template, the word template being stored in template memory (160) and derived from one or more previous input words. The invention describes combining (322) contiguous acoustically similar frames derived from the previous input word or words into representative frames to form a corresponding reduced word template, storing the reduced word template in template memory (160) in an efficient manner, and comparing (326) frames of the present input speech to the representative frames of the reduced word template according to the number of frames combined in the representative frames of the reduced word template. In doing so, a measure of similarity between the present input speech and the word template is generated.
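The comparing step (326) can be illustrated with a deliberately simplified sketch. It assumes a reduced template stored as (representative frame, count) pairs and an input utterance already normalized to the template's original length, so that each representative frame is matched against as many consecutive input frames as were combined into it. A real recognizer would more likely use a time-warping alignment; the linear alignment, the squared Euclidean distance, and the name `template_distance` are assumptions for illustration.

```python
from typing import List, Tuple

Frame = List[float]


def sq_dist(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance between two feature frames (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def template_distance(input_frames: List[Frame], reduced: List[Tuple[Frame, int]]) -> float:
    """Accumulate distance between input frames and a reduced template.

    Each (representative_frame, count) entry is compared against `count`
    consecutive input frames, so frames that stood for longer stretches of
    the original word contribute proportionally more to the score.
    """
    total = sum(count for _, count in reduced)
    if len(input_frames) != total:
        raise ValueError("input must be normalized to the template's original length")
    distance = 0.0
    pos = 0
    for rep, count in reduced:
        for frame in input_frames[pos:pos + count]:
            distance += sq_dist(rep, frame)
        pos += count
    return distance


if __name__ == "__main__":
    template = [([1.0, 1.0], 2), ([5.0, 5.0], 1)]
    spoken = [[1.0, 1.0], [1.0, 2.0], [5.0, 5.0]]
    print(template_distance(spoken, template))
```

A lower accumulated distance indicates greater similarity between the input and the word template; the template with the lowest score across the vocabulary would be the recognition candidate.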