WAVELET-BASED ENERGY BINNING CEPSTRAL FEATURES FOR AUTOMATIC SPEECH RECOGNITION

    Publication No.: CA2290185A1

    Publication Date: 2000-05-30

    Application No.: CA2290185

    Filing Date: 1999-11-22

    Applicant: IBM

    Abstract: Systems and methods for processing acoustic speech signals which utilize the wavelet transform (and alternatively, the Fourier transform) as a fundamental tool. The method essentially involves "synchrosqueezing" spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm. The amplitude, frequency and bandwidth of each of the components are thus extracted. The cepstrum generated from this information is referred to as "K-mean Wastrum." In another aspect, the result of the K-means clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as "formant-based wastrum." Formants are interpolated in unvoiced regions and the contribution of the unvoiced turbulent part of the spectrum is added. This method requires adequate formant tracking. The resulting robust formant extraction has a number of applications in speech processing and analysis, including vocal tract normalization.
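
    The K-means tracking step described in this abstract can be illustrated with a minimal sketch. It is not the patent's implementation: it assumes one frame of the synchrosqueezed plane is already available as (frequency, amplitude) pairs, clusters them by frequency with amplitude weights, and reports each cluster's amplitude, center frequency and a bandwidth taken here as twice the weighted standard deviation. The function name `weighted_kmeans_1d`, the even-spread initialization and the bandwidth definition are all illustrative assumptions.

```python
import math


def weighted_kmeans_1d(freqs, amps, k, iters=50):
    """Cluster one frame's spectral components by frequency, weighted by amplitude.

    Hypothetical helper: the input format and the bandwidth definition
    (2 x weighted standard deviation) are assumptions, not taken from the patent.
    """
    def assign(centers):
        # Put each (frequency, amplitude) pair in the group of its nearest center.
        groups = [[] for _ in centers]
        for f, a in zip(freqs, amps):
            j = min(range(len(centers)), key=lambda i: abs(f - centers[i]))
            groups[j].append((f, a))
        return groups

    # Deterministic initialization: spread centers evenly over the observed range.
    lo, hi = min(freqs), max(freqs)
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]

    for _ in range(iters):
        groups = assign(centers)
        new_centers = []
        for j, group in enumerate(groups):
            weight = sum(a for _, a in group)
            # Amplitude-weighted mean frequency; keep the old center if the group is empty.
            new_centers.append(
                sum(f * a for f, a in group) / weight if weight > 0 else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers

    # Summarize each non-empty cluster as one tracked component.
    components = []
    for center, group in zip(centers, assign(centers)):
        weight = sum(a for _, a in group)
        if weight == 0:
            continue
        variance = sum(a * (f - center) ** 2 for f, a in group) / weight
        components.append(
            {"frequency": center, "amplitude": weight, "bandwidth": 2 * math.sqrt(variance)}
        )
    return components


# Two synthetic groups of components around 500 Hz and 1500 Hz.
frame = weighted_kmeans_1d(
    freqs=[480, 500, 520, 1480, 1500, 1520],
    amps=[1, 2, 1, 1, 2, 1],
    k=2,
)
```

    A cepstrum-like feature vector would then be computed from the per-cluster values; that step is omitted here because the abstract does not specify it.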

    22.
    Invention patent
    Unknown

    Publication No.: AT338305T

    Publication Date: 2006-09-15

    Application No.: AT02747928

    Filing Date: 2002-06-20

    Applicant: IBM

    Abstract: A system and method for intelligent caching and network management includes contextual information representing needs of a user. A contextual system determines settings based on the contextual information and determines services and devices available for the user, in accordance with the contextual information. A predictor receives the contextual information, the settings, the services available and the devices available and predicts the needs of the user to make resources available to the user in accordance with predictions.

    25.
    Invention patent
    Unknown

    Publication No.: DE69937962D1

    Publication Date: 2008-02-21

    Application No.: DE69937962

    Filing Date: 1999-10-01

    Applicant: IBM

    Abstract: A method for conversational computing includes executing code embodying a conversational virtual machine, registering a plurality of input/output resources with a conversational kernel, providing an interface between a plurality of active applications and the conversational kernel processing input/output data, receiving input queries and input events of a multi-modal dialog across a plurality of user interface modalities of the plurality of active applications, generating output messages and output events of the multi-modal dialog in connection with the plurality of active applications, managing, by the conversational kernel, a context stack associated with the plurality of active applications and the multi-modal dialog to transform the input queries into application calls for the plurality of active applications and convert the output messages into speech, wherein the context stack accumulates a context of each of the plurality of active applications.

    WAVELET-BASED ENERGY BINNING CEPSTRAL FEATURES FOR AUTOMATIC SPEECH RECOGNITION

    Publication No.: CA2290185C

    Publication Date: 2005-09-20

    Application No.: CA2290185

    Filing Date: 1999-11-22

    Applicant: IBM

    Abstract: Systems and methods for processing acoustic speech signals which utilize the wavelet transform (and alternatively, the Fourier transform) as a fundamental tool. The method essentially involves "synchrosqueezing" spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm. The amplitude, frequency and bandwidth of each of the components are thus extracted. The cepstrum generated from this information is referred to as "K-mean Wastrum." In another aspect, the result of the K-means clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as "formant-based wastrum." Formants are interpolated in unvoiced regions and the contribution of the unvoiced turbulent part of the spectrum is added. This method requires adequate formant tracking. The resulting robust formant extraction has a number of applications in speech processing and analysis, including vocal tract normalization.
