FAST FREQUENCY-DOMAIN PITCH ESTIMATION
    1.
    Invention publication
    FAST FREQUENCY-DOMAIN PITCH ESTIMATION (granted)

    Publication number: EP1309964A4

    Publication date: 2007-04-18

    Application number: EP01951885

    Application date: 2001-07-12

    Applicant: IBM

    CPC classification number: G10L25/90

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function (130) that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.
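
    The utility-function step can be illustrated with a minimal sketch. It assumes the line spectrum is already available as (frequency, amplitude) pairs and uses a cosine kernel, which is an illustrative choice and not necessarily the function used in the patent. The names `utility` and `estimate_pitch` and the sample spectrum are hypothetical. A plain cosine kernel scores a sub-harmonic as highly as the true pitch, so the candidate range in the example is chosen to exclude sub-harmonics.

```python
import math

def utility(lines, f0):
    """Score how well a line spectrum fits the candidate pitch f0 (Hz).

    lines: list of (frequency_hz, amplitude) pairs.
    The cosine kernel is an assumed choice: it is periodic in
    frequency / f0 and peaks when a line sits on a multiple of f0.
    """
    return sum(a * math.cos(2.0 * math.pi * f / f0) for f, a in lines)

def estimate_pitch(lines, f_min, f_max, step=1.0):
    """Return the candidate in [f_min, f_max] with the highest utility."""
    n = int(round((f_max - f_min) / step)) + 1
    candidates = [f_min + i * step for i in range(n)]
    return max(candidates, key=lambda f0: utility(lines, f0))

# Hypothetical line spectrum: four harmonics of a 200 Hz voice.
lines = [(200.0, 1.0), (400.0, 0.8), (600.0, 0.5), (800.0, 0.3)]
pitch = estimate_pitch(lines, 150.0, 400.0)
```

    Each spectral line contributes its amplitude weighted by how close it lies to a harmonic of the candidate, so the search cost grows with the number of lines times the number of candidates.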

    FAST FREQUENCY-DOMAIN PITCH ESTIMATION
    3.
    Invention application
    FAST FREQUENCY-DOMAIN PITCH ESTIMATION (under examination, published)

    Publication number: WO0207363A3

    Publication date: 2002-05-16

    Application number: PCT/IL0100644

    Application date: 2001-07-12

    CPC classification number: G10L25/90

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval (42), and computing a second transform of the signal to the frequency domain over a second time interval (44), which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function (130) that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative (158), for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function (176, 178).

    Fast frequency-domain pitch estimation

    Publication number: AU7272901A

    Publication date: 2002-01-30

    Application number: AU7272901

    Application date: 2001-07-12

    Applicant: IBM

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.

    5.
    Invention patent
    Unknown

    Publication number: DE60136716D1

    Publication date: 2009-01-08

    Application number: DE60136716

    Application date: 2001-07-12

    Applicant: IBM

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.

    A METHOD FOR TRACKING A PITCH SIGNAL

    Publication number: AU2003282317A1

    Publication date: 2004-07-22

    Application number: AU2003282317

    Application date: 2003-12-03

    Applicant: IBM

    Inventor: CHAZAN DAN

    Abstract: A method for tracking a pitch signal includes receiving a detected pitch signal that consists of a succession of pitch values and, for each current pitch value in the detected signal, performing the following steps: constructing sub-sequences of consistent pitch values from neighboring pitch values; calculating the significance of the sub-sequences; and selecting the sub-sequence, or collection of consistent sub-sequences, with the highest significance. If the current pitch value is not consistent with the sub-sequence with the highest significance, the current pitch value is smoothed by dividing it or multiplying it by an integer value greater than 1, so as to render it consistent with that sub-sequence.
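
    The smoothing step can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented procedure: consistency is taken to mean a relative difference of at most 15%, sub-sequences are runs of consecutive consistent neighbors, significance is simply run length, and the list of neighbors is assumed to be non-empty. The names `consistent`, `best_subsequence`, and `smooth` are hypothetical.

```python
def consistent(a, b, tol=0.15):
    """Assumed rule: two pitch values agree if they differ by at most tol (relative)."""
    return abs(a - b) <= tol * min(a, b)

def best_subsequence(neighbors, tol=0.15):
    """Split neighbors into runs of consecutive consistent values.

    Significance is assumed to be run length, so the longest run wins.
    """
    runs, run = [], []
    for p in neighbors:
        if run and not consistent(run[-1], p, tol):
            runs.append(run)
            run = []
        run.append(p)
    if run:
        runs.append(run)
    return max(runs, key=len)

def smooth(current, neighbors, tol=0.15, max_factor=4):
    """Divide or multiply current by an integer > 1 if that makes it consistent."""
    best_run = best_subsequence(neighbors, tol)
    ref = sum(best_run) / len(best_run)  # reference pitch of the winning run
    if consistent(current, ref, tol):
        return current
    candidates = []
    for k in range(2, max_factor + 1):
        candidates += [current / k, current * k]
    best = min(candidates, key=lambda c: abs(c - ref))
    # Leave the value untouched when no integer factor brings it into line.
    return best if consistent(best, ref, tol) else current

neighbors = [100.0, 102.0, 101.0, 99.0]
doubled = smooth(202.0, neighbors)  # pitch-doubling error
halved = smooth(50.0, neighbors)    # pitch-halving error
```

    Restricting the correction to integer factors targets the octave and harmonic errors typical of pitch detectors while leaving genuine pitch changes alone.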

    Encoding and decoding speech signals

    Publication number: GB2357231A

    Publication date: 2001-06-13

    Application number: GB0023864

    Application date: 2000-09-29

    Applicant: IBM

    Abstract: In a method for encoding a digitized speech signal so as to generate data capable of being decoded as speech, a digitized speech signal is first converted to a series of feature vectors by deriving at successive instances of time, e.g. using ABS and Mel-Binning unit 32, an estimate of the spectral envelope of the digitized speech signal and multiplying each estimate of the spectral envelope by a predetermined set of frequency domain window functions, wherein each window occupies a narrow range of frequencies. The integrals thereof are computed and they or a set of predetermined functions thereof are assigned to respective components of a corresponding feature vector in the series of feature vectors. For each instance of time a respective pitch value of the digitized speech signal is computed at 34, 35, and successive acoustic vectors each containing the respective pitch value and feature vector are compressed so as to derive therefrom a bit stream. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.
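
    The windowing and integration step can be sketched as below. The triangular window shape, the mel-scale formula, the logarithm applied to each integral, and the flat test envelope are all assumptions chosen for illustration; the abstract only requires narrow frequency-domain windows and some predetermined function of the integrals. All function names are hypothetical.

```python
import math

def hz_to_mel(f):
    """A common mel-scale convention (assumed here)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_windows(n_windows, f_max):
    """Return (low, center, high) edges in Hz, evenly spaced on the mel scale."""
    m_max = hz_to_mel(f_max)
    edges = [mel_to_hz(m_max * i / (n_windows + 1)) for i in range(n_windows + 2)]
    return [(edges[i], edges[i + 1], edges[i + 2]) for i in range(n_windows)]

def window_value(f, low, center, high):
    """Triangular window: 0 at the edges, 1 at the center."""
    if f <= low or f >= high:
        return 0.0
    if f <= center:
        return (f - low) / (center - low)
    return (high - f) / (high - center)

def feature_vector(envelope, freqs, windows):
    """Integrate envelope * window over a uniform frequency grid, then take the log."""
    df = freqs[1] - freqs[0]
    feats = []
    for low, center, high in windows:
        integral = df * sum(
            e * window_value(f, low, center, high) for f, e in zip(freqs, envelope)
        )
        feats.append(math.log(integral + 1e-12))  # small offset avoids log(0)
    return feats

# Hypothetical input: a flat spectral envelope sampled every 10 Hz up to 4 kHz.
freqs = [10.0 * i for i in range(401)]
envelope = [1.0] * len(freqs)
features = feature_vector(envelope, freqs, triangular_windows(8, 4000.0))
```

    With a flat envelope each feature reduces to the log of the window area, so the values grow with frequency because mel-spaced windows widen toward the top of the band.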

    METHOD AND SYSTEM FOR SPEECH RECONSTRUCTION FROM SPEECH RECOGNITION FEATURES

    Publication number: IL135192A

    Publication date: 2004-06-20

    Application number: IL13519200

    Application date: 2000-03-21

    Applicant: IBM

    Abstract: A method for speech synthesis includes receiving an input speech signal containing a set of speech segments, and estimating spectral envelopes of the input speech signal in a succession of time intervals during each of the speech segments. The spectral envelopes are integrated over a plurality of window functions in a frequency domain so as to determine elements of feature vectors corresponding to the speech segments. An output speech signal is reconstructed by concatenating the feature vectors corresponding to a sequence of the speech segments.

    Method and system for encoding and decoding speech signals

    Publication number: GB2357231B

    Publication date: 2004-06-09

    Application number: GB0023864

    Application date: 2000-09-29

    Applicant: IBM

    Abstract: A method for encoding a digitized speech signal so as to generate data capable of being decoded as speech. A digitized speech signal is first converted to a series of feature vectors using, for example, known Mel-frequency cepstral coefficient (MFCC) techniques. At successive instances of time a respective pitch value of the digitized speech signal is computed, and successive acoustic vectors each containing the respective pitch value and feature vector are compressed so as to derive therefrom a bit stream. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.

    FAST FREQUENCY-DOMAIN PITCH ESTIMATION

    Publication number: CA2413138A1

    Publication date: 2002-01-24

    Application number: CA2413138

    Application date: 2001-07-12

    Applicant: IBM

    Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval (42), and computing a second transform of the signal to the frequency domain over a second time interval (44), which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function (130) that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative (158), for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function (176, 178).
