-
Publication number: EP1309964A4
Publication date: 2007-04-18
Application number: EP01951885
Filing date: 2001-07-12
Applicant: IBM
Inventor: CHAZAN DAN , ZIBULSKI MEIR , HOORY RON
CPC classification number: G10L25/90
Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function (130) that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.
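The utility-function step in this abstract can be illustrated with a minimal sketch. It assumes the line spectrum (line frequencies and amplitudes) has already been extracted, and it uses a cosine as the periodic function purely for illustration; the abstract does not specify the function's shape. The names `pitch_utility` and `estimate_pitch` and the tie-breaking rule are hypothetical.

```python
import math

def pitch_utility(line_freqs, line_amps, candidate):
    """Compatibility of a line spectrum with one candidate pitch (Hz).

    Each line contributes its amplitude times a function that is periodic
    in (line frequency / candidate) and peaks at integer ratios.
    The cosine is an illustrative assumption, not the patent's function.
    """
    return sum(a * math.cos(2.0 * math.pi * f / candidate)
               for f, a in zip(line_freqs, line_amps))

def estimate_pitch(line_freqs, line_amps, candidates, tol=1e-9):
    """Pick the candidate pitch with the highest utility."""
    scored = [(pitch_utility(line_freqs, line_amps, c), c) for c in candidates]
    best = max(u for u, _ in scored)
    # Sub-multiples of the true pitch score equally well with this utility,
    # so among (near-)maximal candidates take the highest one (assumption).
    return max(c for u, c in scored if u >= best - tol)

# Harmonic lines at 100, 200 and 300 Hz; candidates 50..400 Hz in 1 Hz steps.
candidates = [float(c) for c in range(50, 401)]
pitch = estimate_pitch([100.0, 200.0, 300.0], [1.0, 0.5, 0.25], candidates)
```

Because the utility is periodic in the line frequencies, a candidate at half the true pitch fits every line equally well. The sketch resolves that tie in favour of the higher frequency, which is only one possible policy.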
-
Publication number: JP2004110026A
Publication date: 2004-04-08
Application number: JP2003318910
Filing date: 2003-09-10
Applicant: Internatl Business Mach Corp , インターナショナル・ビジネス・マシーンズ・コーポレーション (International Business Machines Corporation)
Inventor: CHAZAN DAN , KONS ZVI
Abstract: PROBLEM TO BE SOLVED: To provide a method for improving the quality of a compressed speech.
SOLUTION: A speech encoder includes: a pitch detector for determining the pitch frequency of a speech segment; a spectrum estimator for calculating the complex spectrum of the speech segment at the pitch frequency; an envelope encoder for calculating the amplitude of the complex spectrum; a phase aligner for calculating a series of division products by the square of the absolute value of the plurality of complex values of the complex spectrum by excluding a phase period whose frequency is linear from the plurality of complex values of the complex spectrum; and a phase encoder for encoding phase information. The series of division products has minimum total variation and generates uniform phases θ_k.
COPYRIGHT: (C)2004,JPO
-
Publication number: WO0207363A3
Publication date: 2002-05-16
Application number: PCT/IL0100644
Filing date: 2001-07-12
Applicant: IBM , CHAZAN DAN , ZIBULSKI MEIR , HOORY RON
Inventor: CHAZAN DAN , ZIBULSKI MEIR , HOORY RON
CPC classification number: G10L25/90
Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval (42), and computing a second transform of the signal to the frequency domain over a second time interval (44), which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function (130) that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative (158), for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function (176, 178).
-
Publication number: AU7272901A
Publication date: 2002-01-30
Application number: AU7272901
Filing date: 2001-07-12
Applicant: IBM
Inventor: CHAZAN DAN , ZIBULSKI MEIR , HOORY RON
IPC: G10L25/90
Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.
-
Publication number: DE60136716D1
Publication date: 2009-01-08
Application number: DE60136716
Filing date: 2001-07-12
Applicant: IBM
Inventor: CHAZAN DAN , ZIBULSKI MEIR , HOORY RON
IPC: G10L25/90
Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval, and computing a second transform of the signal to the frequency domain over a second time interval, which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative, for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function.
-
Publication number: AU2003282317A1
Publication date: 2004-07-22
Application number: AU2003282317
Filing date: 2003-12-03
Applicant: IBM
Inventor: CHAZAN DAN
Abstract: A method for tracking a pitch signal includes receiving a detected pitch signal that consists of a succession of pitch values and, for each current pitch value in the detected signal, performing the following steps. Sub-sequences of consistent pitch values are constructed from neighboring pitch values. The significance of the sub-sequences is calculated, and the sub-sequence, or collection of consistent sub-sequences, with the highest significance is selected. If the current pitch value is not consistent with the sub-sequence with the highest significance, the current pitch value is smoothed by dividing it or multiplying it by an integer value greater than 1, so as to render it consistent with that sub-sequence.
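The smoothing step in this abstract can be sketched as follows. The abstract does not define "consistent" or "significance", so the sketch assumes a relative tolerance for consistency and uses run length as the significance measure. All function names, the tolerance and the maximum integer factor are hypothetical.

```python
from statistics import median

def consistent(a, b, tol=0.15):
    """Assumed test: two pitch values agree within a relative tolerance."""
    return abs(a - b) <= tol * min(a, b)

def most_significant_run(pitches, tol=0.15):
    """Split neighbouring pitch values into runs of consistent values
    and return the longest run (significance = run length, an assumption)."""
    runs, current = [], [pitches[0]]
    for p in pitches[1:]:
        if consistent(current[-1], p, tol):
            current.append(p)
        else:
            runs.append(current)
            current = [p]
    runs.append(current)
    return max(runs, key=len)

def smooth_pitch(current, neighbours, tol=0.15, max_factor=4):
    """Correct pitch doubling/halving errors against the dominant run."""
    ref = median(most_significant_run(neighbours, tol))
    if consistent(current, ref, tol):
        return current
    # Try multiplying and dividing by integers greater than 1.
    options = [current * k for k in range(2, max_factor + 1)]
    options += [current / k for k in range(2, max_factor + 1)]
    best = min(options, key=lambda c: abs(c - ref))
    # Leave the value unchanged if no integer factor makes it consistent.
    return best if consistent(best, ref, tol) else current

neighbours = [100.0, 102.0, 101.0, 99.0, 100.0]
fixed_octave_up = smooth_pitch(200.0, neighbours)    # divided by 2
fixed_octave_down = smooth_pitch(51.0, neighbours)   # multiplied by 2
```

Using the median of the dominant run as the reference is a design choice of this sketch; it keeps a single outlier among the neighbours from shifting the reference.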
-
Publication number: GB2357231A
Publication date: 2001-06-13
Application number: GB0023864
Filing date: 2000-09-29
Applicant: IBM
Inventor: HOORY RON , CHAZAN DAN , SILVERA EZRA , ZILBULSKI MEIR
Abstract: In a method for encoding a digitized speech signal so as to generate data capable of being decoded as speech, a digitized speech signal is first converted to a series of feature vectors by deriving at successive instances of time, e.g. using ABS and Mel-Binning unit 32, an estimate of the spectral envelope of the digitized speech signal and multiplying each estimate of the spectral envelope by a predetermined set of frequency domain window functions, wherein each window occupies a narrow range of frequencies. The integrals of these products are computed, and the integrals, or a set of predetermined functions of them, are assigned to respective components of a corresponding feature vector in the series of feature vectors. For each instance of time a respective pitch value of the digitized speech signal is computed at 34, 35, and successive acoustic vectors, each containing the respective pitch value and feature vector, are compressed so as to derive therefrom a bit stream. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.
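The windowed-integration step in this abstract can be sketched roughly as below. The sketch assumes triangular windows equally spaced on the common mel scale, a uniformly sampled envelope, a Riemann-sum integral and a logarithm as the "predetermined function"; none of these specifics come from the abstract, and all names are hypothetical.

```python
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_windows(n_bins, f_max):
    """(lo, centre, hi) edges in Hz of n_bins triangles, equally spaced in mel."""
    m_max = hz_to_mel(f_max)
    pts = [mel_to_hz(m_max * i / (n_bins + 1)) for i in range(n_bins + 2)]
    return [(pts[i], pts[i + 1], pts[i + 2]) for i in range(n_bins)]

def window_value(f, lo, centre, hi):
    """Triangular window: 0 at lo and hi, 1 at centre."""
    if lo < f <= centre:
        return (f - lo) / (centre - lo)
    if centre < f < hi:
        return (hi - f) / (hi - centre)
    return 0.0

def binned_features(envelope, f_max, n_bins=8):
    """envelope: spectral-envelope samples at uniform frequencies in [0, f_max]."""
    df = f_max / (len(envelope) - 1)
    feats = []
    for lo, centre, hi in triangular_windows(n_bins, f_max):
        # Riemann-sum approximation of the integral of envelope * window.
        integral = sum(e * window_value(i * df, lo, centre, hi) * df
                       for i, e in enumerate(envelope))
        # The log is one possible "predetermined function" (assumption).
        feats.append(math.log(integral + 1e-12))
    return feats

# A flat envelope: each feature is then the log of the window's area.
features = binned_features([1.0] * 257, f_max=4000.0)
```

In the scheme the abstract describes, such a feature vector would be paired with the pitch value for the same time instant before compression.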
-
Publication number: IL135192A
Publication date: 2004-06-20
Application number: IL13519200
Filing date: 2000-03-21
Applicant: IBM
Inventor: CHAZAN DAN , COHEN GILAD , HOORY RON
Abstract: A method for speech synthesis includes receiving an input speech signal containing a set of speech segments, and estimating spectral envelopes of the input speech signal in a succession of time intervals during each of the speech segments. The spectral envelopes are integrated over a plurality of window functions in a frequency domain so as to determine elements of feature vectors corresponding to the speech segments. An output speech signal is reconstructed by concatenating the feature vectors corresponding to a sequence of the speech segments.
-
Publication number: GB2357231B
Publication date: 2004-06-09
Application number: GB0023864
Filing date: 2000-09-29
Applicant: IBM
Inventor: HOORY RON , CHAZAN DAN , SILVERA EZRA , ZILBULSKI MEIR
Abstract: In a method for encoding a digitized speech signal so as to generate data capable of being decoded as speech, a digitized speech signal is first converted to a series of feature vectors using, for example, known Mel-frequency cepstral coefficient (MFCC) techniques. At successive instances of time a respective pitch value of the digitized speech signal is computed, and successive acoustic vectors, each containing the respective pitch value and feature vector, are compressed so as to derive therefrom a bit stream. A suitable decoder reverses the operation so as to extract the feature vectors and pitch values, thus allowing speech reproduction and playback. In addition, speech recognition is possible using the decompressed feature vectors, with no impairment of the recognition accuracy and no computational overhead.
-
Publication number: CA2413138A1
Publication date: 2002-01-24
Application number: CA2413138
Filing date: 2001-07-12
Applicant: IBM
Inventor: ZIBULSKI MEIR , CHAZAN DAN , HOORY RON
Abstract: A method for estimating a pitch frequency of an audio signal includes computing a first transform of the signal to a frequency domain over a first time interval (42), and computing a second transform of the signal to the frequency domain over a second time interval (44), which contains the first time interval. A line spectrum of the signal is found, based on the first and second transforms, the spectrum including spectral lines having respective line amplitudes and line frequencies. A utility function (130) that is periodic in the frequencies of the lines in the spectrum is then computed. This function is indicative (158), for each candidate pitch frequency in a given pitch frequency range, of a compatibility of the spectrum with the candidate pitch frequency. The pitch frequency of the speech signal is estimated responsive to the utility function (176, 178).
-