SYSTEM AND METHOD FOR COMBINED FREQUENCY-DOMAIN AND TIME-DOMAIN PITCH EXTRACTION FOR SPEECH SIGNALS
    1.
    Invention publication
    SYSTEM AND METHOD FOR COMBINED FREQUENCY-DOMAIN AND TIME-DOMAIN PITCH EXTRACTION FOR SPEECH SIGNALS (granted)

    Publication (announcement) number: EP1620844A4

    Publication (announcement) date: 2008-10-08

    Application number: EP04758762

    Filing date: 2004-03-31

    Applicant: MOTOROLA INC IBM

    CPC classification number: G10L25/90

    Abstract: A system, computer readable medium, and method for sampling a speech signal; dividing the sampled speech signal into overlapped frames; extracting first pitch information from a frame using frequency domain analysis; providing at least one pitch candidate, each being associated with a spectral score, from the first pitch information, each of the at least one pitch candidate representing a possible pitch estimate for the frame; extracting second pitch information from the frame using a time domain analysis; providing a correlation score for the at least one pitch candidate from the second pitch information; and selecting one of the at least one pitch candidate to represent the pitch estimate of the frame. The system, computer readable medium, and method are suitable for speech coding and for distributed speech recognition.
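
The steps in this abstract (overlapped framing, frequency-domain candidates with spectral scores, a time-domain correlation score per candidate, and a final selection) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the harmonic-sum spectral score, the Hann window, the 60 to 400 Hz search grid, the normalized autocorrelation, the equal-weight combination rule, and all function names.

```python
import math

def frames(signal, size, hop):
    """Split a sampled signal into overlapped frames (hop < size gives overlap)."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def hann(frame):
    """Apply a Hann window to limit spectral leakage (illustrative choice)."""
    n = len(frame)
    return [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i, x in enumerate(frame)]

def magnitude_at(frame, freq, fs):
    """Magnitude of the frame's Fourier transform at a single frequency."""
    w = 2 * math.pi * freq / fs
    re = sum(x * math.cos(w * n) for n, x in enumerate(frame))
    im = sum(x * math.sin(w * n) for n, x in enumerate(frame))
    return math.hypot(re, im)

def spectral_candidates(frame, fs, f_lo=60.0, f_hi=400.0, step=5.0, harmonics=4, top=3):
    """Frequency-domain stage: return up to `top` (f0, spectral_score) candidates.

    The score is a harmonic sum, an assumed stand-in for the patent's analysis.
    """
    windowed = hann(frame)
    grid = [f_lo + i * step for i in range(int((f_hi - f_lo) / step) + 1)]
    scores = [sum(magnitude_at(windowed, k * f0, fs)
                  for k in range(1, harmonics + 1) if k * f0 < fs / 2)
              for f0 in grid]
    # Keep only local maxima so that candidates are distinct peaks.
    peaks = [(scores[i], grid[i]) for i in range(1, len(grid) - 1)
             if scores[i - 1] < scores[i] >= scores[i + 1]]
    if not peaks:
        peaks = [max(zip(scores, grid))]
    peaks.sort(reverse=True)
    return [(f0, score) for score, f0 in peaks[:top]]

def correlation_score(frame, f0, fs):
    """Time-domain stage: normalized autocorrelation at the candidate's lag."""
    lag = round(fs / f0)
    if lag < 1 or lag >= len(frame):
        return 0.0
    a, b = frame[:-lag], frame[lag:]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den > 0 else 0.0

def estimate_pitch(frame, fs):
    """Select one candidate using both scores (equal weights are an assumption)."""
    cands = spectral_candidates(frame, fs)
    best_spec = max(score for _, score in cands) or 1.0
    def combined(cand):
        f0, score = cand
        return 0.5 * score / best_spec + 0.5 * correlation_score(frame, f0, fs)
    return max(cands, key=combined)[0]
```

A call such as `estimate_pitch(frames(samples, 200, 100)[0], 8000)` returns the selected candidate in Hz. The sketch makes no voicing decision, so it also returns a value for silent or unvoiced frames.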

    CLASS QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION
    2.
    Invention publication
    CLASS QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION (granted)

    Publication (announcement) number: EP1595249A4

    Publication (announcement) date: 2007-06-20

    Application number: EP04708622

    Filing date: 2004-02-05

    Applicant: MOTOROLA INC IBM

    CPC classification number: G10L25/93 G10L15/30 G10L25/90 G10L2025/935

    Abstract: A system, method and computer readable medium for quantizing class information and pitch information of audio is disclosed. The method on an information processing system includes receiving audio and capturing a frame of the audio. The method further includes determining a pitch of the frame and calculating a codeword representing the pitch of the frame, wherein a first codeword value indicates an indefinite pitch. The method further includes determining a class of the frame, wherein the class is any one of at least two classes indicating an indefinite pitch and at least one class indicating a definite pitch. The method further includes calculating a codeword representing the class of the frame, wherein the codeword length is the maximum of the minimum number of bits required to represent the at least two classes and the minimum number of bits required to represent the at least one class.
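
The codeword-length rule in the last sentence of this abstract is simple arithmetic, sketched below. One reading, stated here as an assumption, is that the reserved pitch codeword already signals whether the pitch is indefinite, so the class codeword only has to distinguish classes within one group. The function names and the example class counts are hypothetical.

```python
def min_bits(n_values: int) -> int:
    """Minimum number of bits needed to distinguish n_values distinct values."""
    if n_values < 1:
        raise ValueError("need at least one value")
    return (n_values - 1).bit_length()

def class_codeword_length(n_indefinite: int, n_definite: int) -> int:
    """Class codeword length: the larger of the two per-group bit requirements.

    n_indefinite: number of classes indicating an indefinite pitch (at least 2).
    n_definite:   number of classes indicating a definite pitch (at least 1).
    """
    if n_indefinite < 2 or n_definite < 1:
        raise ValueError("abstract requires >= 2 indefinite and >= 1 definite classes")
    return max(min_bits(n_indefinite), min_bits(n_definite))
```

For example, with two indefinite-pitch classes and two definite-pitch classes, `class_codeword_length(2, 2)` gives 1 bit, whereas a single flat code over all four classes would need 2 bits.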

    PITCH QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION
    3.
    Invention publication
    PITCH QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION (granted)

    Publication (announcement) number: EP1595244A4

    Publication (announcement) date: 2006-03-08

    Application number: EP04708630

    Filing date: 2004-02-05

    Applicant: MOTOROLA INC IBM

    CPC classification number: G10L19/09 G10L15/30

    Abstract: A system, method and computer readable medium for quantizing pitch information of audio is disclosed. The method includes capturing audio representing a numbered frame of a plurality of numbered frames. The method further includes calculating a class of the frame, wherein a class is any one of a voiced or unvoiced class. If the frame is a voiced class, a pitch is calculated for the frame. If the frame is an even numbered frame and a voiced class, a codeword of a first length is calculated by absolutely quantizing the frame pitch. If the frame is an odd numbered frame and a voiced class and a reliable frame is available, a codeword of a second length is calculated by differentially quantizing the frame pitch. If there is no reliable frame available, a codeword of the second length is calculated by absolutely quantizing the frame pitch.
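
The even/odd decision logic in this abstract can be sketched as follows. The abstract fixes only the branching; the bit lengths (7 and 5), the pitch-period range, the logarithmic quantizers, the ratio bound for differential coding, and all names below are hypothetical values chosen for illustration. How a frame is judged "reliable" is not stated in the abstract, so the caller supplies the reference pitch.

```python
import math

FIRST_LEN = 7                # bits for even-numbered frames (hypothetical)
SECOND_LEN = 5               # bits for odd-numbered frames (hypothetical)
P_MIN, P_MAX = 19.0, 140.0   # pitch-period range in samples (hypothetical)
MAX_RATIO = 1.2              # largest pitch change coded differentially (hypothetical)

def absolute_quantize(pitch, nbits):
    """Quantize log(pitch) uniformly over [P_MIN, P_MAX] into 2**nbits levels."""
    levels = 2 ** nbits
    pitch = min(max(pitch, P_MIN), P_MAX)
    frac = math.log(pitch / P_MIN) / math.log(P_MAX / P_MIN)
    return min(levels - 1, int(frac * levels))

def differential_quantize(pitch, reference, nbits):
    """Quantize log(pitch / reference), clamped to [1/MAX_RATIO, MAX_RATIO]."""
    levels = 2 ** nbits
    ratio = min(max(pitch / reference, 1.0 / MAX_RATIO), MAX_RATIO)
    span = math.log(MAX_RATIO)
    frac = (math.log(ratio) + span) / (2.0 * span)
    return min(levels - 1, int(frac * levels))

def quantize_pitch(frame_number, voiced, pitch=None, reliable_reference=None):
    """Return (mode, codeword_length, codeword), or None for an unvoiced frame.

    reliable_reference is the pitch of a reliable earlier frame, or None when
    no such frame is available.
    """
    if not voiced:
        return None  # no pitch is calculated for unvoiced frames
    if frame_number % 2 == 0:
        return ("absolute", FIRST_LEN, absolute_quantize(pitch, FIRST_LEN))
    if reliable_reference is not None:
        code = differential_quantize(pitch, reliable_reference, SECOND_LEN)
        return ("differential", SECOND_LEN, code)
    return ("absolute", SECOND_LEN, absolute_quantize(pitch, SECOND_LEN))
```

Coding odd frames relative to a reliable neighbor lets the shorter codeword keep fine resolution, and the absolute fallback at the same shorter length keeps the bit budget per frame pair constant when no reference exists.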

    PITCH QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION
    4.
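    Invention application
    PITCH QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION (under examination, published)

    Publication (announcement) number: WO2004072949A3

    Publication (announcement) date: 2004-12-09

    Application number: PCT/US2004003425

    Filing date: 2004-02-05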

    CPC classification number: G10L19/09 G10L15/30

    Abstract: A system, method and computer readable medium for quantizing pitch information of audio is disclosed. The method includes capturing audio representing a numbered frame of a plurality of numbered frames. The method further includes calculating a class of the frame, wherein a class is any one of a voiced or unvoiced class. If the frame is a voiced class, a pitch is calculated for the frame (903). If the frame is an even numbered frame and a voiced class, a codeword of first length is calculated by absolutely quantizing the frame pitch (910). If the frame is an odd numbered frame and a voiced class and a reliable frame is available, a codeword of a second length is calculated by differentially quantizing the frame pitch (905). If there is no reliable frame available, a codeword of the second length is calculated by absolutely quantizing the frame pitch.

    CLASS QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION
    5.
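    Invention application
    CLASS QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION (under examination, published)

    Publication (announcement) number: WO2004072948A3

    Publication (announcement) date: 2004-12-16

    Application number: PCT/US2004003419

    Filing date: 2004-02-05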

    CPC classification number: G10L25/93 G10L15/30 G10L25/90 G10L2025/935

    Abstract: A system, method and computer readable medium for quantizing class information and pitch information of audio is disclosed. The method on an information processing system includes receiving audio and capturing a frame of the audio. The method further includes determining a pitch of the frame (604) and calculating a codeword representing the pitch of the frame (608), wherein a first codeword value indicates an indefinite pitch. The method further includes determining a class of the frame (610), wherein the class is any one of at least two classes indicating an indefinite pitch (614) and at least one class indicating a definite pitch (618). The method further includes calculating a codeword representing the class of the frame, wherein the codeword length is the maximum of the minimum number of bits required to represent the at least two classes and the minimum number of bits required to represent the at least one class (610).

    SYSTEM AND METHOD FOR COMBINED FREQUENCY-DOMAIN AND TIME-DOMAIN PITCH EXTRACTION FOR SPEECH SIGNALS
    6.
    Invention application
    SYSTEM AND METHOD FOR COMBINED FREQUENCY-DOMAIN AND TIME-DOMAIN PITCH EXTRACTION FOR SPEECH SIGNALS (under examination, published)

    Publication (announcement) number: WO2004090865A3

    Publication (announcement) date: 2005-12-01

    Application number: PCT/US2004010119

    Filing date: 2004-03-31

    CPC classification number: G10L25/90

    Abstract: A system, computer readable medium, and method for sampling a speech signal; dividing the sampled speech signal into overlapped frames; extracting first pitch information from a frame using frequency domain analysis; providing at least one pitch candidate, each being associated with a spectral score, from the first pitch information, each of the at least one pitch candidate representing a possible pitch estimate for the frame; extracting second pitch information from the frame using a time domain analysis; providing a correlation score for the at least one pitch candidate from the second pitch information; and selecting one of the at least one pitch candidate to represent the pitch estimate of the frame. The system, computer readable medium, and method are suitable for speech coding and for distributed speech recognition.

    8.
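    Invention patent
    (title unknown)

    Publication (announcement) number: BRPI0406952A

    Publication (announcement) date: 2006-01-03

    Application number: BRPI0406952

    Filing date: 2004-02-05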

    Applicant: MOTOROLA INC IBM

    Abstract: A system, method and computer readable medium for quantizing class information and pitch information of audio is disclosed. The method on an information processing system includes receiving audio and capturing a frame of the audio. The method further includes determining a pitch of the frame and calculating a codeword representing the pitch of the frame, wherein a first codeword value indicates an indefinite pitch. The method further includes determining a class of the frame, wherein the class is any one of at least two classes indicating an indefinite pitch and at least one class indicating a definite pitch. The method further includes calculating a codeword representing the class of the frame, wherein the codeword length is the maximum of the minimum number of bits required to represent the at least two classes and the minimum number of bits required to represent the at least one class.

    10.
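    Invention patent
    (title unknown)

    Publication (announcement) number: BRPI0406956A

    Publication (announcement) date: 2006-01-03

    Application number: BRPI0406956

    Filing date: 2004-02-05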

    Applicant: MOTOROLA INC IBM

    Abstract: A system, method and computer readable medium for quantizing pitch information of audio is disclosed. The method includes capturing audio representing a numbered frame of a plurality of numbered frames. The method further includes calculating a class of the frame, wherein a class is any one of a voiced or unvoiced class. If the frame is a voiced class, a pitch is calculated for the frame. If the frame is an even numbered frame and a voiced class, a codeword of a first length is calculated by absolutely quantizing the frame pitch. If the frame is an odd numbered frame and a voiced class and a reliable frame is available, a codeword of a second length is calculated by differentially quantizing the frame pitch. If there is no reliable frame available, a codeword of the second length is calculated by absolutely quantizing the frame pitch.
