METHOD AND APPARATUS FOR DOUBLE-TALK DETECTION IN A HANDS-FREE COMMUNICATION SYSTEM
    31.
    Invention application
    Status: Under examination (published)

    Publication number: WO2007062287A2

    Publication date: 2007-05-31

    Application number: PCT/US2006060656

    Application date: 2006-11-08

    CPC classification number: H04M9/082

    Abstract: An echo canceling circuit comprising a double talk detector, an upper band signal filter configured to pass only near-end upper band signals to the double talk detector and remove lower band signals, an adaptive filter circuit, a control circuit operatively coupled to the double talk detector and to the adaptive filter circuit, and a threshold estimator configured to iteratively calculate an upper adaptive decision threshold value and a lower adaptive decision threshold value. The double talk detector declares near-end speech to be present if an estimated power level of the upper band signals exceeds the upper adaptive decision threshold value, and declares the near-end speech to be absent if the estimated power level of the upper band signals falls below the lower adaptive decision threshold value for a predetermined number of iterative cycles.
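
The two-threshold decision rule in this abstract can be sketched as a small state machine. This is a minimal illustration, not the patented implementation: the class name `HysteresisDetector` and the parameter `hold_cycles` are hypothetical, the thresholds are passed in each cycle because the abstract does not say how the threshold estimator computes them, and the "predetermined number of iterative cycles" is assumed to mean consecutive cycles.

```python
from dataclasses import dataclass


@dataclass
class HysteresisDetector:
    """Near-end speech flag with upper/lower thresholds (illustrative only)."""

    hold_cycles: int        # assumed: consecutive low cycles needed to clear the flag
    present: bool = False   # current decision: is near-end speech declared present?
    below_count: int = 0    # consecutive cycles spent below the lower threshold

    def update(self, power: float, upper: float, lower: float) -> bool:
        """Process one cycle of upper-band power and return the decision."""
        if power > upper:
            # Power exceeds the upper threshold: declare near-end speech.
            self.present = True
            self.below_count = 0
        elif power < lower:
            # Power is below the lower threshold: count toward "absent".
            self.below_count += 1
            if self.below_count >= self.hold_cycles:
                self.present = False
        else:
            # Between the thresholds: keep the decision, restart the count
            # (an assumption; the abstract does not cover this case).
            self.below_count = 0
        return self.present
```

Keeping the previous decision between the two thresholds is what stops the flag from toggling on small power fluctuations.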

    CLASS QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION
    32.
    Invention application
    Status: Under examination (published)

    Publication number: WO2004072948A3

    Publication date: 2004-12-16

    Application number: PCT/US2004003419

    Application date: 2004-02-05

    CPC classification number: G10L25/93 G10L15/30 G10L25/90 G10L2025/935

    Abstract: A system, method and computer readable medium for quantizing class information and pitch information of audio is disclosed. The method on an information processing system includes receiving audio and capturing a frame of the audio. The method further includes determining a pitch of the frame (604) and calculating a codeword representing the pitch of the frame (608), wherein a first codeword value indicates an indefinite pitch. The method further includes determining a class of the frame (610), wherein the class is any one of at least two classes indicating an indefinite pitch (614) and at least one class indicating a definite pitch (618). The method further includes calculating a codeword representing the class of the frame, wherein the codeword length is the maximum of the minimum number of bits required to represent the at least two classes and the minimum number of bits required to represent the at least one class (610).
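
The codeword-length rule in the last sentence of this abstract is a small piece of arithmetic, sketched below. The function names are hypothetical, and counting zero bits for a group that contains a single class is one plausible reading of "minimum number of bits", not something the abstract states.

```python
def min_bits(count: int) -> int:
    """Smallest number of bits that can distinguish `count` alternatives."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # (count - 1).bit_length() equals ceil(log2(count)) for count >= 1.
    return (count - 1).bit_length()


def class_codeword_bits(indefinite_classes: int, definite_classes: int) -> int:
    """Class codeword length: the larger of the two per-group bit counts."""
    if indefinite_classes < 2:
        raise ValueError("the abstract requires at least two indefinite-pitch classes")
    return max(min_bits(indefinite_classes), min_bits(definite_classes))
```

Under this reading, two indefinite-pitch classes and two definite-pitch classes need only a one-bit class codeword, presumably because the pitch codeword already tells the decoder which group applies.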

    VOICE QUALITY CONTROL FOR HIGH QUALITY SPEECH RECONSTRUCTION
    33.
    Invention application
    Status: Under examination (published)

    Publication number: WO2007067837A3

    Publication date: 2008-06-05

    Application number: PCT/US2006060935

    Application date: 2006-11-15

    CPC classification number: G10L25/69 G10L15/26

    Abstract: A method and apparatus are provided for reproducing a speech sequence of a user through a communication device of the user. The method includes the steps of detecting a speech sequence from the user through the communication device, recognizing a phoneme sequence within the detected speech sequence and forming a confidence level of each phoneme within the recognized phoneme sequence. The method further includes the steps of audibly reproducing the recognized phoneme sequence for the user through the communication device and gradually highlighting or degrading a voice quality of at least some phonemes of the recognized phoneme sequence based upon the formed confidence level of the at least some phonemes.

    SYSTEM AND METHOD FOR COMBINED FREQUENCY-DOMAIN AND TIME-DOMAIN PITCH EXTRACTION FOR SPEECH SIGNALS
    34.
    Invention application
    Status: Under examination (published)

    Publication number: WO2004095420A3

    Publication date: 2005-06-09

    Application number: PCT/US2004008646

    Application date: 2004-03-19

    CPC classification number: G10L25/90

    Abstract: A system, computer readable medium, and method for sampling a speech signal; dividing the sampled speech signal into overlapped frames; extracting first pitch information from a frame using frequency domain analysis; providing at least one pitch candidate, each being associated with a spectral score, from the first pitch information, each of the at least one pitch candidate representing a possible pitch estimate for the frame; extracting second pitch information from the frame using a time domain analysis; providing a correlation score for the at least one pitch candidate from the second pitch information; and selecting one of the at least one pitch candidate to represent the pitch estimate of the frame. The system, computer readable medium, and method are suitable for speech coding and for distributed speech recognition.
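
The final two steps of this abstract (scoring each frequency-domain candidate in the time domain, then selecting one) can be sketched as follows. The abstract does not say how the correlation score is computed or how the two scores are combined, so the normalized autocorrelation and the weighted sum here are assumptions, as are the names `correlation_score`, `select_pitch`, and `weight`.

```python
import math


def correlation_score(frame, lag):
    """Normalized correlation between the frame and itself delayed by `lag` samples."""
    if not 0 < lag < len(frame):
        raise ValueError("lag must be between 1 and len(frame) - 1")
    current = frame[lag:]
    delayed = frame[:-lag]
    num = sum(x * y for x, y in zip(current, delayed))
    den = math.sqrt(sum(x * x for x in current) * sum(y * y for y in delayed))
    return num / den if den > 0.0 else 0.0


def select_pitch(frame, sample_rate, candidates, weight=0.5):
    """Pick one candidate from (pitch_hz, spectral_score) pairs.

    Assumed rule: maximize weight * spectral + (1 - weight) * correlation.
    """
    def combined(candidate):
        pitch_hz, spectral = candidate
        lag = round(sample_rate / pitch_hz)  # candidate pitch period in samples
        return weight * spectral + (1.0 - weight) * correlation_score(frame, lag)

    return max(candidates, key=combined)[0]
```

For a frame holding a clean 100 Hz tone sampled at 8 kHz, a 100 Hz candidate (lag 80) correlates strongly with the delayed frame, so it wins over a 150 Hz candidate with the same spectral score.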

    AN ADAPTIVE EQUALIZER FOR A CODED SPEECH SIGNAL
    35.
    Invention application
    Status: Under examination (published)

    Publication number: WO2007047037A3

    Publication date: 2009-04-09

    Application number: PCT/US2006037408

    Application date: 2006-09-26

    CPC classification number: G10L19/26

    Abstract: A speech communication system provides a speech encoder [100] that generates a set of coded parameters representative of the desired speech signal characteristics. The speech communication system also provides a speech decoder [200] that receives the set of coded parameters to generate reconstructed speech. The speech decoder includes an equalizer [204] that computes a matching set of parameters from the reconstructed speech [301] generated by the speech decoder [200], undoes the set of characteristics corresponding to the computed set of parameters, and imposes the set of characteristics corresponding to the coded set of parameters, thereby producing equalized reconstructed speech [306].

    METHOD AND APPARATUS FOR DOUBLE-TALK DETECTION IN A HANDS-FREE COMMUNICATION SYSTEM
    36.
    Invention application
    Status: Under examination (published)

    Publication number: WO2007062287A8

    Publication date: 2008-08-21

    Application number: PCT/US2006060656

    Application date: 2006-11-08

    CPC classification number: H04M9/082

    Abstract: An echo canceling circuit comprising a double talk detector, an upper band signal filter configured to pass only near-end upper band signals to the double talk detector and remove lower band signals, an adaptive filter circuit, a control circuit operatively coupled to the double talk detector and to the adaptive filter circuit, and a threshold estimator configured to iteratively calculate an upper adaptive decision threshold value and a lower adaptive decision threshold value. The double talk detector declares near-end speech to be present if an estimated power level of the upper band signals exceeds the upper adaptive decision threshold value, and declares the near-end speech to be absent if the estimated power level of the upper band signals falls below the lower adaptive decision threshold value for a predetermined number of iterative cycles.

    SYSTEM AND METHOD FOR COMBINED FREQUENCY-DOMAIN AND TIME-DOMAIN PITCH EXTRACTION FOR SPEECH SIGNALS
    37.
    Invention application
    Status: Under examination (published)

    Publication number: WO2004090865A3

    Publication date: 2005-12-01

    Application number: PCT/US2004010119

    Application date: 2004-03-31

    CPC classification number: G10L25/90

    Abstract: A system, computer readable medium, and method for sampling a speech signal; dividing the sampled speech signal into overlapped frames; extracting first pitch information from a frame using frequency domain analysis; providing at least one pitch candidate, each being associated with a spectral score, from the first pitch information, each of the at least one pitch candidate representing a possible pitch estimate for the frame; extracting second pitch information from the frame using a time domain analysis; providing a correlation score for the at least one pitch candidate from the second pitch information; and selecting one of the at least one pitch candidate to represent the pitch estimate of the frame. The system, computer readable medium, and method are suitable for speech coding and for distributed speech recognition.

    METHOD AND APPARATUS FOR SPEECH CODING
    38.
    Invention publication
    Status: Under examination (published)

    Publication number: EP1697925A4

    Publication date: 2009-07-08

    Application number: EP04814785

    Application date: 2004-12-17

    Applicant: MOTOROLA INC

    CPC classification number: G10L19/09

    Abstract: A method (Fig. 9) and apparatus (500, 600) for prediction in a speech-coding system extends a first-order long-term predictor (LTP) filter, using a sub-sample resolution delay, to a multi-tap LTP filter (504, 604). From another perspective, a conventional integer-sample resolution multi-tap LTP filter is extended to use sub-sample resolution delay. Such a multi-tap LTP filter offers a number of advantages over the prior art. In particular, defining the lag with sub-sample resolution makes it possible to explicitly model the delay values that have a fractional component, within the limits of resolution of the over-sampling factor used by the interpolation filter. The coefficients (β_i's) of the multi-tap LTP filter are thus largely freed from modeling the effect of delays that have a fractional component. Consequently, their main function is to maximize the prediction gain of the LTP filter by modeling the degree of periodicity that is present and by imposing spectral shaping.
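
The idea of a multi-tap predictor whose lag has a fractional part can be sketched as below. This is an illustration under stated assumptions, not the filter from the patent: linear interpolation stands in for the over-sampling interpolation filter the abstract mentions, the taps are assumed to be centered on the lag, and the names `interpolate`, `ltp_predict`, and `betas` are hypothetical.

```python
import math


def interpolate(history, pos):
    """Value of `history` at fractional index `pos`.

    Linear interpolation is a stand-in for a proper interpolation filter.
    """
    base = math.floor(pos)
    if base < 0:
        raise IndexError("position lies before the start of the history")
    frac = pos - base
    if frac == 0.0:
        return history[base]
    return (1.0 - frac) * history[base] + frac * history[base + 1]


def ltp_predict(history, n, lag, betas):
    """Predict sample n from past samples around the fractional delay `lag`.

    `betas` holds an odd number of tap coefficients; the middle tap sits
    exactly `lag` samples before n (an assumed layout).
    """
    half = len(betas) // 2
    return sum(
        beta * interpolate(history, n - lag + (k - half))
        for k, beta in enumerate(betas)
    )
```

Because the fractional part of the delay is handled by `interpolate`, the tap coefficients only need to shape the prediction, which mirrors the division of labor the abstract describes.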

    METHOD AND APPARATUS FOR SUPPRESSING ACOUSTIC BACKGROUND NOISE IN A COMMUNICATION SYSTEM
    39.
    Invention publication
    Status: Under examination (published)

    Publication number: EP1238479A4

    Publication date: 2005-07-27

    Application number: EP00980890

    Application date: 2000-11-30

    Applicant: MOTOROLA INC

    CPC classification number: H04S1/007

    Abstract: A method and apparatus for suppressing acoustic background noise in a communication system. An operating signal-to-noise ratio (SNR) level is reliably evaluated from channel energy (293) and background noise energy (294) values by an SNR level estimator (295). A minimum gain factor and a gain slope are adapted (290) depending on the operating SNR level. Using these adapted values and the channel SNR, the channel gain is selected (233). When the channel SNR is below a certain threshold, the channel is completely noise-like and the gain factor selected is the minimum, so that the channel is maximally attenuated. When the channel SNR is fairly high, the channel gain selected is 0 dB. For intermediate values of channel SNR, the gain factor selected lies between the minimum and 0 dB.
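
The three gain regimes described in this abstract can be sketched as a piecewise function. The linear ramp between the minimum gain and 0 dB is an assumption (the abstract only says the gain lies between the two), and the names `channel_gain_db`, `slope`, and `snr_threshold_db` are hypothetical. How the minimum gain and slope are adapted to the operating SNR is not described, so both are taken as inputs.

```python
def channel_gain_db(snr_db, min_gain_db, slope, snr_threshold_db):
    """Gain in dB for one channel, given that channel's SNR in dB.

    min_gain_db      -- most negative gain allowed (maximum attenuation)
    slope            -- assumed dB of gain added per dB of SNR above the threshold
    snr_threshold_db -- SNR at or below which the channel is treated as noise
    """
    if min_gain_db > 0.0 or slope <= 0.0:
        raise ValueError("expected min_gain_db <= 0 and slope > 0")
    if snr_db <= snr_threshold_db:
        # Noise-like channel: apply the maximum attenuation.
        return min_gain_db
    # Ramp upward from the minimum gain and never amplify beyond 0 dB.
    return min(0.0, min_gain_db + slope * (snr_db - snr_threshold_db))
```

With a minimum gain of -13 dB, a slope of 0.5 and a threshold of 6 dB, a channel at 0 dB SNR receives -13 dB, a channel at 16 dB SNR receives -8 dB, and a channel at 40 dB SNR passes at 0 dB.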

    SYSTEM AND METHOD FOR COMBINED FREQUENCY-DOMAIN AND TIME-DOMAIN PITCH EXTRACTION FOR SPEECH SIGNALS
    40.
    Invention publication
    Status: Granted

    Publication number: EP1620844A4

    Publication date: 2008-10-08

    Application number: EP04758762

    Application date: 2004-03-31

    Applicant: MOTOROLA INC IBM

    CPC classification number: G10L25/90

    Abstract: A system, computer readable medium, and method for sampling a speech signal; dividing the sampled speech signal into overlapped frames; extracting first pitch information from a frame using frequency domain analysis; providing at least one pitch candidate, each being associated with a spectral score, from the first pitch information, each of the at least one pitch candidate representing a possible pitch estimate for the frame; extracting second pitch information from the frame using a time domain analysis; providing a correlation score for the at least one pitch candidate from the second pitch information; and selecting one of the at least one pitch candidate to represent the pitch estimate of the frame. The system, computer readable medium, and method are suitable for speech coding and for distributed speech recognition.
