METHOD OF TRACEBACK MATRIX STORAGE IN A SPEECH RECOGNITION SYSTEM
    4.
    Invention application
    METHOD OF TRACEBACK MATRIX STORAGE IN A SPEECH RECOGNITION SYSTEM (under examination, published)

    Publication number: WO0184534A3

    Publication date: 2002-02-28

    Application number: PCT/US0111965

    Filing date: 2001-04-12

    Applicant: MOTOROLA INC

    CPC classification number: G10L15/08

    Abstract: A device (100) includes a voice recognition system (204, 206, 207, 208) that generates a signal representative of a speech utterance. The utterance is divided into frames (Ft) representative of the utterance. Frames are allocated to states (S1-S5) using an alignment algorithm. A path representing frame-to-state allocations is stored in memory (110) using state transition types identifying a state transition to each state. Lattice traceback information for the voice recognition system is stored and updated by generating a traceback array having a plurality of rows and one or more columns, with each row of the plurality of rows corresponding to one of a plurality of states in which a traceback path terminates, and each column containing one or more dwell counts for states in the traceback path. An optimal state transition path into a given state of the plurality of states is determined, and the generated traceback array is updated in response to the determined optimal state transition path.
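The traceback array described in the abstract can be illustrated with a minimal sketch. It assumes a strictly left-to-right model in which each state is entered either from itself or from the preceding state; the abstract does not specify the topology, the scoring, or the array layout, so the function name `align_with_dwell_counts`, the log-score input, and the list-of-lists representation are all hypothetical. Each row belongs to the state in which a path ends and holds one dwell count (number of frames spent) per state on that path.

```python
import math

def align_with_dwell_counts(frame_scores):
    """Align frames to states and return (best_score, dwell_counts).

    frame_scores[t][s] is an assumed log-score of frame t in state s
    (higher is better). Assumed topology: stay in s, or advance from s-1.
    Returns (-inf, None) if there are fewer frames than states.
    """
    num_states = len(frame_scores[0])
    neg_inf = -math.inf
    # best[s]: score of the best path ending in state s.
    # rows[s]: traceback row for state s, one dwell count per visited state.
    best = [neg_inf] * num_states
    rows = [None] * num_states
    best[0] = frame_scores[0][0]
    rows[0] = [1]
    for t in range(1, len(frame_scores)):
        new_best = [neg_inf] * num_states
        new_rows = [None] * num_states
        for s in range(num_states):
            stay = best[s]
            advance = best[s - 1] if s > 0 else neg_inf
            if stay == neg_inf and advance == neg_inf:
                continue  # state s is not reachable yet
            if stay >= advance:
                # Self-transition: increment the dwell count of state s.
                row = rows[s][:-1] + [rows[s][-1] + 1]
                score = stay
            else:
                # Entry from s-1: copy that row and start a new dwell count.
                row = rows[s - 1] + [1]
                score = advance
            new_best[s] = score + frame_scores[t][s]
            new_rows[s] = row
        best, rows = new_best, new_rows
    return best[-1], rows[-1]
```

Under these assumptions the memory needed is one row per state rather than one back-pointer per frame and state, and the frame-to-state allocation is recovered directly from the dwell counts of the final state's row.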

    METHOD OF TRACEBACK MATRIX STORAGE IN A SPEECH RECOGNITION SYSTEM
    5.
    Invention publication
    METHOD OF TRACEBACK MATRIX STORAGE IN A SPEECH RECOGNITION SYSTEM (under examination, published)

    Publication number: EP1297525A4

    Publication date: 2005-09-28

    Application number: EP01928488

    Filing date: 2001-04-12

    Applicant: MOTOROLA INC

    CPC classification number: G10L15/08

    Abstract: A device (100) includes a voice recognition system (204, 206, 207, 208) that generates a signal representative of a speech utterance. The utterance is divided into frames (Ft) representative of the utterance. Frames are allocated to states (S1-S5) using an alignment algorithm. A path representing frame-to-state allocations is stored in memory (110) using state transition types identifying a state transition to each state. Lattice traceback information for the voice recognition system is stored and updated by generating a traceback array having a plurality of rows and one or more columns, with each row of the plurality of rows corresponding to one of a plurality of states in which a traceback path terminates, and each column containing one or more dwell counts for states in the traceback path. An optimal state transition path into a given state of the plurality of states is determined, and the generated traceback array is updated in response to the determined optimal state transition path.

    METHOD AND APPARATUS FOR SPEECH RECONSTRUCTION IN A DISTRIBUTED SPEECH RECOGNITION SYSTEM
    6.
    Invention publication
    METHOD AND APPARATUS FOR SPEECH RECONSTRUCTION IN A DISTRIBUTED SPEECH RECOGNITION SYSTEM (granted)

    Publication number: EP1395978A4

    Publication date: 2005-09-21

    Application number: EP02709089

    Filing date: 2002-01-18

    Applicant: MOTOROLA INC

    CPC classification number: G10L15/30 G10L19/00 G10L19/093 G10L25/18

    Abstract: A method of reconstructing speech input at a communication device comprises receiving, at the communication device, encoded data that includes encoded spectral data and encoded energy data of the speech input, the encoded spectral data being encoded as a series of mel-frequency cepstral coefficients. The method further comprises decoding, at the communication device, the encoded spectral data and encoded energy data to determine the spectral data and energy data, wherein decoding comprises: performing an inverse discrete cosine transform on the mel-frequency cepstral coefficients at harmonic mel-frequencies corresponding to a pitch period of the speech input to determine log-spectral magnitudes of the speech input at the harmonic mel-frequencies, and exponentiating the log-spectral magnitudes to determine the spectral magnitudes of the speech input. The method also comprises combining the spectral data and energy data to reconstruct the speech input at the communication device. A communication device for use in a distributed speech recognition system is also disclosed.
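The decoding step in the abstract (inverse DCT evaluated at harmonic mel-frequencies, followed by exponentiation) can be sketched as follows. Several details are assumptions not stated in the abstract: the coefficients are taken to come from an unnormalized DCT-II of natural-log band energies over `num_bands` bands, the mel scale is the common 2595 log10(1 + f/700) formula, a harmonic's position on the cosine basis is simply its mel value divided by the mel value at the Nyquist frequency, and the pitch is given as a fundamental frequency in Hz (sample rate divided by pitch period). The function names and default values are hypothetical, and the combination with the energy data is not shown.

```python
import math

def hz_to_mel(f_hz):
    # Common mel-scale formula (an assumption; the abstract names no formula).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def harmonic_magnitudes(mfcc, pitch_hz, sample_rate_hz=8000.0, num_bands=23):
    """Estimate spectral magnitudes at the pitch harmonics from MFCCs.

    mfcc: coefficients C_0..C_{K-1}, assumed to be an unnormalized DCT-II
    of natural-log band energies over num_bands mel bands.
    """
    nyquist = sample_rate_hz / 2.0
    mel_max = hz_to_mel(nyquist)
    magnitudes = []
    h = 1
    while h * pitch_hz < nyquist:
        # Normalized mel position of the h-th harmonic, in (0, 1).
        x = hz_to_mel(h * pitch_hz) / mel_max
        # Truncated inverse DCT evaluated at a non-integer position.
        log_mag = mfcc[0]
        for k in range(1, len(mfcc)):
            log_mag += 2.0 * mfcc[k] * math.cos(math.pi * k * x)
        log_mag /= num_bands
        # Exponentiate the log-spectral magnitude.
        magnitudes.append(math.exp(log_mag))
        h += 1
    return magnitudes
```

Because the cosine basis is continuous, the inverse transform can be evaluated at any mel position, which is what allows the spectrum to be sampled exactly at the harmonics of the pitch rather than only at the original band centers.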
