Head mounted multi-sensory audio input system
    1.
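    Published invention application
    Head mounted multi-sensory audio input system (status: in force)
    German title (translated): Head-mounted audio input system with multiple sensors

    Publication number: EP1503368A1

    Publication date: 2005-02-02

    Application number: EP04016226.5

    Filing date: 2004-07-09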

    CPC classification number: H04R1/14 G10L15/20 G10L15/24 G10L25/78 H04R2460/13

    Abstract: The present invention combines a conventional audio microphone with an additional speech sensor that provides a speech sensor signal based on an input. The speech sensor signal is generated based on an action undertaken by a speaker during speech, such as facial movement, bone vibration, throat vibration, throat impedance changes, etc. A speech detector component receives an input from the speech sensor and outputs a speech detection signal indicative of whether a user is speaking. The speech detector generates the speech detection signal based on the microphone signal and the speech sensor signal.
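
The detection step described in this abstract can be illustrated with a minimal sketch. It assumes both signals arrive as frames of float samples and that the detector simply requires both to exceed an energy threshold; the function names, the AND rule, and the threshold values are illustrative assumptions, not details taken from the patent.

```python
from typing import Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one frame of samples (0.0 for an empty frame)."""
    return sum(x * x for x in frame) / len(frame) if frame else 0.0


def detect_speech(
    mic_frame: Sequence[float],
    sensor_frame: Sequence[float],
    mic_threshold: float = 0.01,      # assumed value, for illustration only
    sensor_threshold: float = 0.005,  # assumed value, for illustration only
) -> bool:
    """Report speech only when the microphone AND the speech sensor are active.

    Requiring both signals is one possible fusion rule; it rejects loud
    background noise that reaches the microphone but not the body-coupled sensor.
    """
    mic_active = frame_energy(mic_frame) > mic_threshold
    sensor_active = frame_energy(sensor_frame) > sensor_threshold
    return mic_active and sensor_active
```

With this rule, a loud microphone frame paired with a quiet sensor frame (for example, another person talking nearby) is not reported as the user speaking.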

    Prosodic databases holding fundamental frequency templates for use in speech synthesis
    2.
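    Published invention application
    Prosodic databases holding fundamental frequency templates for use in speech synthesis (status: lapsed)
    Chinese title (translated): Prosodic databases containing fundamental frequency templates for speech synthesis

    Publication number: EP0833304A3

    Publication date: 1999-03-24

    Application number: EP97114208.8

    Filing date: 1997-08-18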

    CPC classification number: G10L13/08 G10L13/04 G10L15/144 G10L2025/903

    Abstract: Prosodic databases hold fundamental frequency templates for use in a speech synthesis system. Prosodic database templates may hold fundamental frequency values for syllables in a given sentence. These fundamental frequency values may be applied in synthesizing a sentence of speech. The templates are indexed by tonal pattern markings. A predicted tonal marking pattern is generated for each sentence of text that is to be synthesized, and this predicted pattern of tonal markings is used to locate a best matching template. The templates are derived by calculating fundamental frequencies on a per-syllable basis for sentences that are spoken by a human trainer for a given unlabeled corpus.
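
The template lookup described here can be sketched as a nearest-match search over templates keyed by tonal markings. The sketch assumes markings are strings with one symbol per syllable ("H" for high, "L" for low) and that "best matching" means the fewest mismatched positions; the marking alphabet, the distance measure, and the frequency values are all hypothetical.

```python
from typing import Dict, List


def pattern_distance(a: str, b: str) -> int:
    """Count mismatched positions; a length difference counts as extra mismatches."""
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches + abs(len(a) - len(b))


def best_template(predicted: str, templates: Dict[str, List[float]]) -> List[float]:
    """Return the per-syllable F0 values (Hz) of the closest tonal pattern."""
    best_key = min(templates, key=lambda key: pattern_distance(predicted, key))
    return templates[best_key]


# Hypothetical database: tonal pattern -> one fundamental frequency per syllable.
TEMPLATES: Dict[str, List[float]] = {
    "HLH": [220.0, 180.0, 210.0],
    "LLH": [170.0, 175.0, 215.0],
    "HHL": [225.0, 220.0, 160.0],
}
```

For example, `best_template("LLL", TEMPLATES)` selects the `"LLH"` entry, which differs from the predicted pattern in only one position.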

    Extensible speech recognition system that provides a user with audio feedback
    3.
    Published invention application
    Extensible speech recognition system that provides a user with audio feedback (status: lapsed)
    German title (translated): Extensible speech recognition system with audio feedback

    Publication number: EP1693827A3

    Publication date: 2007-05-30

    Application number: EP06010060.9

    Filing date: 1998-04-08

    CPC classification number: G10L15/063 G10L2015/0638

    Abstract: A speech recognition system (36) is extensible in that new terms may be added to a list (42) of terms that are recognized by the speech recognition system. The speech recognition system provides audio feedback when new terms are added so that a user may hear how the system expects the word to be pronounced. The user may then accept the pronunciation or provide his own pronunciation. The user may also selectively change the pronunciation of words to avoid misrecognitions by the system. The system may provide appropriate user interface elements for enabling a user to change the pronunciation of words. The system may also include intelligence for automatically changing the pronunciation of words used in recognition based upon empirically derived information.
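
The add-a-term flow in this abstract can be sketched as a small lexicon class. The class name, its methods, and the placeholder letter-to-sound rule (one symbol per letter) are assumptions made for illustration; a real system would play the proposed pronunciation through a speech synthesizer rather than return it.

```python
from typing import Dict, List, Optional


class Lexicon:
    """Hypothetical extensible term list with per-term pronunciations."""

    def __init__(self) -> None:
        self._pronunciations: Dict[str, List[str]] = {}

    def propose(self, term: str) -> List[str]:
        """Placeholder letter-to-sound rule: one symbol per alphabetic character."""
        return [ch for ch in term.lower() if ch.isalpha()]

    def add_term(self, term: str, user_pronunciation: Optional[List[str]] = None) -> None:
        """Store the user's pronunciation if given, else accept the proposal."""
        self._pronunciations[term] = user_pronunciation or self.propose(term)

    def pronunciation(self, term: str) -> List[str]:
        return self._pronunciations[term]
```

Calling `add_term` again with a different `user_pronunciation` overwrites the stored entry, which corresponds to the user selectively changing a pronunciation to avoid misrecognitions.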

    Method and system for selecting alternative words during speech recognition
    5.
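    Published invention application
    Method and system for selecting alternative words during speech recognition (status: lapsed)
    Chinese title (translated): Method and system for selecting alternative words during speech recognition

    Publication number: EP0840289A3

    Publication date: 1999-05-06

    Application number: EP97118377.7

    Filing date: 1997-10-22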

    CPC classification number: G10L15/22

    Abstract: A method and system for editing words that have been misrecognized. The system allows a speaker to specify a number of alternative words to be displayed in a correction window by resizing the correction window. The system also displays the words in the correction window in alphabetical order. A preferred system eliminates the possibility, when a misrecognized word is respoken, that the respoken utterance will be again recognized as the same misrecognized word. The system, when operating with a word processor, allows the speaker to specify the amount of speech that is buffered before transferring to the word processor.
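
The correction-window behavior can be sketched as follows. The sketch assumes the recognizer supplies scored candidates, that the window height in pixels determines how many rows fit, and that the highest-scoring candidates are the ones kept; the function name, the row height, and the selection by score are illustrative assumptions rather than details from the abstract.

```python
from typing import Iterable, List, Tuple


def alternatives_for_window(
    candidates: List[Tuple[str, float]],
    window_height_px: int,
    row_height_px: int = 20,          # assumed row height
    rejected: Iterable[str] = (),     # words the speaker already marked as wrong
) -> List[str]:
    """Pick as many alternatives as fit in the window and list them alphabetically."""
    capacity = max(window_height_px // row_height_px, 0)
    rejected_set = set(rejected)
    # Drop previously misrecognized words so a respoken utterance cannot yield them again.
    allowed = [(word, score) for word, score in candidates if word not in rejected_set]
    top = sorted(allowed, key=lambda pair: pair[1], reverse=True)[:capacity]
    return sorted(word for word, _ in top)
```

Resizing the window changes `window_height_px` and therefore the number of alternatives returned, while the `rejected` argument models the exclusion of the word that was just corrected.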

    Speech recognition method based on word duration modelling
    7.
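    Published invention application
    Speech recognition method based on word duration modelling (status: in force)
    Chinese title (translated): Method of speech recognition by means of word duration modelling

    Publication number: EP1610301A3

    Publication date: 2006-03-15

    Application number: EP05077070.0

    Filing date: 1998-09-16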

    CPC classification number: G10L15/08 G10L15/05

    Abstract: A method of recognizing speech, comprising:

    receiving input data indicative of the speech to be recognized;
    detecting pauses in the speech, based on the input data, to identify a phrase duration;
    generating a plurality of phrase hypotheses representative of likely word phrases represented by the input data between the pauses detected;
    comparing a word duration associated with each word in each phrase hypothesis, based on a number of words in the phrase hypothesis and based on the phrase duration, with an expected word duration for a phrase having a number of words equal to the number of words in the phrase hypothesis; and
    assigning a score to each phrase hypothesis based on the comparison of the word duration with the expected word duration to obtain a most likely phrase hypothesis represented by the input data.
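
The comparison and scoring steps above can be sketched as follows. The sketch assumes the word duration for a hypothesis is approximated as the phrase duration divided by its word count, and that the score is the negative squared z-score against an expected duration for that word count; the table values and the scoring formula are hypothetical, not taken from the claim.

```python
from typing import Dict, List, Tuple

# Hypothetical expected per-word durations in seconds, keyed by the number of
# words in the phrase: (mean, standard deviation). Not values from the patent.
EXPECTED_WORD_DURATION: Dict[int, Tuple[float, float]] = {
    1: (0.60, 0.20),
    2: (0.45, 0.15),
    3: (0.38, 0.12),
    4: (0.33, 0.10),
}


def duration_score(hypothesis: List[str], phrase_duration: float) -> float:
    """Score a hypothesis by how well its implied word duration fits expectations."""
    n_words = len(hypothesis)
    if n_words not in EXPECTED_WORD_DURATION:
        return float("-inf")  # no duration model for this phrase length
    mean, std = EXPECTED_WORD_DURATION[n_words]
    word_duration = phrase_duration / n_words
    z = (word_duration - mean) / std
    return -0.5 * z * z  # higher is better; 0 means a perfect fit


def most_likely_phrase(hypotheses: List[List[str]], phrase_duration: float) -> List[str]:
    """Return the hypothesis whose word durations best match the expectation."""
    return max(hypotheses, key=lambda h: duration_score(h, phrase_duration))
```

In practice such a duration score would likely be combined with acoustic and language model scores; here it is used alone to keep the example short.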

    Method and system of runtime acoustic unit selection for speech synthesis
    8.
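    Published invention application
    Method and system of runtime acoustic unit selection for speech synthesis (status: lapsed)
    Chinese title (translated): Method and system for runtime selection of acoustic units for speech synthesis

    Publication number: EP0805433A2

    Publication date: 1997-11-05

    Application number: EP97107115.4

    Filing date: 1997-04-29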

    CPC classification number: G10L13/07

    Abstract: The present invention pertains to a concatenative speech synthesis system and method which produces more natural-sounding speech. The system provides for multiple instances of each acoustic unit which can be used to generate a speech waveform representing a linguistic expression. The multiple instances are formed during an analysis or training phase of the synthesis process and are limited to a robust representation of the highest probability instances. The provision of multiple instances enables the synthesizer to select the instance which closely resembles the desired instance, thereby eliminating the need to alter the stored instance to match the desired instance. This in essence minimizes the spectral distortion between the boundaries of adjacent instances, thereby producing more natural-sounding speech.
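
Choosing one instance per acoustic unit so that boundary distortion is small can be sketched as a shortest-path search. The sketch assumes each stored instance carries a spectral vector at its start and end, that distortion is the Euclidean distance across a boundary, and that dynamic programming is used to find the cheapest sequence; the data layout and the choice of dynamic programming are assumptions, since the abstract does not name a search method.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Instance = Dict[str, Sequence[float]]  # {"start": spectrum, "end": spectrum}


def boundary_cost(left: Instance, right: Instance) -> float:
    """Spectral mismatch where the end of one instance meets the start of the next."""
    return math.dist(left["end"], right["start"])


def select_instances(units: List[List[Instance]]) -> List[int]:
    """Return one instance index per unit, minimizing total boundary distortion.

    Assumes `units` is non-empty and every unit has at least one instance.
    """
    # best[i][j] = (cheapest total cost ending at instance j of unit i, backpointer)
    best: List[List[Tuple[float, Optional[int]]]] = [[(0.0, None) for _ in units[0]]]
    for i in range(1, len(units)):
        row = []
        for cand in units[i]:
            row.append(min(
                (best[i - 1][k][0] + boundary_cost(prev, cand), k)
                for k, prev in enumerate(units[i - 1])
            ))
        best.append(row)
    # Trace back from the cheapest final instance.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = [j]
    for i in range(len(units) - 1, 0, -1):
        j = best[i][j][1]
        path.append(j)
    return path[::-1]
```

Because the stored instances are used unmodified, the only lever in this sketch is which instance is picked, which mirrors the abstract's point about avoiding alteration of stored instances.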

    Speech recognition method based on word duration modelling
    9.
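    Published invention application
    Speech recognition method based on word duration modelling (status: in force)
    German title (translated): Method for speech recognition by modelling word duration

    Publication number: EP1610301A2

    Publication date: 2005-12-28

    Application number: EP05077070.0

    Filing date: 1998-09-16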

    CPC classification number: G10L15/08 G10L15/05

    Abstract: A method of recognizing speech, comprising:

    receiving input data indicative of the speech to be recognized;
    detecting pauses in the speech, based on the input data, to identify a phrase duration;
    generating a plurality of phrase hypotheses representative of likely word phrases represented by the input data between the pauses detected;
    comparing a word duration associated with each word in each phrase hypothesis, based on a number of words in the phrase hypothesis and based on the phrase duration, with an expected word duration for a phrase having a number of words equal to the number of words in the phrase hypothesis; and
    assigning a score to each phrase hypothesis based on the comparison of the word duration with the expected word duration to obtain a most likely phrase hypothesis represented by the input data.
