System supporting text-to-speech synthesis
    11.
    Invention patent (granted)

    Publication number: JP2008046538A

    Publication date: 2008-02-28

    Application number: JP2006224110

    Application date: 2006-08-21

    CPC classification number: G10L13/04 G10L15/26

    Abstract: PROBLEM TO BE SOLVED: To provide a system for supporting text-to-speech synthesis that can efficiently generate high-quality synthetic speech. SOLUTION: The system supports text-to-speech synthesis and comprises: a learning data generating part, which recognizes input speech and generates first learning data that associates the notation of each phrase with its reading; a frequency data generating part, which creates, from the first learning data, frequency data indicating how often each notation and reading of a phrase appears; and a setting part, which sets the frequency data in a language processing part that generates a reading for the notation of a text based on the appearance frequency of readings, so that the output speech of the text-to-speech synthesis approaches the input speech. COPYRIGHT: (C)2008,JPO&INPIT
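
The frequency data described in this abstract can be pictured as a table of counts over (notation, reading) pairs. The following is a minimal sketch under stated assumptions, not the patented implementation: the function names, the sample pairs, and the "pick the most frequent reading" rule are all hypothetical illustrations.

```python
from collections import Counter, defaultdict


def build_frequency_data(learning_data):
    """Count how often each reading appears for each notation."""
    freq = defaultdict(Counter)
    for notation, reading in learning_data:
        freq[notation][reading] += 1
    return freq


def most_frequent_reading(freq, notation):
    """Return the most frequent reading for a notation, or None if unseen."""
    readings = freq.get(notation)
    if not readings:
        return None
    return readings.most_common(1)[0][0]


# Hypothetical learning data: (notation, reading) pairs such as a speech
# recognizer might produce from the input speech.
learning_data = [
    ("lead", "LEED"),
    ("lead", "LED"),
    ("lead", "LEED"),
    ("wind", "WIND"),
]

freq = build_frequency_data(learning_data)
chosen = most_frequent_reading(freq, "lead")
print(chosen)
```

A language processing part configured with `freq` would then favor the reading that the input speaker actually used most often for each notation.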

    Voice recognition device, its voice recognition method and program
    12.
    Invention patent (granted)

    Publication number: JP2003337594A

    Publication date: 2003-11-28

    Application number: JP2002272318

    Application date: 2002-09-18

    CPC classification number: G10L21/0216 G10L21/028 G10L2021/02166

    Abstract: PROBLEM TO BE SOLVED: To provide a method that efficiently eliminates background noise from sources other than the sound source located in a target direction, thereby achieving highly precise voice recognition, and to provide a system using the method.
    SOLUTION: The angle-specific power distribution observed by steering the directivity of a microphone array toward each candidate sound source direction is approximated by the sum of coefficient multiples of two reference distributions: a reference angle-specific power distribution measured beforehand using a reference sound from the target sound source direction, and a reference angle-specific power distribution of non-directional background sound. A noise suppression processing section uses this approximation to extract only the components from the target sound source direction. When the target sound source direction is unknown, a sound source location searching section estimates it by selecting, from the reference angle-specific power distributions for the various sound source directions, the one that minimizes the approximation residual. Furthermore, a maximum likelihood operation is performed using the voice data of the extracted components from the sound source direction and a voice model obtained by applying a prescribed modeling to the voice data, and voice recognition is performed based on the resulting estimate.
    COPYRIGHT: (C)2004,JPO
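
The decomposition in this abstract, an observed power distribution over steering angles approximated by a weighted sum of a directional reference and a background reference, can be sketched as a two-coefficient least-squares fit. This is an illustration under assumptions, not the patented method: the abstract does not say how the coefficients are computed, so ordinary least squares is a swapped-in choice, and the function names and sample numbers are hypothetical.

```python
def fit_two_components(observed, ref_target, ref_background):
    """Least-squares fit: observed ~= a * ref_target + b * ref_background.

    Returns (a, b, residual), where residual is the sum of squared errors.
    """
    tt = sum(t * t for t in ref_target)
    bb = sum(g * g for g in ref_background)
    tb = sum(t * g for t, g in zip(ref_target, ref_background))
    to = sum(t * o for t, o in zip(ref_target, observed))
    bo = sum(g * o for g, o in zip(ref_background, observed))
    det = tt * bb - tb * tb
    if abs(det) < 1e-12:
        raise ValueError("reference distributions are linearly dependent")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (to * bb - tb * bo) / det
    b = (tt * bo - tb * to) / det
    residual = sum(
        (o - a * t - b * g) ** 2
        for o, t, g in zip(observed, ref_target, ref_background)
    )
    return a, b, residual


def estimate_direction(observed, refs_by_direction, ref_background):
    """Pick the direction whose reference gives the smallest residual."""
    return min(
        refs_by_direction,
        key=lambda d: fit_two_components(
            observed, refs_by_direction[d], ref_background
        )[2],
    )


# Hypothetical reference distributions over five steering angles.
refs_by_direction = {
    0: [1.0, 2.0, 5.0, 2.0, 1.0],
    30: [1.0, 1.0, 2.0, 5.0, 2.0],
}
ref_background = [1.0, 1.0, 1.0, 1.0, 1.0]

# Synthetic observation: 2.0 x (source at 30 degrees) + 0.5 x background.
observed = [2 * t + 0.5 * g for t, g in zip(refs_by_direction[30], ref_background)]

direction = estimate_direction(observed, refs_by_direction, ref_background)
a, b, residual = fit_two_components(observed, refs_by_direction[direction], ref_background)
target_component = [a * t for t in refs_by_direction[direction]]
print(direction, round(a, 3), round(b, 3))
```

Keeping only `a * ref_target` and discarding the background term corresponds to extracting the components from the target direction.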

    VOICE RECOGNITION
    13.
    Invention patent

    Publication number: JPH01102599A

    Publication date: 1989-04-20

    Application number: JP25482187

    Application date: 1987-10-12

    Applicant: IBM

    Abstract: PURPOSE: To facilitate adaptation to an environment different from that at training time by converting adaptation speech into an adaptation label sequence, aligning it with each state or state transition of the corresponding Markov model, and finding the values of the parameters relating to the adaptation label group. CONSTITUTION: The adaptation speech is labeled, and the time-series correspondence is found between the label sequence of the adaptation speech and the states of a Markov model previously estimated from a large amount of speech data. Based on this correspondence, the frequency with which labels correspond to state transitions is newly counted over all the adaptation speech, and the conditional probability between labels and state transitions is estimated from the counts. This conditional probability is then used to convert the previously estimated parameters of the Markov model, yielding new parameters. Consequently, a speech recognition system can be adapted in a short time using a small amount of data.
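
The counting step in this abstract can be sketched as follows. The sketch assumes the alignment between adaptation labels and state transitions has already been computed, and it covers only the count-and-normalize step; the subsequent conversion of the model parameters is not detailed in the abstract and is left out. The transition names, label names, and function name are hypothetical.

```python
from collections import Counter, defaultdict


def conditional_label_probs(aligned_pairs):
    """Estimate P(label | transition) from aligned (transition, label) pairs."""
    counts = defaultdict(Counter)
    for transition, label in aligned_pairs:
        counts[transition][label] += 1

    probs = {}
    for transition, label_counts in counts.items():
        total = sum(label_counts.values())
        probs[transition] = {
            label: n / total for label, n in label_counts.items()
        }
    return probs


# Hypothetical alignment of adaptation labels to Markov model transitions.
aligned_pairs = [
    ("s0->s1", "L3"),
    ("s0->s1", "L3"),
    ("s0->s1", "L7"),
    ("s1->s2", "L2"),
]

probs = conditional_label_probs(aligned_pairs)
print(probs["s0->s1"])
```

Because only counts over a short adaptation utterance are needed, this step is cheap compared with re-estimating the whole model.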

    SYSTEM, PROGRAM, AND CONTROL METHOD FOR SPEECH SYNTHESIS

    Publication number: CA2614840C

    Publication date: 2016-11-22

    Application number: CA2614840

    Application date: 2006-07-10

    Applicant: IBM

    Abstract: The present invention relates to the provision of natural-sounding phonemes and accents for text. There is provided a system that outputs the phonemes and accents of a text. The system has a storage section storing a first corpus in which the spellings, phonemes, and accents of previously input text are recorded separately for each segmentation of the words contained in that text. A text for which phonemes and accents are to be output is acquired, and the first corpus is searched to retrieve, from among sets of contiguous spellings, at least one set of spellings that matches the spellings in the text. Then, a combination of phonemes and accents whose probability of occurrence in the first corpus is higher than a predetermined reference probability is selected as the phonemes and accents of the text.
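
The selection rule in this abstract can be illustrated with a simplified sketch. It assumes a corpus of single-word entries rather than sets of contiguous spellings, and it takes "probability of occurrence" to mean relative frequency among entries with the same spelling; both are assumptions, as are the function name, the sample corpus, and the accent encoding.

```python
from collections import Counter


def select_pronunciation(corpus, spelling, reference_probability):
    """Return the most frequent (phonemes, accent) pair for a spelling if its
    relative frequency exceeds the reference probability, else None."""
    counts = Counter(
        (phonemes, accent)
        for entry_spelling, phonemes, accent in corpus
        if entry_spelling == spelling
    )
    total = sum(counts.values())
    if total == 0:
        return None
    best, n = counts.most_common(1)[0]
    return best if n / total > reference_probability else None


# Hypothetical corpus entries: (spelling, phonemes, accent).
corpus = [
    ("bass", "B EY S", "1"),
    ("bass", "B EY S", "1"),
    ("bass", "B AE S", "1"),
    ("tear", "T IH R", "1"),
]

confident = select_pronunciation(corpus, "bass", 0.5)
uncertain = select_pronunciation(corpus, "bass", 0.9)
print(confident, uncertain)
```

Returning `None` when no candidate clears the threshold leaves room for a fallback, for example a second source of pronunciations.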

    16.
    Invention patent
    Unknown

    Publication number: DE69010722D1

    Publication date: 1994-08-25

    Application number: DE69010722

    Application date: 1990-03-07

    Applicant: IBM

    Abstract: The present invention relates to a speech recognition system comprising means (4) for performing a frequency analysis of an input speech in a succession of time periods to obtain feature vectors, means (8) for producing a corresponding label train using a vector quantization code book (9), means (11) for matching a plurality of word baseforms, expressed by a train of Markov models each corresponding to labels, with said label train, means (14) for recognizing the input speech on the basis of the matching result, and means (5, 6, 7, 9) for performing an adaptation operation on the said system to improve its ability to recognise speech. According to the invention, the speech recognition system is characterised in that the means for performing an adaptation operation comprises means (4) for dividing each of a plurality of input speech words into N segments (N is an integer greater than 1) and producing a representative value of the feature vector of each segment of each input speech word, means for dividing into segments word baseforms each corresponding to one of said input speech words and for producing a representative value of each segment feature vector of each word baseform on the basis of a prototype vector of the vector quantization code book, means for producing a distance vector indicating the distance between a representative value of each segment of each input speech word and a representative value of the corresponding segment of the corresponding word baseform, means for storing the degree of relation between each segment of each input speech word and each label in a label group of the vector quantization code book, and prototype adaptation means for correcting a prototype vector of each label in the label group of the vector quantization code book by each displacement vector in accordance with the degree of relation between the label and the displacement vector.
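
The prototype correction at the end of this abstract can be pictured as shifting each code book prototype by a weighted combination of per-segment displacement vectors. The sketch below is an assumption-laden illustration: the abstract does not give the weighting formula, so normalizing the degrees of relation to sum to one per label is a hypothetical choice, as are the function name and the sample data.

```python
def adapt_prototypes(prototypes, displacements, relation):
    """Shift each prototype by the relation-weighted mean displacement.

    prototypes:    {label: vector}
    displacements: {segment: vector}
    relation:      {(segment, label): non-negative degree of relation}
    """
    adapted = {}
    for label, proto in prototypes.items():
        total = sum(relation.get((seg, label), 0.0) for seg in displacements)
        if total == 0:
            # No segment is related to this label: leave it unchanged.
            adapted[label] = list(proto)
            continue
        shift = [0.0] * len(proto)
        for seg, disp in displacements.items():
            weight = relation.get((seg, label), 0.0) / total
            for i, d in enumerate(disp):
                shift[i] += weight * d
        adapted[label] = [p + s for p, s in zip(proto, shift)]
    return adapted


# Hypothetical two-dimensional prototypes and segment displacements.
prototypes = {"A": [0.0, 0.0], "B": [1.0, 1.0]}
displacements = {0: [1.0, 0.0], 1: [0.0, 2.0]}
relation = {(0, "A"): 3.0, (1, "A"): 1.0}

adapted = adapt_prototypes(prototypes, displacements, relation)
print(adapted)
```

Here label "A" moves mostly along the displacement of segment 0, to which it is most strongly related, while label "B" stays where it was.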

    DEVICE FOR EXTRACTING SPEECH FEATURES, METHOD FOR EXTRACTING SPEECH FEATURES, AND PROGRAM FOR EXTRACTING SPEECH FEATURES

    Publication number: DE112010003461T5

    Publication date: 2012-07-26

    Application number: DE112010003461

    Application date: 2010-07-12

    Applicant: IBM

    Abstract: A technique is provided for extracting features that are more robust to noise, reverberation, and the like. A speech feature extraction device includes difference calculation means for receiving, as input, a spectrum of a speech signal segmented into frames and for calculating, for each frame, a difference of the spectrum between consecutive frames (a difference in the linear domain) as a delta spectrum, and normalization means for normalizing the delta spectrum for the frame by dividing the delta spectrum by a function of an average spectrum. An output of the normalization means is defined as a delta feature.

    AUDIO FEATURE EXTRACTING APPARATUS, AUDIO FEATURE EXTRACTING METHOD, AND AUDIO FEATURE EXTRACTING PROGRAM

    Publication number: GB2485926A

    Publication date: 2012-05-30

    Application number: GB201202741

    Application date: 2010-07-12

    Applicant: IBM

    Abstract: This invention provides a technique for extracting, from audio signals, features that are more robust against noise and/or reverberation. An audio feature extracting apparatus comprises: difference calculating means operative to receive the spectra of framed audio signals and to calculate, as a delta spectrum, the difference in spectrum between the frames preceding and following each frame (a difference in the linear domain); and normalizing means operative to divide the delta spectrum by an average-spectrum function, thereby normalizing the delta spectrum for each frame. The outputs of the normalizing means are used as delta features.
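
The normalized delta feature described in this abstract can be sketched in a few lines. This is a minimal illustration under assumptions: the delta is taken between the following and preceding frame with clamping at the edges, and the "average-spectrum function" is taken to be the plain per-bin mean over all frames. Neither choice is specified in the abstract, and the function name and sample spectra are hypothetical.

```python
def normalized_delta(spectra, eps=1e-10):
    """Linear-domain delta spectrum divided by the per-bin mean spectrum.

    spectra: list of frames, each a list of non-negative spectral values.
    """
    n_frames = len(spectra)
    n_bins = len(spectra[0])
    mean = [
        sum(frame[k] for frame in spectra) / n_frames for k in range(n_bins)
    ]
    deltas = []
    for t in range(n_frames):
        prev_frame = spectra[max(t - 1, 0)]             # clamp at the start
        next_frame = spectra[min(t + 1, n_frames - 1)]  # clamp at the end
        deltas.append(
            [
                (next_frame[k] - prev_frame[k]) / (mean[k] + eps)
                for k in range(n_bins)
            ]
        )
    return deltas


# Hypothetical three-frame, two-bin spectrum.
spectra = [[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]]
deltas = normalized_delta(spectra)

# Applying a constant gain to every frame leaves the feature unchanged.
louder = [[10.0 * v for v in frame] for frame in spectra]
deltas_louder = normalized_delta(louder)
print(deltas)
```

Dividing by the mean spectrum cancels any constant per-bin gain, which is one way to see why such a feature is less sensitive to channel effects than a raw linear-domain difference.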
