Device, method and program for signal enhancement, and device, method and program for speech recognition
    51.
    Invention patent
    Status: under examination (published)

    Publication No.: JP2005249816A

    Publication date: 2005-09-15

    Application No.: JP2004055812

    Filing date: 2004-03-01

    CPC classification number: G10L21/0208

    Abstract: PROBLEM TO BE SOLVED: To provide speech enhancement technology that is effective even against sudden noise having no noise-only section and against unknown sudden noise. SOLUTION: A signal enhancement device comprises spectrum subtracting means 13a, 13b, 15 for subtracting a specified reference signal from an input signal containing a target signal and a noise signal, an adaptive filter 14 applied to the reference signal, and coefficient control means for controlling a filter coefficient of the adaptive filter so as to reduce the noise components of the input signal. The device is further provided with a database 16 holding a signal model that represents a specified quantity of the target signal with a specified statistical model, and it controls the filter coefficient according to the likelihood of the signal model with respect to the output signal of the spectrum subtracting means. COPYRIGHT: (C)2005,JPO&NCIPI
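
The idea in this abstract, choosing the filter coefficient by how well the subtracted output fits a statistical model of the target, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the single scalar gain standing in for the adaptive filter, the diagonal Gaussian standing in for the signal model, and the grid search standing in for whatever update rule the patent actually uses.

```python
import math

def spectral_subtract(x_mag, ref_mag, gain, floor=1e-3):
    """Subtract the scaled reference magnitude from the input, bin by bin.

    `gain` is a hypothetical one-tap stand-in for the adaptive filter;
    `floor` keeps every output bin positive.
    """
    return [max(x - gain * r, floor) for x, r in zip(x_mag, ref_mag)]

def gaussian_loglik(spec, mean, var):
    """Log-likelihood of a spectrum under a diagonal Gaussian target model."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (s - m) ** 2 / v)
        for s, m, v in zip(spec, mean, var)
    )

def pick_gain(x_mag, ref_mag, mean, var, candidates):
    """Return the candidate gain whose subtracted output is most likely
    under the target model (a grid search, assumed for simplicity)."""
    return max(
        candidates,
        key=lambda g: gaussian_loglik(
            spectral_subtract(x_mag, ref_mag, g), mean, var
        ),
    )

# Toy data: the observation is the target plus twice the reference noise.
target = [1.0, 2.0, 1.5]
noise_ref = [0.5, 0.5, 1.0]
observed = [t + 2.0 * n for t, n in zip(target, noise_ref)]

best = pick_gain(observed, noise_ref, mean=target, var=[0.1] * 3,
                 candidates=[0.0, 1.0, 2.0, 3.0])
```

In this toy case the likelihood criterion recovers the gain that was used to mix in the noise, without needing a noise-only section of the input to estimate it from.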

    MARK INSERTION DEVICE AND ITS METHOD

    Publication No.: JP2001083987A

    Publication date: 2001-03-30

    Application No.: JP24331199

    Filing date: 1999-08-30

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To insert punctuation marks at suitable positions in a sentence. SOLUTION: An acoustic processing part 20 processes input voice data and converts it into feature vectors. When automatic punctuation insertion is not executed, a language mark-reproduction part 22 processes the feature vectors using only a general-purpose language model 320 and inserts a punctuation mark only where the voice data explicitly indicates one, for example by the spoken word 'comma'. When automatic punctuation insertion is executed, the language mark-reproduction part 22 uses both the general-purpose language model 320 and a punctuation language model 322 to identify a silent pause as a comma ',' or the like.

    VOICE RECOGNITION EQUIPMENT
    53.
    Invention patent

    Publication No.: JPH0981186A

    Publication date: 1997-03-28

    Application No.: JP23689295

    Filing date: 1995-09-14

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To enable word-unit voice recognition of Japanese. SOLUTION: A user divides a prescribed sentence into words, and the correspondence between the user's words and the form elements (morphemes) of the prescribed sentence is inspected. The user's word-division tendency is judged from this correspondence, and the form elements of an example-sentence database are grouped into words that match this tendency, forming a word group. A word-level N-gram 13 and a word-level acoustic model 11 are built from the word group, and a voice recognition device is constituted using them.

    METHOD AND APPARATUS FOR SPEECH RECOGNITION

    Publication No.: JPH05134695A

    Publication date: 1993-05-28

    Application No.: JP31157191

    Filing date: 1991-10-31

    Applicant: IBM

    Abstract: PURPOSE: To open a start end independently of the constitution of a Markov model and to properly suppress the increase in processing quantity by canceling impossible matching paths based on an intermediate likelihood value in each fine interval. CONSTITUTION: When a leading input label 1 in a tone section is obtained, a likelihood value E on a trellis is found in the vertical direction by the Viterbi algorithm within the range allowed by the inclination limit of the model. The operation is applied to a phenonic word model F for all vocabulary words to be processed (step 27). Each time one frame is processed, the vertical maximum likelihood value Emax(i) on the trellis corresponding to input frame (i) is found, and only values within a fixed range of that maximum are recorded (steps 28 to 31) before the succeeding input frame is processed. That is, a succeeding frame continues processing only for the words and state positions left on the trellis of the preceding frame.
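
The per-frame pruning described here (keep only trellis values within a fixed range of the frame maximum) is commonly called beam pruning. A minimal sketch follows; the function name, the dictionary layout keyed by (word, state), and the use of log-likelihoods are assumptions for illustration, not details taken from the patent.

```python
def prune_frame(scores, beam):
    """Keep only trellis cells whose log-likelihood lies within `beam`
    of the best cell in this frame.

    `scores` maps a hypothetical (word, state) key to the log-likelihood
    accumulated by the Viterbi recursion up to the current frame.
    """
    best = max(scores.values())
    return {cell: s for cell, s in scores.items() if s >= best - beam}

# One frame of toy scores for two vocabulary words.
frame = {
    ("word_a", 0): -1.0,
    ("word_a", 1): -2.5,
    ("word_b", 0): -9.0,
}
survivors = prune_frame(frame, beam=5.0)
```

Only the surviving cells would be extended when the next input frame arrives, which is how the processing quantity stays bounded as the vocabulary grows.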

    SPEECH RECOGNITION DEVICE
    56.
    Invention patent

    Publication No.: JPH0293597A

    Publication date: 1990-04-04

    Application No.: JP24450288

    Filing date: 1988-09-30

    Applicant: IBM JAPAN

    Abstract: PURPOSE: To enable high-accuracy recognition by making the output probability of the identifier of a spectrum variation quantity prototype common among probability models that have the same model identifier regarding spectrum data and a spectrum variation quantity. CONSTITUTION: Markov models in label units having independent label output probabilities are prepared for individual spectra and spectrum variation quantities. When the parameters of the Markov models are estimated, a switching device 14 is switched to the side of a parameter estimation device 16 for the models, and a word base form table 18, in which a sequence of label couples is registered, is used to train the models, thereby determining the parameter values of the parameter table 20. When recognition is carried out, the switching device 14 is switched to the side of the recognition device 17, and input speech is recognized based upon the sequence of label couples, the base form table 18, and parameter table 19. Consequently, high-accuracy recognition is carried out without greatly increasing the amount of computation or storage.

    Information processor, information processing method, information processing system and program
    57.
    Invention patent
    Status: granted

    Publication No.: JP2012159596A

    Publication date: 2012-08-23

    Application No.: JP2011017986

    Filing date: 2011-01-31

    CPC classification number: G10L15/1807 G10L15/05

    Abstract: PROBLEM TO BE SOLVED: To provide an information processor, an information processing method, an information processing system and a program for analyzing a phrase reflecting information that is not recognized explicitly with words. SOLUTION: An information processor 120 uses voice data recording dialogs to identify information that is not clearly specified with words in the voice data, and comprises: an acoustic analysis unit 208 for executing acoustic analysis of the voice data by using acoustic data; a prosodic information acquisition unit 212 for identifying a region of the voice data isolated before and after by a pause, identifying a phrase in the identified region by using the acoustic analysis of that region, and generating one or more prosodic feature values for the phrase, each having a prosodic feature value of the phrase as an element; an appearance frequency acquisition unit 210 for acquiring the appearance frequency, in the voice data, of the phrase acquired by the acoustic analysis unit 208; and a prosodic variation analysis unit 214 for calculating the degree of variation of the prosodic feature value of phrases with a high appearance frequency in the voice data, and determining a feature phrase.
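
The last two units (appearance frequency, then prosodic variation among frequent phrases) can be sketched as below. The abstract does not say how variation is measured or in which direction it marks a feature phrase, so this sketch assumes population variance of a single scalar prosodic value and treats high variance as the marker; the function name and thresholds are likewise hypothetical.

```python
from collections import defaultdict
from statistics import pvariance

def feature_phrases(observations, min_count, min_variance):
    """Return phrases that appear often and whose prosody varies a lot.

    `observations` is a list of (phrase, prosodic_value) pairs, one per
    pause-delimited occurrence; the scalar value is an assumed stand-in
    for the patent's prosodic feature values.
    """
    values = defaultdict(list)
    for phrase, value in observations:
        values[phrase].append(value)
    return sorted(
        phrase
        for phrase, vals in values.items()
        if len(vals) >= min_count and pvariance(vals) >= min_variance
    )

observations = [
    ("yes", 0.20), ("yes", 0.21), ("yes", 0.19),
    ("I see", 0.10), ("I see", 0.60), ("I see", 0.35),
    ("refund", 0.50),
]
selected = feature_phrases(observations, min_count=3, min_variance=0.01)
```

Here "yes" is frequent but prosodically stable, "refund" is too rare to judge, and only "I see" is both frequent and variable.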

    Speech collection method, system and program
    58.
    Invention patent
    Status: granted

    Publication No.: JP2010026361A

    Publication date: 2010-02-04

    Application No.: JP2008189504

    Filing date: 2008-07-23

    Abstract: PROBLEM TO BE SOLVED: To accurately collect speech of only a specified speaker such as a sales person in counter selling or the like. SOLUTION: A speech collection system 10 extracts and collects target speech which is a target in a plurality of pieces of speech in which coming directions are different from each other. The system includes a microphone array 11 including at least first and second microphones 11a and 11b, in which the first and second microphones are arranged by separating them with a predetermined distance. Discrete Fourier transform is performed on each signal of speech received by the first and second microphones, and a plurality of cross spectrum power (CSP) coefficients related to the coming direction of speech are calculated, and a plurality of speech signals are detected from the plurality of CSP coefficients. Then, a speech direction index defined according to an angle between a line for connecting the first and second microphones and the coming direction, is detected from the plurality of calculated CSP coefficients, and the signal of the target speech is extracted from the plurality of speech signals, which are detected from the detected speech direction index. COPYRIGHT: (C)2010,JPO&INPIT
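
CSP coefficients for a two-microphone pair are usually computed as a phase-normalized cross-spectrum transformed back to the lag domain, a technique also known as GCC-PHAT. The sketch below assumes that formulation; the abstract does not give the exact formula, and the function names, the naive O(n²) DFT, and the circular-delay toy signals are illustrative only.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]

def csp_coefficients(sig1, sig2, eps=1e-12):
    """Phase-only cross-correlation of two microphone frames.

    Index t of the result is large when sig2 lags sig1 by t samples
    (circularly, because the DFT is periodic).
    """
    s1, s2 = dft(sig1), dft(sig2)
    cross = [a.conjugate() * b for a, b in zip(s1, s2)]
    whitened = [c / (abs(c) + eps) for c in cross]
    return [v.real for v in idft(whitened)]

def estimate_delay(sig1, sig2):
    """Lag (in samples) at which the CSP coefficients peak."""
    csp = csp_coefficients(sig1, sig2)
    return max(range(len(csp)), key=csp.__getitem__)

# The second microphone hears the same pulse two samples later.
mic1 = [1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
delay = estimate_delay(mic1, mic2)
```

With a known microphone spacing and sampling rate, the peak lag maps to an angle of arrival, which is the role the abstract assigns to the speech direction index.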

    Technique for recognizing accent of input voice
    59.
    Invention patent
    Status: under examination (published)

    Publication No.: JP2008134475A

    Publication date: 2008-06-12

    Application No.: JP2006320890

    Filing date: 2006-11-28

    CPC classification number: G10L15/04 G10L13/04

    Abstract: PROBLEM TO BE SOLVED: To efficiently and accurately recognize the accent of input voice. SOLUTION: Notation data for learning, showing the notation of each phrase of a text for learning; utterance data for learning, showing characteristics of the utterance of each phrase; and boundary data for learning, showing whether or not each phrase is the boundary of an accent phrase, are stored. A candidate of the boundary data is input, and a first likelihood that the accent-phrase boundary of each phrase of the input text coincides with the input candidate is calculated from input notation data showing the notation of the input text (the content of the input voice), the notation data for learning, and the boundary data for learning. A second likelihood that the utterance of each phrase of the input text becomes the utterance indicated by the input utterance data, when the input voice has the accent-phrase boundaries indicated by the candidate of the boundary data, is calculated from input utterance data showing characteristics of the utterance of each phrase of the input voice, the utterance data for learning, and the boundary data for learning. The candidate of the boundary data that maximizes the product of the first likelihood and the second likelihood is searched for, and the result is output. COPYRIGHT: (C)2008,JPO&INPIT
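
The final search step, picking the boundary candidate that maximizes the product of the two likelihoods, can be sketched as follows. The exhaustive enumeration and the independent per-phrase probabilities are assumptions chosen to keep the example small; the abstract does not say how the likelihoods are modeled or how the search is organized, and all names are hypothetical.

```python
from itertools import product
from math import prod

def bernoulli_likelihood(boundary_probs):
    """Build a toy likelihood: each phrase is a boundary independently
    with the given probability (a stand-in for a trained model)."""
    def likelihood(candidate):
        return prod(
            p if is_boundary else 1.0 - p
            for p, is_boundary in zip(boundary_probs, candidate)
        )
    return likelihood

def best_boundaries(n_phrases, first_likelihood, second_likelihood):
    """Exhaustively search boundary candidates for the one that
    maximizes first_likelihood * second_likelihood."""
    candidates = product([False, True], repeat=n_phrases)
    return max(
        candidates,
        key=lambda c: first_likelihood(c) * second_likelihood(c),
    )

# Toy evidence from the text (notation) and from the speech (utterance).
from_notation = bernoulli_likelihood([0.9, 0.2, 0.6])
from_utterance = bernoulli_likelihood([0.8, 0.3, 0.3])
result = best_boundaries(3, from_notation, from_utterance)
```

Note how the third phrase is judged a boundary by the text evidence alone (0.6) but not once the utterance evidence is multiplied in, which is the point of combining the two likelihoods. Exhaustive search grows exponentially with the number of phrases, so a practical system would need a more efficient search.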

    Device for supporting design of voice interface, method and program therefor
    60.
    Invention patent
    Status: granted

    Publication No.: JP2008046318A

    Publication date: 2008-02-28

    Application No.: JP2006221322

    Filing date: 2006-08-14

    CPC classification number: G10L15/063 G10L15/183

    Abstract: PROBLEM TO BE SOLVED: To provide a device for supporting the design of a voice interface that receives a plurality of kinds of voice control, and a program and a method therefor.
    SOLUTION: The device comprises a database for recording speech samples, each associated with one of the plurality of kinds of voice control; a similarity calculation part for calculating the similarity between a first set of speech samples associated with a first voice control and a second set of speech samples associated with a second voice control; and a display part for displaying the similarity between the first set and the second set. The display part preferably displays a graph in which points corresponding to each of the plurality of kinds of voice control are plotted so that the similarity is expressed.
    COPYRIGHT: (C)2008,JPO&INPIT
