Patent search ap:("IBM") AND inv:"NISHIMURA MASAFUMI" Page 1

1.

Invention publication
INFORMATION PROCESSING DEVICE, METHOD, AND PROGRAM FOR OBTAINING WEIGHT PER FEATURE VALUE IN SUBJECTIVE HIERARCHICAL CLUSTERING (Status: under examination, published)
Publication (announcement) number: EP2728518A4

Publication (announcement) date: 2016-07-06

Application number: EP12804076

Application date: 2012-04-13

Applicant: IBM

Inventor: TACHIBANA RYUKI , NAGANO TOHRU , NISHIMURA MASAFUMI , TAKASHIMA RYOICHI

IPC: G06N3/00 , G06F17/30 , G06N99/00 , G10L17/26

CPC classification number: G06N99/005 , G06F17/3002 , G10L17/26

2.

Invention publication
DEVICE FOR LEARNING AMOUNT OF MOVEMENT OF BASIC FREQUENCY FOR ADAPTING TO SPEAKER, BASIC FREQUENCY GENERATION DEVICE, AMOUNT OF MOVEMENT LEARNING METHOD, BASIC FREQUENCY GENERATION METHOD, AND AMOUNT OF MOVEMENT LEARNING PROGRAM (Status: granted)
Publication (announcement) number: EP2357646A4

Publication (announcement) date: 2012-11-21

Application number: EP10780343

Application date: 2010-03-16

Applicant: IBM

Inventor: TACHIBANA RYUKI , NISHIMURA MASAFUMI

IPC: G10L13/08 , G10L13/02 , G10L21/00 , G10L21/013

CPC classification number: G10L13/02 , G10L2021/0135

Abstract: An objective is to provide a technique for accurately reproducing features of a fundamental frequency of a target-speaker's voice on the basis of only a small amount of learning data. A learning apparatus learns shift amounts from a reference source F0 pattern to a target F0 pattern of a target-speaker's voice. The learning apparatus associates a source F0 pattern of a learning text to a target F0 pattern of the same learning text by associating their peaks and troughs. For each of points on the target F0 pattern, the learning apparatus obtains shift amounts in a time-axis direction and in a frequency-axis direction from a corresponding point on the source F0 pattern in reference to a result of the association, and learns a decision tree using, as an input feature vector, linguistic information obtained by parsing the learning text, and using, as an output feature vector, the calculated shift amounts.
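
As a rough illustration of the shift amounts described in this abstract, the following Python sketch computes time-axis and frequency-axis shifts between points on a source and a target F0 pattern that have already been associated. The point representation (time in seconds, log F0), the function name `shift_amounts`, and the sample values are assumptions made for illustration; the peak and trough association and the decision-tree learning are not shown.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (time in seconds, log F0); an assumed representation


def shift_amounts(source: List[Point], target: List[Point]) -> List[Tuple[float, float]]:
    """Return (time shift, frequency shift) for each pair of associated points.

    Assumes source[i] has already been associated with target[i],
    for example by matching peaks and troughs of the two F0 patterns.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same number of points")
    return [(tt - st, tf - sf) for (st, sf), (tt, tf) in zip(source, target)]


# Hypothetical associated extrema of a source and a target F0 pattern.
source_points = [(0.10, 5.0), (0.30, 4.5), (0.55, 5.25)]
target_points = [(0.15, 5.5), (0.30, 4.75), (0.50, 5.5)]
shifts = shift_amounts(source_points, target_points)
```

In the abstract, shift pairs like these serve as the output feature vector of the decision tree, with linguistic information from the parsed learning text as the input.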

3.

Invention patent
SPEECH FEATURE EXTRACTING APPARATUS, SPEECH FEATURE EXTRACTING METHOD, AND SPEECH FEATURE EXTRACTING PROGRAM (Status: unknown)

Publication (announcement) number: GB2485926B

Publication (announcement) date: 2013-06-05

Application number: GB201202741

Application date: 2010-07-12

Applicant: IBM

Inventor: FUKUDA TAKASHI , ICHIKAWA OSAMU , NISHIMURA MASAFUMI

IPC: G10L15/02 , G10L25/24

Abstract: A speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program. A speech feature extraction apparatus includes: a first difference calculation module to: (i) receive, as an input, a spectrum of a speech signal segmented into frames for each frequency bin; and (ii) calculate a delta spectrum for each frame, where the delta spectrum is a difference of the spectrum between consecutive frames for the frequency bin; and a first normalization module to normalize the delta spectrum of the frame for the frequency bin by dividing the delta spectrum by a function of an average spectrum; where the average spectrum is an average of the spectra over all frames of the overall speech for the frequency bin; and where an output of the first normalization module is defined as a first delta feature.
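
The normalization described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the delta is taken as the difference between a frame and the previous frame (zero for the first frame), the "function of an average spectrum" is taken to be the average itself, and the name `normalized_delta` and the `eps` guard are hypothetical.

```python
from typing import List

Spectrum = List[List[float]]  # spectrum[t][k]: frame t, frequency bin k


def normalized_delta(spectrum: Spectrum, eps: float = 1e-12) -> Spectrum:
    """Delta spectrum per frame and bin, divided by the per-bin average spectrum.

    Assumptions: the delta is the difference to the previous frame (zero for
    the first frame), and the normalizer is the plain average over all frames.
    `eps` only guards against division by zero.
    """
    n_frames = len(spectrum)
    n_bins = len(spectrum[0])
    # Average spectrum over all frames, computed separately for each bin.
    avg = [sum(frame[k] for frame in spectrum) / n_frames for k in range(n_bins)]
    out = []
    for t in range(n_frames):
        prev = spectrum[t - 1] if t > 0 else spectrum[t]
        out.append([(spectrum[t][k] - prev[k]) / (avg[k] + eps) for k in range(n_bins)])
    return out


# Three hypothetical frames with two frequency bins each.
frames = [[1.0, 2.0], [3.0, 2.0], [2.0, 8.0]]
features = normalized_delta(frames, eps=0.0)
```

Here the per-bin averages are 2.0 and 4.0, so a change of 2.0 in the first bin and a change of 6.0 in the second bin map to normalized deltas of 1.0 and 1.5.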

4.

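Invention patent
(Status: unknown)

Publication (announcement) number: DE69324428D1

Publication (announcement) date: 1999-05-20

Application number: DE69324428

Application date: 1993-09-28

Applicant: IBM

Inventor: NISHIMURA MASAFUMI , OKOCHI MASAAKI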

IPC: G10L15/06 , G10L15/02 , G10L15/14 , G10L15/18 , G10L5/06 , G10L7/08 , G10L9/06

5.

Invention patent
Speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program (Status: unknown)

Publication (announcement) number: DE112010003461B4

Publication (announcement) date: 2019-09-05

Application number: DE112010003461

Application date: 2010-07-12

Applicant: IBM

Inventor: ICHIKAWA OSAMU , FUKUDA TAKASHI , NISHIMURA MASAFUMI

IPC: G10L21/02 , G10L25/24

Abstract: Apparatus for extracting speech features, the apparatus comprising: a first difference calculation unit (600, 700, 800) for receiving a spectrum for each of a plurality of frequency bins of a speech signal, the speech signal being segmented into frames for each frequency bin, and for calculating, for each frame of each frequency bin, a difference of the spectrum between consecutive frames for the frequency bin as a delta spectrum; and a first normalization unit (605, 710, 810) for normalizing the delta spectrum for each frame of each frequency bin by dividing the delta spectrum by a function of the average spectrum, which is given by a mean of the spectra over all frames representing speech.

6.

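Invention patent
SYSTEM, PROGRAM, AND CONTROL METHOD FOR SPEECH SYNTHESIS (Status: unknown)

Publication (announcement) number: CA2614840A1

Publication (announcement) date: 2007-01-18

Application number: CA2614840

Application date: 2006-07-10

Applicant: IBM

Inventor: NISHIMURA MASAFUMI , MORI SHINSUKE , NEGANO TORU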

IPC: G10L13/08

Abstract: The present invention relates to the provision of natural-sounding phonemes and accents for text. There is provided a system that outputs phonemes and accents of texts. The system has a storage section storing a first corpus in which spellings, phonemes, and accents of a text input beforehand are recorded separately for individual segmentations of the words that are contained in the text. A text for which phonemes and accents are to be output is acquired, and the first corpus is searched to retrieve at least one set of spellings that match the spellings in the text from among sets of contiguous spellings. Then, the combination of a phoneme and an accent that has a higher probability of occurrence in the first corpus than a predetermined reference probability is selected as the phonemes and accent of the text.
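
The selection step in this abstract can be illustrated with a simplified Python sketch. It looks up a single spelling rather than sets of contiguous spellings, treats relative frequency in the corpus as the probability of occurrence, and uses a hypothetical record layout, function name `select_reading`, and toy corpus; none of these details come from the patent itself.

```python
from collections import Counter
from typing import List, Optional, Tuple

Entry = Tuple[str, str, str]  # (spelling, phonemes, accent); an assumed record layout


def select_reading(corpus: List[Entry], spelling: str,
                   reference: float) -> Optional[Tuple[str, str]]:
    """Return the most frequent (phonemes, accent) pair for `spelling` if its
    relative frequency in the corpus exceeds `reference`, otherwise None."""
    counts = Counter((p, a) for s, p, a in corpus if s == spelling)
    total = sum(counts.values())
    if total == 0:
        return None  # spelling does not occur in the corpus
    reading, count = counts.most_common(1)[0]
    return reading if count / total > reference else None


# Hypothetical corpus entries for illustration only.
corpus = [
    ("record", "rekord", "H-L"),
    ("record", "rekord", "H-L"),
    ("record", "rikord", "L-H"),
    ("read", "ri:d", "H"),
]
chosen = select_reading(corpus, "record", reference=0.5)
rejected = select_reading(corpus, "record", reference=0.9)
```

With this toy corpus, the reading `("rekord", "H-L")` occurs in two of three entries for "record", so it passes a reference probability of 0.5 but not one of 0.9.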

7.

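Invention patent
VOICE RECOGNITION APPARATUS (Status: unknown)

Publication (announcement) number: CA1336458C

Publication (announcement) date: 1995-07-25

Application number: CA612649

Application date: 1989-09-22

Applicant: IBM

Inventor: NISHIMURA MASAFUMI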

IPC: G10L11/00 , G10L15/02 , G10L15/10 , G10L15/14 , G10L5/06

Abstract: The invention independently vector-quantizes the spectrum representing the static feature of speech on the frequency axis and the variation pattern of the spectrum on the time axis. The resultant pair of label trains is evaluated, based on the knowledge that there is a small correlation between them, by the equation:

$$P(L_a, L_c \mid W) = \sum_I P(L_a, L_c \mid I, W)\, P(I \mid W)$$
$$= \sum_I P(L_a(1) \mid M_a(i_1))\, P(L_c(1) \mid M_c(i_1))\, P(B_{i_1,i_2} \mid M_a(i_1), M_c(i_1))\, P(L_a(2) \mid M_a(i_2))\, P(L_c(2) \mid M_c(i_2))\, P(B_{i_2,i_3} \mid M_a(i_2), M_c(i_2)) \cdots P(L_a(T) \mid M_a(i_T))\, P(L_c(T) \mid M_c(i_T))\, P(B_{i_T,i_{T+1}} \mid M_a(i_T), M_c(i_T))$$

wherein $W$ designates a Markov model representing a word; $I = i_1, i_2, i_3, \ldots, i_T$, a state train; $M_a$ and $M_c$, Markov models by label corresponding to the spectrum and the spectrum variation, respectively; and $B_{i,j}$, a transition from state $i$ to state $j$. $P(L_a, L_c \mid W)$ is calculated for each Markov model $W$ representing a word, and the $W$ giving the maximum value is determined as the recognition result.

8.

Invention patent
Voice recognition system and method (Status: granted)
Publication (announcement) number: JP2010139963A

Publication (announcement) date: 2010-06-24

Application number: JP2008318403

Application date: 2008-12-15

Applicant: Internatl Business Mach Corp , International Business Machines Corporation

Inventor: KURATA TAKEHITO , ITO NOBUYASU , NISHIMURA MASAFUMI

IPC: G10L15/06 , G10L15/18

Abstract: PROBLEM TO BE SOLVED: To provide a practical system etc. for voice recognition, in which recognition performance is improved by considering utterance variation.
SOLUTION: The system includes a voice recognition device 200 and a pre-processor 100 for creating a recognition graph used for voice recognition processing by the voice recognition device 200. The pre-processor 100 comprises: a language model estimation section 110 for estimating a language model; a recognition word dictionary section 130 holding corresponding information to a word, a phoneme string just in the same description as in the word, and to information on the phoneme string in which utterance variation is described; and a recognition graph creating section 140 for creating a recognition graph on the basis of a language model estimated by a language model estimation section 110, and the correspondence information held by the recognition word dictionary section 130 regarding the word included in the language model. The recognition graph creating section 140 creates the recognition graph by applying the phoneme string considering utterance variation regarding the word with respect to the word included in a word string composed of more than a fixed number of words.
COPYRIGHT: (C)2010,JPO&INPIT

9.

Invention patent
Voice activity detection system, method and program (Status: granted)
Publication (announcement) number: JP2009210617A

Publication (announcement) date: 2009-09-17

Application number: JP2008050537

Application date: 2008-02-29

Applicant: Internatl Business Mach Corp , International Business Machines Corporation

Inventor: FUKUDA TAKASHI , ICHIKAWA OSAMU , NISHIMURA MASAFUMI

IPC: G10L11/02 , G10L11/00 , G10L15/04

CPC classification number: G10L25/93

Abstract: PROBLEM TO BE SOLVED: To provide a highly accurate voice activity detection method in a low S/N environment.
SOLUTION: Voice activity detection is performed by extracting a long-term spectrum variation component and a harmonic structure as feature vectors from a speech signal and increasing the difference in feature vectors between speech and non-speech included in the speech signal by using the long-term spectrum variation component feature, or a long-term spectrum variation component extraction and a harmonic structure feature extraction. The correct rate and the accuracy rate of the voice activity detection are improved over conventional methods by using a long-term spectrum variation component having a window length over an average phoneme duration of an utterance in the speech signal. The voice activity detection system and method provide speech processing, automatic speech recognition, and speech output capable of very accurate voice activity detection.
COPYRIGHT: (C)2009,JPO&INPIT

10.

Invention patent
Technology for creating high quality synthesis voice (Status: under examination, published)
Publication (announcement) number: JP2008185805A

Publication (announcement) date: 2008-08-14

Application number: JP2007019433

Application date: 2007-01-30

Applicant: Internatl Business Mach Corp , International Business Machines Corporation

Inventor: TACHIBANA TAKATERU , NAGANO TORU , NISHIMURA MASAFUMI

IPC: G10L13/08 , G10L13/02

CPC classification number: G10L13/07

Abstract: PROBLEM TO BE SOLVED: To efficiently create high quality synthesis voice by connecting a plurality of phonemes.
SOLUTION: A system comprises: a phoneme storage section for storing a plurality of phoneme data; a synthesis section for creating a voice data which indicates synthesis voice of a text by reading and connecting a phoneme data corresponding to each phoneme, which indicates pronunciation of the input text, from the phoneme storage section; a calculation section for calculating an index value which indicates unnaturalness of the synthesis voice of the text, based on the voice data; a paraphrase storage section for storing a second notation which is paraphrasing of a first notation by relating it to each of the plurality of first notations; a replacing section for replacing the searched notation with the second notation corresponding to the first notation, by searching notation which corresponds to any of the first notation from the text; and a determination section in which the created voice data is output on condition that the calculated index value is smaller than a reference value, and in which the text is input to the synthesis section so that the voice data of the replaced text may be further created, on condition that the index value is the reference value or more.
COPYRIGHT: (C)2008,JPO&INPIT
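
The control flow described in this abstract (synthesize, score unnaturalness, paraphrase, and retry) can be sketched as follows. This is a minimal Python illustration, not the patented implementation: the function name `synthesize_with_paraphrase`, the `max_rounds` limit, and the stand-in synthesizer and scoring function are all assumptions.

```python
from typing import Callable, Dict, Tuple


def synthesize_with_paraphrase(
    text: str,
    synthesize: Callable[[str], bytes],
    unnaturalness: Callable[[bytes], float],
    paraphrases: Dict[str, str],
    reference: float,
    max_rounds: int = 10,
) -> Tuple[str, bytes]:
    """Output voice data once its index value is below `reference`; otherwise
    replace a stored first notation with its second notation and retry.

    `max_rounds` is an added safeguard (an assumption) against endless loops.
    """
    voice = synthesize(text)
    for _ in range(max_rounds):
        if unnaturalness(voice) < reference:
            break  # natural enough: output the created voice data
        replaced = text
        for first, second in paraphrases.items():
            if first in text:
                replaced = text.replace(first, second)
                break
        if replaced == text:
            break  # nothing left to paraphrase
        text = replaced
        voice = synthesize(text)
    return text, voice


# Stand-ins for illustration only: "voice data" is the encoded text and the
# index value is its length, so shorter text counts as more natural.
result_text, result_voice = synthesize_with_paraphrase(
    "Call me in the event that it rains",
    synthesize=lambda t: t.encode("utf-8"),
    unnaturalness=lambda v: float(len(v)),
    paraphrases={"in the event that": "if"},
    reference=20.0,
)
```

In a real system the index value would be derived from the connected phoneme data rather than from text length; the stand-ins exist only to make the loop observable.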
