Speaker-adaptive synthesized voice

Invention Grant

US08744853B2 Speaker-adaptive synthesized voice (in force)

Patent Title: Speaker-adaptive synthesized voice
Application No.: US13319856

Application Date: 2010-03-16
Publication No.: US08744853B2

Publication Date: 2014-06-03
Inventor: Masafumi Nishimura, Ryuki Tachibana
Applicant: Masafumi Nishimura, Ryuki Tachibana
Applicant Address: US NY Armonk
Assignee: International Business Machines Corporation
Current Assignee: International Business Machines Corporation
Current Assignee Address: US NY Armonk
Agency: Fleit Gibbons Gutman Bongini & Bianco PL
Agent: Jon A. Gibbons
Priority: JP2009-129366 2009-05-28
International Application: PCT/JP2010/054413 WO 2010-03-16
International Publication: WO2010/137385 WO 2010-12-02
Main IPC: G10L13/00
IPC: G10L13/00; G10L15/00; G10L15/06

Abstract:

The objective is to provide a technique for accurately reproducing the features of the fundamental frequency (F0) of a target speaker's voice from only a small amount of learning data. A learning apparatus learns the shift amounts from a reference source F0 pattern to a target F0 pattern of the target speaker's voice. The learning apparatus associates the source F0 pattern of a learning text with the target F0 pattern of the same learning text by matching their peaks and troughs. For each point on the target F0 pattern, the learning apparatus uses the result of the association to obtain the shift amounts in the time-axis and frequency-axis directions from the corresponding point on the source F0 pattern. It then learns a decision tree that takes linguistic information obtained by parsing the learning text as the input feature vector and the calculated shift amounts as the output feature vector.
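The shift-amount step described in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: it assumes contours are lists of (time in ms, F0 in Hz) samples, pairs peaks and troughs naively in order of appearance, and measures the frequency shift as a plain difference in Hz. The function names `find_extrema` and `shift_amounts` and the sample contours are hypothetical, and the decision-tree learning step is left out.

```python
from typing import List, Tuple

Point = Tuple[int, float]            # (time in ms, F0 in Hz), an assumed format
Extremum = Tuple[str, int, float]    # ("peak" or "trough", time in ms, F0 in Hz)


def find_extrema(contour: List[Point]) -> List[Extremum]:
    """Return the strict local peaks and troughs of an F0 contour."""
    extrema: List[Extremum] = []
    for i in range(1, len(contour) - 1):
        prev_f0 = contour[i - 1][1]
        t, f0 = contour[i]
        next_f0 = contour[i + 1][1]
        if f0 > prev_f0 and f0 > next_f0:
            extrema.append(("peak", t, f0))
        elif f0 < prev_f0 and f0 < next_f0:
            extrema.append(("trough", t, f0))
    return extrema


def shift_amounts(source: List[Point], target: List[Point]) -> List[Tuple[int, float]]:
    """Pair extrema in order and return (time shift, frequency shift) per pair.

    Assumption: both contours have the same sequence of peaks and troughs.
    A real system would need a more robust association than in-order pairing.
    """
    shifts: List[Tuple[int, float]] = []
    for (s_kind, s_t, s_f0), (t_kind, t_t, t_f0) in zip(
        find_extrema(source), find_extrema(target)
    ):
        if s_kind != t_kind:
            raise ValueError("peak/trough sequences do not line up")
        shifts.append((t_t - s_t, t_f0 - s_f0))
    return shifts


# Made-up contours for the same text: reference voice vs. target speaker.
source = [(0, 100.0), (100, 120.0), (200, 110.0), (300, 130.0), (400, 105.0)]
target = [(0, 150.0), (100, 160.0), (200, 190.0), (300, 170.0), (400, 200.0), (500, 160.0)]

shifts = shift_amounts(source, target)
print(shifts)
```

In a full pipeline, each pair of shifts would become the output vector of one training example, with the linguistic features of the corresponding text position as the input vector.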

Published/Granted documents

US20120059654A1 SPEAKER-ADAPTIVE SYNTHESIZED VOICE Publication date: 2012-03-08

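IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems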