Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks

Invention Grant

US09697820B2 Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks (in force)

Patent Title: Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
Application No.: US14961370

Application Date: 2015-12-07
Publication No.: US09697820B2

Publication Date: 2017-07-04
Inventor: Woojay Jeon
Applicant: Apple Inc.
Applicant Address: US CA Cupertino
Assignee: Apple Inc.
Current Assignee: Apple Inc.
Current Assignee Address: US CA Cupertino
Agency: Morrison & Foerster LLP
Main IPC: G10L13/08
IPC: G10L13/08 ; G10L13/07 ; G10L13/047

Abstract:

Systems and processes for performing unit-selection text-to-speech synthesis are provided. In one example process, a sequence of target units can represent a spoken pronunciation of text. A set of predicted acoustic model parameters of a second target unit can be determined using a set of acoustic features of a first candidate speech segment of a first target unit and a set of linguistic features of the second target unit. A likelihood score of the second candidate speech segment with respect to the first candidate speech segment can be determined using the set of predicted acoustic model parameters of the second target unit and a set of acoustic features of the second candidate speech segment of the second target unit. The second candidate speech segment can be selected for speech synthesis based on the determined likelihood score. Speech corresponding to the received text can be generated using the selected second candidate speech segment.
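The scoring step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the "predicted acoustic model parameters" are taken to be the means and variances of a diagonal Gaussian, the neural network is replaced by a toy single linear layer, and every name here (`predict_params`, `log_likelihood`, `select_candidate`, `WEIGHTS`) is hypothetical. It is not the patented implementation.

```python
import math

def predict_params(prev_acoustic, next_linguistic, weights):
    """Toy stand-in for the neural network (assumption): one linear layer that
    outputs a mean and a log-variance for each acoustic dimension."""
    inputs = list(prev_acoustic) + list(next_linguistic) + [1.0]  # 1.0 is a bias input
    outputs = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    dim = len(outputs) // 2
    means = outputs[:dim]
    variances = [math.exp(lv) for lv in outputs[dim:]]  # exp keeps variances positive
    return means, variances

def log_likelihood(features, means, variances):
    """Log density of `features` under a diagonal Gaussian (assumed model form)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (f - m) ** 2 / v)
        for f, m, v in zip(features, means, variances)
    )

def select_candidate(prev_acoustic, next_linguistic, candidates, weights):
    """Return (index, score) of the second-unit candidate with the highest score."""
    means, variances = predict_params(prev_acoustic, next_linguistic, weights)
    scores = [log_likelihood(c, means, variances) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores[best]

# Hypothetical data: 2 acoustic dimensions, 2 linguistic features.
# These weights copy the first candidate's acoustics into the predicted means
# and set every log-variance to 0 (unit variance).
WEIGHTS = [
    [1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
first_candidate_acoustic = [1.0, 2.0]   # features of the first unit's candidate segment
second_unit_linguistic = [1.0, 0.0]     # linguistic features of the second target unit
second_unit_candidates = [[1.1, 2.0], [3.0, 0.0]]

best_index, best_score = select_candidate(
    first_candidate_acoustic, second_unit_linguistic, second_unit_candidates, WEIGHTS
)
print(best_index, best_score)
```

In this sketch the candidate whose acoustic features lie closest to the predicted means gets the highest likelihood score and is selected. A real system would score candidate pairs across the whole sequence of target units rather than one boundary in isolation.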

Public/Granted literature

US20170092259A1 UNIT-SELECTION TEXT-TO-SPEECH SYNTHESIS USING CONCATENATION-SENSITIVE NEURAL NETWORKS Public/Granted day: 2017-03-30

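IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/08	. Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme-to-phoneme translation, prosody generation or stress or intonation determination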