Text-to-speech (TTS) processing with transfer of vocal characteristics

Invention Grant

US11410684B1 Text-to-speech (TTS) processing with transfer of vocal characteristics (Active)

Patent Title: Text-to-speech (TTS) processing with transfer of vocal characteristics
Application No.: US16430894

Application Date: 2019-06-04
Publication No.: US11410684B1

Publication Date: 2022-08-09
Inventor: Viacheslav Klimkov, Thomas Renaud Drugman, Alexander Galkin, Srikanth Ronanki
Applicant: Amazon Technologies, Inc.
Applicant Address: US WA Seattle
Assignee: Amazon Technologies, Inc.
Current Assignee: Amazon Technologies, Inc.
Current Assignee Address: US WA Seattle
Agency: Pierce Atwood LLP
Main IPC: G10L13/00
IPC: G10L13/00; G10L25/78; G10L13/027; G10L15/16; G10L15/187; G06F16/38; G06N3/08; G06N20/20; G06F17/18; G06N3/04; G10L13/04; G10L13/033; G10L13/07

Abstract:

Audio data from a first, source speaker is received and processed to determine linguistic units and vocal characteristics corresponding to those linguistic units. The linguistic units may either be determined from received text data or may be determined from the audio data using automatic speech recognition. A model is trained using training data from a second, target speaker. The trained model concatenates the linguistic units with the vocal characteristics to produce output speech that has the “voice” of the target speaker and the vocal characteristics of the source speaker.
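
The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The abstract does not say which vocal characteristics are used or how linguistic units are encoded, so the feature names (duration_ms, pitch_hz, energy), the one-hot unit encoding, and the function name concatenate_features are all assumptions made for illustration. The sketch only shows how each linguistic unit from the source speaker might be paired with that speaker's characteristics before being passed to a model trained on the target speaker.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class VocalCharacteristics:
    # Hypothetical per-unit features; the abstract does not enumerate them.
    duration_ms: float
    pitch_hz: float
    energy: float


def concatenate_features(
    units: Sequence[str],
    characteristics: Sequence[VocalCharacteristics],
    unit_vocab: Sequence[str],
) -> List[List[float]]:
    """Return one input row per linguistic unit.

    Each row is a one-hot encoding of the unit (assumed encoding) followed by
    the source speaker's vocal characteristics for that unit.
    """
    if len(units) != len(characteristics):
        raise ValueError("each linguistic unit needs exactly one set of characteristics")

    index = {unit: i for i, unit in enumerate(unit_vocab)}
    rows: List[List[float]] = []
    for unit, vc in zip(units, characteristics):
        if unit not in index:
            raise ValueError(f"unknown linguistic unit: {unit!r}")
        one_hot = [0.0] * len(unit_vocab)
        one_hot[index[unit]] = 1.0
        # Concatenate the unit encoding with the source speaker's characteristics.
        rows.append(one_hot + [vc.duration_ms, vc.pitch_hz, vc.energy])
    return rows


if __name__ == "__main__":
    vocab = ["h", "ay"]  # toy unit inventory, assumed for the example
    source_units = ["h", "ay"]
    source_characteristics = [
        VocalCharacteristics(duration_ms=80.0, pitch_hz=120.0, energy=0.5),
        VocalCharacteristics(duration_ms=150.0, pitch_hz=130.0, energy=0.7),
    ]
    for row in concatenate_features(source_units, source_characteristics, vocab):
        print(row)
```

In a full system, rows like these would be consumed by the model trained on the target speaker's data; that model is outside the scope of this sketch.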

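IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems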