Low-dimensional real-time concatenative speech synthesizer

Invention Grant

US10553199B2 Low-dimensional real-time concatenative speech synthesizer (In force)

Patent Title: Low-dimensional real-time concatenative speech synthesizer
Application No.: US15570889

Application Date: 2016-05-20
Publication No.: US10553199B2

Publication Date: 2020-02-04
Inventor: Frank Harold Guenther, Alfonso Nieto-Castanon
Applicant: Trustees of Boston University
Applicant Address: US MA Boston
Assignee: Trustees of Boston University
Current Assignee: Trustees of Boston University
Current Assignee Address: US MA Boston
Agency: BainwoodHuang
International Application: PCT/US2016/033525 WO 20160520
International Announcement: WO2016/196041 WO 20161208
Main IPC: G10L13/00
IPC: G10L13/00; G10L13/04; G10L13/06; G10L19/00; G10L13/08; G10L15/00; G06F3/0482; G06F3/0484; G06F3/048; G10L13/027

Abstract:

A method of providing real-time speech synthesis based on user input includes presenting a graphical user interface having a low-dimensional representation of a multi-dimensional phoneme space, a first dimension representing degree of vocal tract constriction and voicing, a second dimension representing location in a vocal tract. One example employs a disk-shaped layout. User input is received via the interface and translated into a sequence of phonemes that are rendered on an audio output device. Additionally, a synthesis method includes maintaining a library of prerecorded samples of diphones organized into diphone groups, continually receiving a time-stamped sequence of phonemes to be synthesized, and selecting a sequence of diphone groups with their time stamps. A best diphone within each group is identified and placed into a production buffer from which diphones are rendered according to their time stamps.
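
The synthesis steps named in the abstract (diphone groups, time-stamped phonemes, best-diphone selection, production buffer) can be illustrated with a minimal Python sketch. It is not the patented implementation. The library contents, the pitch fields, the rule that the "best" diphone is the one whose starting pitch is closest to the previous diphone's ending pitch, and the choice to stamp each diphone with its first phoneme's time are all assumptions made for illustration, since the abstract does not specify them. All names (`Diphone`, `LIBRARY`, `pick_best`, `ProductionBuffer`, `synthesize`) are hypothetical.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Diphone:
    group: str          # diphone group name, e.g. "h-e"
    take: int           # which prerecorded sample within the group
    start_pitch: float  # assumed feature used only for the join cost below
    end_pitch: float


# Hypothetical library: each diphone group holds several prerecorded takes.
LIBRARY = {
    "h-e": [Diphone("h-e", 0, 110.0, 120.0), Diphone("h-e", 1, 100.0, 105.0)],
    "e-l": [Diphone("e-l", 0, 140.0, 130.0), Diphone("e-l", 1, 106.0, 108.0)],
    "l-o": [Diphone("l-o", 0, 109.0, 100.0), Diphone("l-o", 1, 150.0, 140.0)],
}


def to_diphone_groups(timed_phonemes):
    """Pair consecutive phonemes into (timestamp, group name).

    Assumption: a diphone inherits the time stamp of its first phoneme.
    """
    return [
        (t0, f"{p0}-{p1}")
        for (t0, p0), (_, p1) in zip(timed_phonemes, timed_phonemes[1:])
    ]


def pick_best(group, previous):
    """Pick one take from a group (assumed criterion: smallest pitch jump)."""
    candidates = LIBRARY[group]
    if previous is None:
        return candidates[0]
    return min(candidates, key=lambda d: abs(d.start_pitch - previous.end_pitch))


class ProductionBuffer:
    """Holds selected diphones and releases them in time-stamp order."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker so Diphone objects are never compared

    def put(self, timestamp, diphone):
        heapq.heappush(self._heap, (timestamp, self._count, diphone))
        self._count += 1

    def pop_due(self, now):
        """Return every buffered diphone whose time stamp is <= now."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            timestamp, _, diphone = heapq.heappop(self._heap)
            due.append((timestamp, diphone))
        return due


def synthesize(timed_phonemes):
    """Select a best diphone per group and queue it for rendering."""
    buffer = ProductionBuffer()
    previous = None
    for timestamp, group in to_diphone_groups(timed_phonemes):
        previous = pick_best(group, previous)
        buffer.put(timestamp, previous)
    return buffer
```

For example, `synthesize([(0.0, "h"), (0.1, "e"), (0.2, "l"), (0.3, "o")])` queues three diphones, and an audio loop would call `pop_due(now)` on each tick to fetch the ones ready for playback. The sketch processes a complete list in one pass. A real-time version would instead feed phonemes in as they arrive, which the abstract describes as "continually receiving" the sequence.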

Published/Granted documents

US20180108342A1 LOW-DIMENSIONAL REAL-TIME CONCATENATIVE SPEECH SYNTHESIZER Publication date: 2018-04-19

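IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems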