Duration informed attention network (DURIAN) for audio-visual synthesis

Invention Grant

US11151979B2 Duration informed attention network (DURIAN) for audio-visual synthesis (Active)

Patent Title: Duration informed attention network (DURIAN) for audio-visual synthesis
Application No.: US16549068

Application Date: 2019-08-23
Publication No.: US11151979B2

Publication Date: 2021-10-19
Inventor: Heng Lu, Chengzhu Yu, Dong Yu
Applicant: TENCENT AMERICA LLC
Applicant Address: US CA Palo Alto
Assignee: TENCENT AMERICA LLC
Current Assignee: TENCENT AMERICA LLC
Current Assignee Address: US CA Palo Alto
Agency: Sughrue Mion, PLLC
Main IPC: G10L13/08
IPC: G10L13/08; G10L13/027; G10L13/02; G10L13/033; G10L13/10; G10L19/03; G06T13/40; G10L19/00; G10L13/00

Abstract:

A method and apparatus include receiving a text input that includes a sequence of text components. Respective temporal durations of the text components are determined using a duration model. A spectrogram frame is generated based on the duration model. An audio waveform is generated based on the spectrogram frame. Video information is generated based on the audio waveform. The audio waveform is provided as an output along with a corresponding video.
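
The stages named in the abstract (text components, durations, spectrogram frames, audio waveform, video information) can be sketched as a toy pipeline. This is a minimal illustration under loud assumptions, not the patented method: every function name, the vowel-based duration rule, the 4-bin "spectrogram", and the per-frame "mouth openness" value are hypothetical stand-ins for trained models that the abstract does not describe in detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AVOutput:
    waveform: List[float]      # synthetic audio samples
    video_frames: List[float]  # one toy "mouth openness" value per frame


def predict_durations(components: List[str]) -> List[int]:
    # Hypothetical stand-in for a trained duration model:
    # vowels last 3 frames, everything else lasts 2 frames.
    return [3 if c.lower() in "aeiou" else 2 for c in components]


def expand_by_duration(components: List[str], durations: List[int]) -> List[str]:
    # Repeat each text component for as many frames as its duration.
    frames: List[str] = []
    for comp, dur in zip(components, durations):
        frames.extend([comp] * dur)
    return frames


def frames_to_spectrogram(labels: List[str], n_bins: int = 4) -> List[List[float]]:
    # Placeholder acoustic model: one row of n_bins values per frame.
    return [[(ord(c) % 16) / 16.0] * n_bins for c in labels]


def spectrogram_to_waveform(spec: List[List[float]], hop: int = 2) -> List[float]:
    # Placeholder vocoder: emit `hop` samples per spectrogram frame.
    wav: List[float] = []
    for frame in spec:
        wav.extend([sum(frame) / len(frame)] * hop)
    return wav


def waveform_to_video(wav: List[float], hop: int = 2) -> List[float]:
    # Placeholder video model: one value per `hop` audio samples,
    # derived only from the waveform, as the abstract describes.
    return [max(abs(s) for s in wav[i:i + hop]) for i in range(0, len(wav), hop)]


def synthesize(text: str) -> AVOutput:
    components = list(text)                 # sequence of text components
    durations = predict_durations(components)
    labels = expand_by_duration(components, durations)
    spec = frames_to_spectrogram(labels)
    wav = spectrogram_to_waveform(spec)
    video = waveform_to_video(wav)
    return AVOutput(waveform=wav, video_frames=video)
```

Because the video values are computed from the waveform with the same hop size used to generate it, the audio and video outputs stay frame-aligned in this sketch; a real system would replace each placeholder with a learned model.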

Public/Granted literature

US20210056949A1 DURATION INFORMED ATTENTION NETWORK (DURIAN) FOR AUDIO-VISUAL SYNTHESIS Public/Granted day: 2021-02-25

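IPC classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L13/00	Speech synthesis; text-to-speech systems
G10L13/08	.Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination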