Invention Grant
US07844463B2 Method and system for aligning natural and synthetic video to speech synthesis
In Force
- Patent Title: Method and system for aligning natural and synthetic video to speech synthesis
- Application No.: US12193397; Application Date: 2008-08-18
- Publication No.: US07844463B2; Publication Date: 2010-11-30
- Inventor: Andrea Basso, Mark Charles Beutnagel, Joern Ostermann
- Applicant: Andrea Basso, Mark Charles Beutnagel, Joern Ostermann
- Applicant Address: New York, NY, US
- Assignee: AT&T Intellectual Property II, L.P.
- Current Assignee: AT&T Intellectual Property II, L.P.
- Current Assignee Address: New York, NY, US
- Main IPC: G10L13/00
- IPC: G10L13/00 ; G06T13/00

Abstract:
According to MPEG-4's TTS architecture, facial animation can be driven by two streams simultaneously—text and Facial Animation Parameters. A Text-To-Speech converter drives the mouth shapes of the face. An encoder sends Facial Animation Parameters to the face. The text input can include codes, or bookmarks, transmitted to the Text-to-Speech converter, which are placed between and inside words. The bookmarks carry an encoder time stamp. Due to the nature of text-to-speech conversion, the encoder time stamp does not relate to real-world time, and should be interpreted as a counter. The Facial Animation Parameter stream carries the same encoder time stamp found in the bookmark of the text. The system reads the bookmark and provides the encoder time stamp and a real-time time stamp. The facial animation system associates the correct facial animation parameter with the real-time time stamp using the encoder time stamp of the bookmark as a reference.
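The alignment described in the abstract amounts to a small bookkeeping step: read each bookmark's encoder time stamp (a counter, not wall-clock time), note the real-time stamp at which the Text-To-Speech converter actually reaches that point in the text, and use that mapping to schedule the Facial Animation Parameters carrying the same encoder time stamp. The sketch below illustrates this idea only; the bookmark syntax (\FAP{...}), the FacialAnimationParameter class, and the per-word timing list are assumed stand-ins, not the MPEG-4 encoding or the patented system itself.

```python
import re
from dataclasses import dataclass

# Assumed bookmark syntax for illustration: \FAP{<encoder_time_stamp>}
# placed between words of the text sent to the TTS converter.
BOOKMARK = re.compile(r"\\FAP\{(\d+)\}")

@dataclass
class FacialAnimationParameter:
    encoder_time_stamp: int   # counter carried in the FAP stream
    values: list              # the FAP values themselves (placeholder)

def align_faps_to_speech(text_with_bookmarks, fap_stream, tts_word_times):
    """Associate each FAP's encoder time stamp with the real-time stamp
    at which the TTS converter renders the bookmarked position."""
    # 1. Read the bookmarks and record the word index each one precedes.
    ets_to_word_index = {}
    word_index = 0
    for token in text_with_bookmarks.split():
        m = BOOKMARK.fullmatch(token)
        if m:
            ets_to_word_index[int(m.group(1))] = word_index
        else:
            word_index += 1

    # 2. tts_word_times is assumed to be the converter's per-word start
    #    times in seconds; resolve each encoder time stamp to one of them.
    ets_to_rts = {
        ets: tts_word_times[min(idx, len(tts_word_times) - 1)]
        for ets, idx in ets_to_word_index.items()
    }

    # 3. Pair each FAP in the stream with its resolved real-time stamp.
    return [
        (ets_to_rts[fap.encoder_time_stamp], fap)
        for fap in fap_stream
        if fap.encoder_time_stamp in ets_to_rts
    ]
```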
Public/Granted literature
- US20080312930A1 METHOD AND SYSTEM FOR ALIGNING NATURAL AND SYNTHETIC VIDEO TO SPEECH SYNTHESIS, Publication Date: 2008-12-18