Publication number: US20240257801A1
Publication date: 2024-08-01
Application number: US18393575
Application date: 2023-12-21
Applicant: SRI International
Inventor: Jeffrey LUBIN, Alexander ERDMANN, James BERGEN, Harry BRATT, Jihua HUANG, Sarah BAKST, Michael LOMNITZ, Zachary DANIELS, John CADIGAN, Ali CHAUDHRY, Zhiwei ZHU, Joshua CHATTIN, Girish ACHARYA
CPC classification number: G10L15/1807, G10L15/02, G10L15/063, G10L15/183, G10L15/25, G10L25/18
Abstract: A method, apparatus, and system for creating a script for rendering audio and/or video streams include identifying at least one prosodic speech feature in a received audio stream and/or a received language model, creating a respective prosodic speech symbol for each of the at least one identified prosodic speech features, converting the received audio stream and/or the received language model into a text stream, temporally inserting the created at least one prosodic speech symbol into the text stream, identifying in a received video stream at least one prosodic gesture of at least a portion of a body of a speaker of the received audio stream, creating at least one respective gesture symbol for each of the at least one identified prosodic gestures, and temporally inserting the created at least one gesture symbol into the text stream along with the at least one prosodic speech symbol to create a prosodic script.
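The temporal-insertion step described in the abstract can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `Token` type, the `build_prosodic_script` function, the `<...>` and `[...]` symbol notation, and the timestamps are all hypothetical, and feature detection and speech-to-text are assumed to have already produced timestamped words and symbols.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    time: float  # onset time in seconds (assumed to come from upstream detectors)
    text: str    # word, or rendered symbol
    kind: str    # "word", "prosody", or "gesture"


def build_prosodic_script(words, prosody, gestures):
    """Merge timestamped words and symbols into one time-ordered script.

    Each argument is a list of (time, label) pairs. The bracket styles used
    to render symbols are an illustrative assumption, not the patent's notation.
    """
    tokens = (
        [Token(t, w, "word") for t, w in words]
        + [Token(t, f"<{s}>", "prosody") for t, s in prosody]
        + [Token(t, f"[{s}]", "gesture") for t, s in gestures]
    )
    # Sort by time; on ties, place symbols before the word they coincide with.
    tie_order = {"prosody": 0, "gesture": 1, "word": 2}
    tokens.sort(key=lambda tok: (tok.time, tie_order[tok.kind]))
    return " ".join(tok.text for tok in tokens)


if __name__ == "__main__":
    words = [(0.0, "I"), (0.3, "really"), (0.7, "mean"), (1.0, "it")]
    prosody = [(0.3, "stress")]   # hypothetical prosodic speech symbol
    gestures = [(0.7, "nod")]     # hypothetical gesture symbol
    print(build_prosodic_script(words, prosody, gestures))
    # prints: I <stress> really [nod] mean it
```

Sorting on a `(time, kind)` key keeps the merge deterministic when a symbol and a word share the same onset, which is one simple way to realize "temporally inserting" symbols into the text stream.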