-
Publication number: US20240346735A1
Publication date: 2024-10-17
Application number: US18633750
Application date: 2024-04-12
Applicant: University of Rochester
Inventor: Luchuan Song, Chenliang Xu
CPC classification number: G06T13/40, G06T3/18, G06T7/20, G06T7/70, G06T17/20, G06V10/44, G06V40/174, H04N21/816
Abstract: Features described herein pertain to generative machine learning, and more particularly, to machine learning techniques for generating virtual characters. A video that depicts a first subject and includes an audio component that corresponds to speech spoken by the first subject and an image that depicts a second subject are provided to and used by one or more machine learning models to generate a video that depicts the second subject. The second subject can blink and exhibit emotional characteristics and reactions that are responsive to the speech spoken by the first subject and/or a characteristic of the first subject such as a facial expression and/or head pose motion. The generated video can be displayed and/or stored where it can be later retrieved.
-
Publication number: US20240305944A1
Publication date: 2024-09-12
Application number: US18599398
Application date: 2024-03-08
Applicant: University of Rochester
Inventor: Chenliang Xu, Chao Huang, Yapeng Tian, FNU Anurag Kumar
CPC classification number: H04S7/302, G06T7/337, G06T7/74, G06T19/00, G06T2207/10016, G06T2207/20081, G06T2207/20084, H04S2400/11
Abstract: A localization system may include an image input that receives images from a video source and an audio input that receives, from the video source, audio synchronized with the images. The localization system may also include an audio feature disentanglement network that correlates distinct audio elements from the audio input with corresponding visual features from the image input. Additionally, the localization system may include a geometry-based feature aggregation module that estimates a geometric transformation between two or more images from the video source and aggregates the visual features. Various other devices, systems, and methods are also disclosed.
-