Patent search ap:("University of Rochester") AND inv:"Chenliang Xu"

1.

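Invention publication
SYSTEM AND METHOD FOR GENERATING VIDEOS DEPICTING VIRTUAL CHARACTERS Under examination - published

Publication (announcement) number: US20240346735A1

Publication (announcement) date: 2024-10-17

Application number: US18633750

Application date: 2024-04-12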

Applicant: University of Rochester

Inventor: Luchuan Song, Chenliang Xu

IPC: G06T13/40, G06T3/18, G06T7/20, G06T7/70, G06T17/20, G06V10/44, G06V40/16, H04N21/81

CPC classification number: G06T13/40, G06T3/18, G06T7/20, G06T7/70, G06T17/20, G06V10/44, G06V40/174, H04N21/816

Abstract: Features described herein pertain to generative machine learning, and more particularly, to machine learning techniques for generating virtual characters. A video that depicts a first subject and includes an audio component that corresponds to speech spoken by the first subject and an image that depicts a second subject are provided to and used by one or more machine learning models to generate a video that depicts the second subject. The second subject can blink and exhibit emotional characteristics and reactions that are responsive to the speech spoken by the first subject and/or a characteristic of the first subject such as a facial expression and/or head pose motion. The generated video can be displayed and/or stored where it can be later retrieved.

2.

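Invention publication
FIRST-PERSON AUDIO-VISUAL OBJECT LOCALIZATION SYSTEMS AND METHODS Under examination - published

Publication (announcement) number: US20240305944A1

Publication (announcement) date: 2024-09-12

Application number: US18599398

Application date: 2024-03-08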

Applicant: University of Rochester

Inventor: Chenliang Xu, Chao Huang, Yapeng Tian, FNU Anurag Kumar

IPC: H04S7/00, G06T7/33, G06T7/73, G06T19/00

CPC classification number: H04S7/302, G06T7/337, G06T7/74, G06T19/00, G06T2207/10016, G06T2207/20081, G06T2207/20084, H04S2400/11

Abstract: A localization system may include an image input that receives images from a video source and an audio input that receives, from the video source, audio synchronized with the images. The localization system may also include an audio feature disentanglement network that correlates distinct audio elements from the audio input with corresponding visual features from the image input. Additionally, the localization system may include a geometry-based feature aggregation module that estimates a geometric transformation between two or more images from the video source and aggregates the visual features. Various other devices, systems, and methods are also disclosed.
