Identifying representative frames in video content
Abstract:
One embodiment of the present invention sets forth a technique for selecting a frame of video content that is representative of a media title. The technique includes applying an embedding model to a plurality of faces included in a set of frames of the video content to generate a plurality of face embeddings. The technique also includes aggregating the plurality of face embeddings into a plurality of clusters representing a plurality of characters included in the media title. The technique further includes computing a plurality of prominence scores for the plurality of characters based on one or more attributes of the plurality of clusters, and selecting, from the set of frames, a frame of video content as representative of the media title based on one or more prominence scores for one or more characters included in the frame.
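The four steps named in the abstract (embed faces, cluster embeddings into characters, score character prominence, select a frame) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the claimed implementation: the embeddings are hand-written 2-D vectors standing in for the output of an embedding model, the clustering is a simple greedy cosine-similarity scheme, prominence is taken to be a cluster's share of all detected faces (one possible cluster attribute), and a frame's score is the sum of the prominence scores of the characters it contains. The function names `cluster_faces`, `prominence_scores`, and `select_frame` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cluster_faces(embeddings, threshold=0.8):
    """Greedy clustering (assumed stand-in for the clustering step).

    Each embedding joins the most similar existing cluster if the cosine
    similarity to that cluster's summed vector is >= threshold; otherwise
    it starts a new cluster. Returns one cluster label per embedding.
    """
    sums = []    # per-cluster sum of member embeddings (same direction as the mean)
    labels = []
    for emb in embeddings:
        best, best_sim = None, threshold
        for idx, centre in enumerate(sums):
            sim = cosine(emb, centre)
            if sim >= best_sim:
                best, best_sim = idx, sim
        if best is None:
            sums.append(list(emb))
            labels.append(len(sums) - 1)
        else:
            sums[best] = [s + x for s, x in zip(sums[best], emb)]
            labels.append(best)
    return labels


def prominence_scores(labels):
    """Assumed prominence: fraction of all detected faces in each cluster."""
    counts = defaultdict(int)
    for label in labels:
        counts[label] += 1
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def select_frame(faces):
    """faces: list of (frame_id, embedding) pairs, one per detected face.

    Returns (best_frame_id, frame_scores), where a frame's score is the sum
    of the prominence scores of the distinct characters appearing in it.
    """
    labels = cluster_faces([emb for _, emb in faces])
    scores = prominence_scores(labels)
    characters_in_frame = defaultdict(set)
    for (frame_id, _), label in zip(faces, labels):
        characters_in_frame[frame_id].add(label)
    frame_scores = {
        frame_id: sum(scores[c] for c in chars)
        for frame_id, chars in characters_in_frame.items()
    }
    best = max(frame_scores, key=frame_scores.get)
    return best, frame_scores


# Toy input: two characters, four frames; frame 1 contains both characters.
faces = [
    (0, [1.0, 0.0]),
    (1, [0.98, 0.05]),
    (1, [0.0, 1.0]),
    (2, [0.99, 0.02]),
    (3, [0.05, 0.97]),
]
best_frame, frame_scores = select_frame(faces)
print(best_frame, frame_scores)
```

In this sketch a frame that shows several prominent characters outranks a frame that shows only one. A different weighting (for example, face size or screen time as additional cluster attributes) would change the ranking without changing the overall structure.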