Joint heterogeneous language-vision embeddings for video tagging and search

Invention Grant

US11409791B2 Joint heterogeneous language-vision embeddings for video tagging and search (Active)

Patent Title: Joint heterogeneous language-vision embeddings for video tagging and search
Application No.: US15620232

Application Date: 2017-06-12
Publication No.: US11409791B2

Publication Date: 2022-08-09
Inventor: Atousa Torabi, Leonid Sigal
Applicant: Disney Enterprises, Inc.
Applicant Address: US CA Burbank
Assignee: Disney Enterprises, Inc.
Current Assignee: Disney Enterprises, Inc.
Current Assignee Address: US CA Burbank
Agency: Patterson + Sheridan, LLP
Main IPC: G06F16/638
IPC: G06F16/638; G06N3/08; G06N3/04; H04N21/8405; G06F16/783; G06V20/40

Abstract:

Systems, methods and articles of manufacture for modeling a joint language-visual space. A textual query to be evaluated relative to a video library is received from a requesting entity. The video library contains a plurality of instances of video content. One or more instances of video content from the video library that correspond to the textual query are determined, by analyzing the textual query using a data model that includes a soft-attention neural network module that is jointly trained with a language Long Short-term Memory (LSTM) neural network module and a video LSTM neural network module. At least an indication of the one or more instances of video content is returned to the requesting entity.
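
The retrieval flow in the abstract (embed a textual query, attend over video content, return the best-matching instances) can be illustrated with a minimal sketch. This is not the patented implementation: the fixed vectors below are hypothetical stand-ins for the outputs of the jointly trained language LSTM and video LSTM modules, and the function names `attend` and `search` are assumptions introduced only for illustration. The sketch shows soft attention as a softmax-weighted average of per-frame vectors, followed by cosine-similarity ranking in a shared space.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot(a, b) / (norm_a * norm_b)


def attend(query_vec, frame_vecs):
    """Soft attention: weight each frame by the softmax of its dot
    product with the query, then return the weighted average."""
    weights = softmax([dot(query_vec, f) for f in frame_vecs])
    dim = len(frame_vecs[0])
    pooled = [sum(w * f[i] for w, f in zip(weights, frame_vecs))
              for i in range(dim)]
    return pooled, weights


def search(query_vec, library, top_k=1):
    """Rank videos by cosine similarity between the query vector and
    each video's attention-pooled frame vector."""
    scored = []
    for video_id, frames in library.items():
        pooled, _ = attend(query_vec, frames)
        scored.append((cosine(query_vec, pooled), video_id))
    scored.sort(reverse=True)
    return [video_id for _, video_id in scored[:top_k]]


# Hypothetical 2-D embeddings; a trained model would produce these.
library = {
    "a": [[1.0, 0.0], [0.9, 0.1]],
    "b": [[0.0, 1.0], [0.1, 0.9]],
}
print(search([1.0, 0.0], library))
```

In a trained system, the query vector and frame vectors would come from the learned modules rather than being written by hand; only the attention pooling and ranking steps are shown here.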

Public/Granted literature

US20170357720A1 JOINT HETEROGENEOUS LANGUAGE-VISION EMBEDDINGS FOR VIDEO TAGGING AND SEARCH Public/Granted day: 2017-12-14

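IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: see G06N)
G06F16/00	Information retrieval; database structures; file system structures
G06F16/60	• Audio data
G06F16/63	•• Querying
G06F16/638	••• Presentation of query results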