Intelligent image captioning

Invention Grant

US11593612B2 Intelligent image captioning (in force)

Patent Title: Intelligent image captioning
Application No.: US16544772

Application Date: 2019-08-19
Publication No.: US11593612B2

Publication Date: 2023-02-28
Inventor: Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang
Applicant: BAIDU USA LLC
Applicant Address: US CA Sunnyvale
Assignee: BAIDU USA LLC
Current Assignee: BAIDU USA LLC
Current Assignee Address: US CA Sunnyvale
Agency: North Weber & Baugh LLP
Main IPC: G06N3/04
IPC: G06N3/04

Abstract:

Presented herein are embodiments of a multimodal Recurrent Neural Network (m-RNN) model for generating novel image captions. In embodiments, the model directly models the probability distribution of generating a word given a previous word or words and an image, and image captions are generated according to this distribution. In embodiments, the model comprises two sub-networks: a deep recurrent neural network for sentences and a deep convolutional network for images. In embodiments, these two sub-networks interact with each other in a multimodal layer to form the whole m-RNN model. The effectiveness of an embodiment of the model was validated on four benchmark datasets, and it outperformed the state-of-the-art methods. In embodiments, the m-RNN model may also be applied to retrieval tasks for retrieving images or captions.
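
The abstract's core idea, a next-word distribution conditioned on the previous words and an image, with the sentence and image sub-networks meeting in a multimodal layer, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the patent: the class name ToyCaptioner, the five-word vocabulary, the vector size, the random untrained weights, the tanh activations, and the greedy choice of the most probable word. The image feature is a fixed vector standing in for the output of a convolutional network.

```python
import math
import random

VOCAB = ["<start>", "<end>", "a", "dog", "runs"]  # toy vocabulary (assumption)
DIM = 4  # size of every toy vector (assumption)


def rand_matrix(rows, cols, rng):
    """Random weights; a real model would learn these from data."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def add(*vecs):
    return [sum(xs) for xs in zip(*vecs)]


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyCaptioner:
    """Hypothetical, untrained stand-in for a multimodal recurrent model."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.embed = rand_matrix(len(VOCAB), DIM, rng)  # word embeddings
        self.w_rec = rand_matrix(DIM, DIM, rng)         # recurrent weights
        self.w_word = rand_matrix(DIM, DIM, rng)        # word  -> multimodal
        self.w_state = rand_matrix(DIM, DIM, rng)       # state -> multimodal
        self.w_image = rand_matrix(DIM, DIM, rng)       # image -> multimodal
        self.w_out = rand_matrix(len(VOCAB), DIM, rng)  # multimodal -> scores

    def step(self, word_id, state, image_feat):
        """Return P(next word | previous words, image) and the new state."""
        word = self.embed[word_id]
        # Recurrent update: the state summarises the words seen so far.
        state = [math.tanh(x) for x in add(matvec(self.w_rec, state), word)]
        # Multimodal layer: fuse word, sentence state, and image feature.
        fused = [math.tanh(x) for x in add(
            matvec(self.w_word, word),
            matvec(self.w_state, state),
            matvec(self.w_image, image_feat),
        )]
        return softmax(matvec(self.w_out, fused)), state

    def caption(self, image_feat, max_len=10):
        """Greedy decoding (a simplification): take the most probable word."""
        state = [0.0] * DIM
        word_id = VOCAB.index("<start>")
        words = []
        for _ in range(max_len):
            probs, state = self.step(word_id, state, image_feat)
            word_id = max(range(len(VOCAB)), key=probs.__getitem__)
            if VOCAB[word_id] == "<end>":
                break
            words.append(VOCAB[word_id])
        return words
```

Calling ToyCaptioner().caption([0.5, -0.2, 0.1, 0.9]) returns a list of vocabulary words. Because the weights are random, the words carry no meaning; the sketch only shows how each step's distribution depends on both the sentence so far and the image feature.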

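IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	.Using neural network models
G06N3/04	..Architecture, e.g. interconnection topology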