-
Publication number: US11790644B2
Publication date: 2023-10-17
Application number: US17569725
Application date: 2022-01-06
Applicant: INTEL CORPORATION
Inventor: Yurong Chen , Jianguo Li , Zhou Su , Zhiqiang Shen
IPC: G06V10/00 , G06V10/82 , G06F40/169 , G06N3/08 , G06V20/40 , G06F18/214 , G06V30/19 , G06V30/194 , G06V20/70 , G06V20/10
CPC classification number: G06V10/82 , G06F18/2155 , G06F40/169 , G06N3/08 , G06V20/10 , G06V20/41 , G06V20/46 , G06V20/47 , G06V20/70 , G06V30/194 , G06V30/19173
Abstract: Techniques and apparatus for generating dense natural language descriptions for video content are described. In one embodiment, for example, an apparatus may include at least one memory and logic, at least a portion of the logic comprised in hardware coupled to the at least one memory, the logic to receive a source video comprising a plurality of frames, determine a plurality of regions for each of the plurality of frames, generate at least one region-sequence connecting the determined plurality of regions, apply a language model to the at least one region-sequence to generate description information comprising a description of at least a portion of content of the source video. Other embodiments are described and claimed.
-
Publication number: US20210109956A1
Publication date: 2021-04-15
Application number: US16650853
Application date: 2018-01-30
Applicant: INTEL CORPORATION
Inventor: Zhou Su , Jianguo Li , Yinpeng Dong , Yurong Chen
IPC: G06F16/332 , G06N3/04 , G06N5/02 , G06K9/32
Abstract: An example apparatus for visual question answering includes a receiver to receive an input image and a question. The apparatus also includes an encoder to encode the input image and the question into a query representation including visual attention features. The apparatus includes a knowledge spotter to retrieve a knowledge entry from a visual knowledge base pre-built on a set of question-answer pairs. The apparatus further includes a joint embedder to jointly embed the visual attention features and the knowledge entry to generate visual-knowledge features. The apparatus also further includes an answer generator to generate an answer based on the query representation and the visual-knowledge features.
-
Publication number: US11663249B2
Publication date: 2023-05-30
Application number: US16650853
Application date: 2018-01-30
Applicant: INTEL CORPORATION
Inventor: Zhou Su , Jianguo Li , Yinpeng Dong , Yurong Chen
IPC: G06F16/33 , G06F16/332 , G06N3/049 , G06N5/025 , G06N3/045
CPC classification number: G06F16/3329 , G06N3/045 , G06N3/049 , G06N5/025
Abstract: An example apparatus for visual question answering includes a receiver to receive an input image and a question. The apparatus also includes an encoder to encode the input image and the question into a query representation including visual attention features. The apparatus includes a knowledge spotter to retrieve a knowledge entry from a visual knowledge base pre-built on a set of question-answer pairs. The apparatus further includes a joint embedder to jointly embed the visual attention features and the knowledge entry to generate visual-knowledge features. The apparatus also further includes an answer generator to generate an answer based on the query representation and the visual-knowledge features.
-
Publication number: US11341368B2
Publication date: 2022-05-24
Application number: US16475079
Application date: 2017-04-07
Applicant: INTEL CORPORATION
Inventor: Anbang Yao , Shandong Wang , Wenhua Cheng , Dongqi Cai , Libin Wang , Lin Xu , Ping Hu , Yiwen Guo , Liu Yang , Yuqing Hou , Zhou Su , Yurong Chen
Abstract: Methods and systems for advanced and augmented training of deep neural networks (DNNs) using synthetic data and innovative generative networks. A method includes training a DNN using synthetic data, training a plurality of DNNs using context data, associating features of the DNNs trained using context data with features of the DNN trained with synthetic data, and generating an augmented DNN using the associated features.
-
Publication number: US11042782B2
Publication date: 2021-06-22
Application number: US16473898
Application date: 2017-03-20
Applicant: INTEL CORPORATION
Inventor: Zhou Su , Jianguo Li , Anbang Yao , Yurong Chen
Abstract: Techniques are provided for training and operation of a topic-guided image captioning system. A methodology implementing the techniques according to an embodiment includes generating image feature vectors, for an image to be captioned, based on application of a convolutional neural network (CNN) to the image. The method further includes generating the caption based on application of a recurrent neural network (RNN) to the image feature vectors. The RNN is configured as a long short-term memory (LSTM) RNN. The method further includes training the LSTM RNN with training images and associated training captions. The training is based on a combination of: feature vectors of the training image; feature vectors of the associated training caption; and a multimodal compact bilinear (MCB) pooling of the training caption feature vectors and an estimated topic of the training image. The estimated topic is generated by an application of the CNN to the training image.
-
Publication number: US11263489B2
Publication date: 2022-03-01
Application number: US16616533
Application date: 2017-06-29
Applicant: INTEL CORPORATION
Inventor: Yurong Chen , Jianguo Li , Zhou Su , Zhiqiang Shen
IPC: G06K9/00 , G06K9/62 , G06F40/169 , G06N3/08
Abstract: Techniques and apparatus for generating dense natural language descriptions for video content are described. In one embodiment, for example, an apparatus may include at least one memory and logic, at least a portion of the logic comprised in hardware coupled to the at least one memory, the logic to receive a source video comprising a plurality of frames, determine a plurality of regions for each of the plurality of frames, generate at least one region-sequence connecting the determined plurality of regions, apply a language model to the at least one region-sequence to generate description information comprising a description of at least a portion of content of the source video. Other embodiments are described and claimed.
-
Publication number: US11176632B2
Publication date: 2021-11-16
Application number: US16474540
Application date: 2017-04-07
Applicant: INTEL CORPORATION
Inventor: Anbang Yao , Dongqi Cai , Libin Wang , Lin Xu , Ping Hu , Shandong Wang , Wenhua Cheng , Yiwen Guo , Liu Yang , Yuqing Hou , Zhou Su
Abstract: Described herein are advanced artificial intelligence agents for modeling physical interactions. An apparatus to provide an active artificial intelligence (AI) agent includes at least one database to store physical interaction data and compute cluster coupled to the at least one database. The compute cluster automatically obtains physical interaction data from a data collection module without manual interaction, stores the physical interaction data in the at least one database, and automatically trains diverse sets of machine learning program units to simulate physical interactions with each individual program unit having a different model based on the applied physical interaction data.
-
Publication number: US20210201078A1
Publication date: 2021-07-01
Application number: US16475079
Application date: 2017-04-07
Applicant: INTEL CORPORATION
Inventor: Anbang Yao , Shandong Wang , Wenhua Cheng , Dongqi Cai , Libin Wang , Lin Xu , Ping Hu , Yiwen Guo , Liu Yang , Yuqing Hou , Zhou Su , Yurong Chen
Abstract: Methods and systems for advanced and augmented training of deep neural networks (DNNs) using synthetic data and innovative generative networks. A method includes training a DNN using synthetic data, training a plurality of DNNs using context data, associating features of the DNNs trained using context data with features of the DNN trained with synthetic data, and generating an augmented DNN using the associated features.
-