RULE ENABLED COMPOSITIONAL REASONING SYSTEM
    Invention Application

    Publication Number: WO2022060574A1

    Publication Date: 2022-03-24

    Application Number: PCT/US2021/048811

    Filing Date: 2021-09-02

    Abstract: A computer-implemented method is provided for compositional reasoning. The method includes producing (320) a set of primitive predictions from an input sequence. Each of the primitive predictions is of a single action of a tracked subject to be composed in a complex action comprising multiple single actions. The method further includes performing (330) contextual rule filtering of the primitive predictions to pass through filtered primitive predictions that interact with one or more entities of interest in the input sequence with respect to predefined contextual interaction criteria. The method includes performing (340), by a processor device, temporal rule matching by matching the filtered primitive predictions according to predefined temporal rules to identify complex event patterns in the sequence of primitive predictions.
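    The three-stage pipeline described above (primitive predictions, contextual rule filtering, temporal rule matching) can be illustrated with a toy example. Every function name, data format, and rule below is an assumption for illustration only, not the patented design:

```python
# Hypothetical sketch: primitive predictions -> contextual rule filtering
# -> temporal rule matching. Rule/record formats are illustrative assumptions.

def filter_by_context(predictions, entities_of_interest):
    """Pass through only predictions that interact with an entity of interest."""
    return [p for p in predictions if p["object"] in entities_of_interest]

def match_temporal_rule(predictions, rule):
    """Match a complex event if the rule's single actions occur in order."""
    idx = 0
    for p in sorted(predictions, key=lambda p: p["t"]):
        if p["action"] == rule[idx]:
            idx += 1
            if idx == len(rule):
                return True
    return False

primitives = [
    {"t": 0, "action": "approach", "object": "door"},
    {"t": 1, "action": "wave", "object": "friend"},
    {"t": 2, "action": "open", "object": "door"},
    {"t": 3, "action": "enter", "object": "door"},
]
filtered = filter_by_context(primitives, {"door"})
entry_rule = ["approach", "open", "enter"]  # a complex action pattern
print(match_temporal_rule(filtered, entry_rule))  # True
```

    Here the "wave" primitive is dropped by contextual filtering because it does not involve the entity of interest, and the remaining primitives match the ordered temporal rule.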

    ACTION RECOGNITION WITH HIGH-ORDER INTERACTION THROUGH SPATIAL-TEMPORAL OBJECT TRACKING

    Publication Number: WO2021050772A1

    Publication Date: 2021-03-18

    Application Number: PCT/US2020/050254

    Filing Date: 2020-09-10

    Abstract: Aspects of the present disclosure describe systems, methods, and structures that provide action recognition with high-order interaction with spatio-temporal object tracking. Image and object features are organized into tracks, which advantageously facilitates many possible learnable embeddings and intra/inter-track interaction(s). Operationally, the systems, methods, and structures according to the present disclosure employ an efficient high-order interaction model to learn embeddings and intra/inter object track interaction across space and time for action recognition (AR). An object detector is applied to each frame to locate visual objects. Those objects are linked through time to form object tracks. The object tracks are then organized and combined with the embeddings as the input to the model. The model is trained to generate representative embeddings and discriminative video features through high-order interaction, which is formulated as an efficient matrix operation without iterative processing delay.
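    The key claim, that inter-track interaction can be formulated as a matrix operation rather than an iterative pairwise loop, can be sketched as follows. The dimensions, pooling, and dot-product attention here are illustrative assumptions, not the patented model:

```python
import numpy as np

# Illustrative sketch: object tracks as embedding matrices, with inter-track
# interaction computed in one matrix operation (dot-product attention)
# instead of iterating over track pairs. All shapes are assumptions.

rng = np.random.default_rng(0)
T, N, D = 8, 5, 16                     # frames, object tracks, embedding dim
tracks = rng.normal(size=(N, T, D))    # one embedding per track per frame

track_emb = tracks.mean(axis=1)                     # (N, D) pooled over time
scores = track_emb @ track_emb.T / np.sqrt(D)       # (N, N) interaction scores
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
video_feature = (weights @ track_emb).mean(axis=0)  # (D,) video-level feature
print(video_feature.shape)  # (16,)
```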

    SELF-SUPERVISED SEQUENTIAL VARIATIONAL AUTOENCODER FOR DISENTANGLED DATA GENERATION

    Publication Number: WO2021096739A1

    Publication Date: 2021-05-20

    Application Number: PCT/US2020/058857

    Filing Date: 2020-11-04

    Abstract: A computer-implemented method is provided for disentangled data generation. The method includes accessing (410), by a variational autoencoder, a plurality of supervision signals. The method further includes accessing (420), by the variational autoencoder, a plurality of auxiliary tasks that utilize the supervision signals as reward signals to learn a disentangled representation. The method also includes training (430) the variational autoencoder to disentangle a sequential data input into a time-invariant factor and a time-varying factor using a self-supervised training approach which is based on outputs of the auxiliary tasks obtained by using the supervision signals to accomplish the plurality of auxiliary tasks.
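    The core factorization, one time-invariant factor shared across a sequence plus a time-varying factor per step, can be shown with a toy generative process. This is an assumption-level illustration of what "disentangling" targets, not the patented model or its training procedure:

```python
import numpy as np

# Toy illustration: a sequence generated from a time-invariant factor f
# (e.g. content/identity) plus a time-varying factor z_t per step (e.g.
# motion). Disentanglement aims to recover the two factors independently.

rng = np.random.default_rng(1)
T, D = 6, 4
f = rng.normal(size=D)          # time-invariant factor, shared by all steps
z = rng.normal(size=(T, D))     # time-varying factor, one per step
x = f[None, :] + z              # observed sequence, shape (T, D)

# A disentangled encoder would invert this factorization; here we use the
# known generative process to demonstrate the split.
f_hat = x.mean(axis=0) - z.mean(axis=0)
print(np.allclose(f_hat, f))  # True
```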

    SPATIO-TEMPORAL INTERACTIONS FOR VIDEO UNDERSTANDING

    Publication Number: WO2021050769A1

    Publication Date: 2021-03-18

    Application Number: PCT/US2020/050251

    Filing Date: 2020-09-10

    Abstract: Aspects of the present disclosure describe systems, methods, and structures including a network that recognizes action(s) from learned relationship(s) between various objects in video(s). Interaction(s) of objects over space and time are learned from a series of frames of the video. Object-like representations are learned directly from various 2D CNN layers by capturing the 2D CNN channels, resizing them to an appropriate dimension, and then providing them to a transformer network that learns higher-order relationship(s) between them. To effectively learn object-like representations, we 1) combine channels from the first and last convolutional layers in the 2D CNN, and 2) optionally cluster the channel (feature map) representations so that channels representing the same object type are grouped together.
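    The two numbered steps, combining channels from early and late layers and grouping channels by similarity, can be sketched with array operations. Layer sizes and the cosine-similarity grouping below are assumptions for illustration, not the disclosed network:

```python
import numpy as np

# Sketch under assumptions: 2D CNN channels (feature maps) are treated as
# object-like tokens, flattened to a common size, and compared by cosine
# similarity as a crude stand-in for clustering channels per object type.

rng = np.random.default_rng(2)
C1, C2, H, W = 4, 6, 8, 8
first_layer = rng.normal(size=(C1, H, W))   # channels from an early conv layer
last_layer = rng.normal(size=(C2, H, W))    # channels from the last conv layer

# 1) Combine channels from the first and last convolutional layers.
channels = np.concatenate([first_layer, last_layer], axis=0)  # (C1+C2, H, W)

# 2) Flatten each channel to a token and compute pairwise cosine similarity,
#    which a clustering step could use to group same-object channels.
tokens = channels.reshape(channels.shape[0], -1)              # (10, 64)
tokens /= np.linalg.norm(tokens, axis=1, keepdims=True)
similarity = tokens @ tokens.T                                # (10, 10)
print(tokens.shape, similarity.shape)
```

    The resulting tokens would then be fed to a transformer to learn higher-order relationships between them.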

    SPATIO-TEMPORAL INTERACTION NETWORK FOR LEARNING OBJECT INTERACTIONS

    Publication Number: WO2019013913A1

    Publication Date: 2019-01-17

    Application Number: PCT/US2018/036814

    Filing Date: 2018-06-11

    Abstract: Systems and methods for improving video understanding tasks based on higher-order object interactions (HOIs) between object features are provided. A plurality of frames of a video are obtained. A coarse-grained feature representation is generated by generating an image feature for each of a plurality of timesteps respectively corresponding to each of the frames and performing attention based on the image features. A fine-grained feature representation is generated by generating an object feature for each of the plurality of timesteps and generating the HOIs between the object features. The coarse-grained and the fine-grained feature representations are concatenated to generate a concatenated feature representation.
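    The two-branch structure, attended image features (coarse) plus object-interaction features (fine), concatenated at the end, can be sketched as follows. Dimensions, the attention form, and the interaction pooling are illustrative assumptions, not the claimed method:

```python
import numpy as np

# Sketch: a coarse-grained branch (attention over per-frame image features)
# and a fine-grained branch (pairwise object interactions), concatenated.

rng = np.random.default_rng(3)
T, N, D = 5, 3, 8                       # timesteps, objects/frame, feature dim
image_feats = rng.normal(size=(T, D))   # one image feature per timestep
object_feats = rng.normal(size=(T, N, D))

# Coarse-grained: attention over the per-frame image features.
logits = image_feats @ rng.normal(size=D)        # (T,) attention logits
attn = np.exp(logits) / np.exp(logits).sum()
coarse = attn @ image_feats                      # (D,)

# Fine-grained: pairwise (higher-order) object interactions, pooled.
inter = np.einsum("tnd,tmd->tnm", object_feats, object_feats)       # (T, N, N)
fine = (inter[..., None] * object_feats[:, None]).sum((0, 1, 2)) / (T * N * N)

video_feature = np.concatenate([coarse, fine])   # (2 * D,)
print(video_feature.shape)  # (16,)
```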

    SEMI-AUTOMATIC DATA COLLECTION AND ASSOCIATION FOR MULTI-CAMERA TRACKING

    Publication Number: WO2022250970A1

    Publication Date: 2022-12-01

    Application Number: PCT/US2022/028945

    Filing Date: 2022-05-12

    Abstract: A surveillance system is provided. The surveillance system includes a processor device (110) configured for (i) detecting and tracking persons locally for each camera input video stream using the common area anchor boxes and assigning each detected ones of the persons a local track id, (ii) associating a same person in overlapping camera views to a global track id and collecting associated track boxes as the same person moves in different camera views over time using a priority queue and the local track id and the global track id, (iii) performing track data collection to derive a spatial transformation through matched track box spatial features of a same person over time for scene coverage, and (iv) learning a multi-camera tracker given visual features from matched track boxes of distinct people across cameras based on the derived spatial transformation.
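    The local-to-global id association in steps (i) and (ii) can be illustrated with a minimal bookkeeping sketch. The data structures and the `associate` helper are hypothetical, not the patented system:

```python
# Illustrative sketch: per-camera local track ids are mapped to a global id,
# reusing an existing global id when the same person is matched in an
# overlapping camera view. Names and structures are assumptions.

from itertools import count

global_ids = count(1)
local_to_global = {}   # (camera, local_id) -> global id

def associate(camera, local_id, overlapping_match=None):
    """Assign a global id; reuse the match's id for overlapping views."""
    key = (camera, local_id)
    if key in local_to_global:
        return local_to_global[key]
    if overlapping_match is not None and overlapping_match in local_to_global:
        gid = local_to_global[overlapping_match]   # same person, other camera
    else:
        gid = next(global_ids)                     # newly seen person
    local_to_global[key] = gid
    return gid

a = associate("cam1", 7)                                  # new person
b = associate("cam2", 3, overlapping_match=("cam1", 7))   # same person
c = associate("cam3", 9)                                  # different person
print(a, b, c)  # 1 1 2
```

    Track boxes collected under one global id across cameras then provide the matched pairs used to derive the spatial transformation in step (iii).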

    MULTI-HOP TRANSFORMER FOR SPATIO-TEMPORAL REASONING AND LOCALIZATION

    Publication Number: WO2022066388A1

    Publication Date: 2022-03-31

    Application Number: PCT/US2021/048832

    Filing Date: 2021-09-02

    Abstract: A method for using a multi-hop reasoning framework to perform multi-step compositional long-term reasoning is presented. The method includes extracting (1001) feature maps and frame-level representations from a video stream by using a convolutional neural network (CNN), performing (1003) object representation learning and detection, linking (1005) objects through time via tracking to generate object tracks and image feature tracks, feeding (1007) the object tracks and the image feature tracks to a multi-hop transformer that hops over frames in the video stream while concurrently attending to one or more of the objects in the video stream until the multi-hop transformer arrives at a correct answer, and employing (1009) video representation learning and recognition from the objects and image context to locate a target object within the video stream.
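    The "hopping" idea, a query that repeatedly attends over frame features and refines itself across several passes rather than answering in one, can be sketched minimally. The update rule and all dimensions are assumptions, not the disclosed multi-hop transformer:

```python
import numpy as np

# Minimal sketch of multi-hop attention: a query vector attends over
# per-frame features and is refined on each hop. Illustrative only.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(4)
T, D, hops = 10, 8, 3
frames = rng.normal(size=(T, D))   # frame-level features (e.g. from a CNN)
query = rng.normal(size=D)         # question / target-object query

for _ in range(hops):              # hop over frames, updating the query
    attn = softmax(frames @ query / np.sqrt(D))   # (T,) attention weights
    query = query + attn @ frames                 # attend, then refine

print(query.shape)  # (8,)
```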

    KEYPOINT BASED POSE-TRACKING USING ENTAILMENT

    Publication Number: WO2021050773A1

    Publication Date: 2021-03-18

    Application Number: PCT/US2020/050255

    Filing Date: 2020-09-10

    Abstract: Aspects of the present disclosure describe systems, methods, and structures for an efficient multi-person pose-tracking method that advantageously achieves state-of-the-art performance on PoseTrack datasets by only using keypoint information in a tracking step, without optical flow or convolution routines. As a consequence, our method has fewer parameters and FLOPs and achieves faster FPS. Our method benefits from our parameter-free tracking method that outperforms commonly used bounding box propagation in top-down methods. Finally, we disclose tokenization and embedding of multi-person pose keypoint information in the transformer architecture that can be reused for other pose tasks such as pose-based action recognition.
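    A parameter-free, keypoint-only tracking step can be illustrated by greedy nearest-neighbor linking of poses across frames. This is an assumption-level stand-in for the disclosed entailment-based matcher, showing only that keypoints alone (no optical flow, no convolutions) suffice to link tracks:

```python
import numpy as np

# Keypoint-only tracking sketch: link poses across frames by greedy nearest
# mean keypoint distance. Illustrative, not the patented entailment model.

def pose_distance(p, q):
    """Mean Euclidean distance between corresponding keypoints."""
    return float(np.linalg.norm(p - q, axis=1).mean())

def link(prev_poses, curr_poses):
    """Greedily assign each current pose to the nearest previous track."""
    assignments = {}
    for j, q in enumerate(curr_poses):
        i = min(range(len(prev_poses)),
                key=lambda i: pose_distance(prev_poses[i], q))
        assignments[j] = i
    return assignments

rng = np.random.default_rng(5)
prev = [rng.normal(size=(17, 2)) for _ in range(2)]   # 17 COCO-style keypoints
curr = [prev[1] + 0.01, prev[0] + 0.01]               # slight motion, swapped
print(link(prev, curr))  # {0: 1, 1: 0}
```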
