Patent search ap:("SRI International") AND inv:"Ajay Divakaran" Page 2

11.

Invention application
INSTRUCTION-GUIDED VISUAL EMBEDDINGS AND FEEDBACK-BASED LEARNING IN LARGE VISION-LANGUAGE MODELS

Legal status: In force

Publication (announcement) number: US20250131027A1

Publication (announcement) date: 2025-04-24

Application number: US18924763

Application date: 2024-10-23

Applicant: SRI International

Inventor: Yangyi Chen, Karan Sikka, Michael A. Cogswell, Ajay Divakaran

IPC: G06F16/338, G06F16/33, G06F16/532

Abstract: In an example, a method for fine-tuning a Large Visual Language Model (LVLM) includes providing visual queries, each of the visual queries comprises at least an image and a textual query related to the image; processing, by the LVLM, the visual queries to extract visual embeddings from the visual queries, wherein the LVLM comprises a Visual Language Model (VLM), a first Large Language Model (LLM), and a linear projection layer interconnecting the VLM and the LLM; for visual queries: i) generating, by the LVLM, a response to the corresponding visual query based on the corresponding visual embedding; ii) evaluating, by a second LLM, the generated response to verify that the generated response satisfies predefined criteria; and iii) providing, by the second LLM, a feedback to the LVLM, in response to the evaluating the generated response; and fine-tuning the LVLM using aggregated feedback provided by the second LLM for the visual queries.

12.

Invention application
CAUSAL ANALYSIS WITH TIME SERIES DATA

Legal status: In force

Publication (announcement) number: US20250110989A1

Publication (announcement) date: 2025-04-03

Application number: US18895080

Application date: 2024-09-24

Applicant: SRI International

Inventor: Ajay Divakaran, Yi Yao, Julia Kruk, Jesse Hostetler, Jihua Huang

IPC: G06F16/901

Abstract: In general, various aspects of the techniques are directed to causal analysis using large scale time series data. A computing system may convert large scale time series data to first time period records and second time period records according to a multi-scale time resolution. The computing system may implement a hierarchical machine learning model to generate embeddings that capture temporal characteristics of features of the large scale time series data. The computing system may generate a graph data structure indicating cause and effect correlations between features of the large scale time series data based on temporal dynamics captured in the first and second time period records and/or the embeddings.
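
The conversion to records at two time resolutions can be illustrated with a minimal sketch. The abstract does not say how records are formed, so the fixed-width bucketing, the per-period averaging, the period lengths, and the name `to_period_records` are all assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import mean

def to_period_records(samples, period_seconds):
    """Group (timestamp_seconds, value) samples into fixed-width periods.

    Returns {period_start_seconds: mean value in that period}.
    Averaging is an assumed aggregation, not taken from the abstract.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        buckets[ts // period_seconds].append(value)
    return {idx * period_seconds: mean(vals) for idx, vals in sorted(buckets.items())}

# Hypothetical raw series: one sample every 30 seconds.
samples = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0), (120, 9.0)]

first_records = to_period_records(samples, 60)    # finer resolution
second_records = to_period_records(samples, 120)  # coarser resolution
```

Both record sets describe the same series, so a downstream model could compare temporal dynamics at the two scales.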

13.

Invention publication
SPATIAL-TEMPORAL ANOMALY AND EVENT DETECTION USING NIGHT VISION SENSORS

Legal status: Under examination (published)

Publication (announcement) number: US20240212350A1

Publication (announcement) date: 2024-06-27

Application number: US18331007

Application date: 2023-06-07

Applicant: SRI International

Inventor: Subhodev Das, Ajay Divakaran, Ali Chaudhry, Julia Kruk, Bo Dong

IPC: G06V20/40, G06V10/44, H04N23/21

CPC classification number: G06V20/44, G06V10/44, H04N23/21

Abstract: In general, the disclosure describes techniques for joint spatiotemporal Artificial Intelligence (AI) models that can encompass multiple space and time resolutions through self-supervised learning. In an example, a method includes for each of a plurality of multimodal data, generating, by a computing system, using a first machine learning model, a respective modality feature vector representative of content of the multimodal data, wherein each of the generated modality feature vectors has a different modality; processing, by the computing system, each of generated modality feature vectors with a second machine learning model comprising an encoder model to generate event data comprising a plurality of events and/or activities of interest; and analyzing, by the computing system, the event data to generate anomaly data indicative of detected anomalies in the multimodal data.

14.

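Invention application
ZERO-SHOT OBJECT DETECTION

Legal status: Under examination (published)

Publication (announcement) number: US20190325243A1

Publication (announcement) date: 2019-10-24

Application number: US16383447

Application date: 2019-04-12

Applicant: SRI International

Inventor: Karan Sikka, Ajay Divakaran, Ankan Bansal

IPC: G06K9/20, G06K9/62, G06K9/46, G06K9/72, G06K9/32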

Abstract: A method, apparatus and system for zero shot object detection includes, in a semantic embedding space having embedded object class labels, training the space by embedding extracted features of bounding boxes and object class labels of labeled bounding boxes of known object classes into the space, determining regions in an image having unknown object classes on which to perform object detection as proposed bounding boxes, extracting features of the proposed bounding boxes, projecting the extracted features of the proposed bounding boxes into the space, computing a similarity measure between the projected features of the proposed bounding boxes and the embedded, extracted features of the bounding boxes of the known object classes in the space, and predicting an object class label for proposed bounding boxes by determining a nearest embedded object class label to the projected features of the proposed bounding boxes in the space based on the similarity measures.
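
The final step of the abstract, choosing the nearest embedded class label for a projected bounding-box feature, can be sketched as follows. This is an illustrative sketch only: cosine similarity, the 3-dimensional vectors, the label names, and the function names `cosine` and `predict_label` are assumptions, not details from the patent.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def predict_label(projected_feature, class_embeddings):
    """Return (label, similarity) for the most similar embedded class label."""
    scored = ((label, cosine(projected_feature, emb))
              for label, emb in class_embeddings.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical class-label embeddings in a shared semantic space.
class_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "truck": [0.0, 0.2, 0.9],
    "zebra": [0.6, 0.7, 0.1],
}

# Hypothetical feature of a proposed bounding box, already projected into the space.
box_feature = [0.5, 0.8, 0.0]

label, score = predict_label(box_feature, class_embeddings)
```

Because the label set is just a dictionary of embeddings, a class never seen during training can be added by inserting its embedding, which is the property a zero-shot detector relies on.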

15.

Invention grant
Multi-modal modeling of temporal interaction sequences

Legal status: In force

Publication (announcement) number: US09734730B2

Publication (announcement) date: 2017-08-15

Application number: US13755775

Application date: 2013-01-31

Applicant: SRI International

Inventor: Ajay Divakaran, Behjat Siddiquie, Saad Khan, Jeffrey Lubin, Harpreet S. Sawhney

IPC: G09B19/00

CPC classification number: G09B19/00

Abstract: A multi-modal interaction modeling system can model a number of different aspects of a human interaction across one or more temporal interaction sequences. Some versions of the system can generate assessments of the nature or quality of the interaction or portions thereof, which can be used to, among other things, provide assistance to one or more of the participants in the interaction.

16.

Invention grant
Classification, search, and retrieval of complex video events

Legal status: In force

Publication (announcement) number: US09244924B2

Publication (announcement) date: 2016-01-26

Application number: US13737607

Application date: 2013-01-09

Applicant: SRI INTERNATIONAL

Inventor: Hui Cheng, Harpreet Singh Sawhney, Ajay Divakaran, Qian Yu, Jingen Liu, Amir Tamrakar, Saad Ali, Omar Javed

IPC: G06F17/30

CPC classification number: G06F17/30823, G06F17/30023, G06F17/30784, G06F17/30817

Abstract: A complex video event classification, search and retrieval system can generate a semantic representation of a video or of segments within the video, based on one or more complex events that are depicted in the video, without the need for manual tagging. The system can use the semantic representations to, among other things, provide enhanced video search and retrieval capabilities.

17.

Invention application
REAL-TIME OBJECT DETECTION, TRACKING AND OCCLUSION REASONING

Legal status: In force

Publication (announcement) number: US20140347475A1

Publication (announcement) date: 2014-11-27

Application number: US14286305

Application date: 2014-05-23

Applicant: SRI International

Inventor: Ajay Divakaran, Qian Yu, Amir Tamrakar, Harpreet Singh Sawhney, Jiejie Zhu, Omar Javed, Jingen Liu, Hui Cheng, Jayakrishnan Eledath

IPC: G06K9/00

CPC classification number: G06K9/00771

Abstract: A system for object detection and tracking includes technologies to, among other things, detect and track moving objects, such as pedestrians and/or vehicles, in a real-world environment, handle static and dynamic occlusions, and continue tracking moving objects across the fields of view of multiple different cameras.

18.

Invention grant
Method for pose invariant fingerprinting

Legal status: In force

Publication (announcement) number: US08860813B2

Publication (announcement) date: 2014-10-14

Application number: US13711220

Application date: 2012-12-11

Applicant: SRI International

Inventor: Sang-Hack Jung, Ajay Divakaran, Harpreet Singh Sawhney

IPC: H04N7/18

CPC classification number: G06K9/00771, G06K9/6206, G06K9/6211

Abstract: A computer-implemented method for matching objects is disclosed. At least two images where one of the at least two images has a first target object and a second of the at least two images has a second target object are received. At least one first patch from the first target object and at least one second patch from the second target object are extracted. A distance-based part encoding between each of the at least one first patch and the at least one second patch based upon a corresponding codebook of image parts including at least one of part type and pose is constructed. A viewpoint of one of the at least one first patch is warped to a viewpoint of the at least one second patch. A parts level similarity measure based on the view-invariant distance measure for each of the at least one first patch and the at least one second patch is applied to determine whether the first target object and the second target object are the same or different objects.

19.

Invention application
MULTI-MODAL MODELING OF TEMPORAL INTERACTION SEQUENCES

Legal status: In force

Publication (announcement) number: US20140212853A1

Publication (announcement) date: 2014-07-31

Application number: US13755775

Application date: 2013-01-31

Applicant: SRI International

Inventor: Ajay Divakaran, Behjat Siddiquie, Saad Khan, Jeffrey Lubin, Harpreet S. Sawhney

IPC: G09B19/00

CPC classification number: G09B19/00

Abstract: A multi-modal interaction modeling system can model a number of different aspects of a human interaction across one or more temporal interaction sequences. Some versions of the system can generate assessments of the nature or quality of the interaction or portions thereof, which can be used to, among other things, provide assistance to one or more of the participants in the interaction.

20.

Invention application
LARGE LANGUAGE MODEL AUGMENTATION WITH KNOWLEDGE LANGUAGE MODELS

Legal status: In force

Publication (announcement) number: US20250131212A1

Publication (announcement) date: 2025-04-24

Application number: US18919630

Application date: 2024-10-18

Applicant: SRI International

Inventor: Pengfei Yu, Yi Yao, Karan Sikka, Michael A. Cogswell, Ajay Divakaran

IPC: G06F40/56

Abstract: In an example, a method for generating responses by a Machine Learning (ML) system includes processing, by a first language model, a natural language instruction to generate an instruction representation based on a meaning of the natural language instruction; translating, by a translation module comprising an interface between the first language model and a second language model, the instruction representation into data indicating an intent of the natural language instruction, wherein the second language model is trained with domain specific knowledge; providing, by the translation module, the natural language instruction and the data indicating the intent of the natural language instruction to the second language model; and generating, by the second language model, a response based on the natural language instruction and the data indicating the intent of the natural language instruction.
