VIDEO CAPTURING DEVICE FOR PREDICTING SPECIAL DRIVING SITUATIONS
    Invention application (pending, published)

    Publication No.: WO2017177008A1

    Publication date: 2017-10-12

    Application No.: PCT/US2017/026365

    Filing date: 2017-04-06

    Abstract: A video device for predicting driving situations while a person drives a car is presented. The video device includes multi-modal sensors and knowledge data for extracting feature maps, a deep neural network trained with training data to recognize real-time traffic scenes (TSs) from a viewpoint of the car, and a user interface (UI) for displaying the real-time TSs. The real-time TSs are compared to predetermined TSs to predict the driving situations. The video device can be a video camera. The video camera can be mounted to a windshield of the car. Alternatively, the video camera can be incorporated into the dashboard or console area of the car. The video camera can calculate speed, velocity, type, and/or position information related to other cars within the real-time TS. The video camera can also include warning indicators, such as light emitting diodes (LEDs) that emit different colors for the different driving situations.

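    A minimal sketch of the comparison step the abstract describes: a real-time traffic-scene (TS) embedding is matched against a bank of predetermined TSs, and the best match selects a warning-indicator color. All names and values here (SCENE_PROTOTYPES, LED_COLORS, the 0.8 cutoff) are illustrative assumptions, not the patent's actual design.

```python
import numpy as np

# Hypothetical bank of predetermined TSs: situation label -> prototype embedding.
SCENE_PROTOTYPES = {
    "merge_conflict": np.random.rand(128),
    "sudden_braking": np.random.rand(128),
    "pedestrian_crossing": np.random.rand(128),
}

# Hypothetical mapping of driving situation to LED warning color.
LED_COLORS = {
    "merge_conflict": "yellow",
    "sudden_braking": "red",
    "pedestrian_crossing": "orange",
}

def predict_situation(ts_embedding: np.ndarray, threshold: float = 0.8):
    """Compare a real-time TS embedding to the predetermined TSs by cosine
    similarity; return (situation, led_color), or None if nothing matches."""
    best_label, best_sim = None, threshold
    for label, proto in SCENE_PROTOTYPES.items():
        sim = ts_embedding @ proto / (
            np.linalg.norm(ts_embedding) * np.linalg.norm(proto))
        if sim > best_sim:
            best_label, best_sim = label, sim
    if best_label is None:
        return None
    return best_label, LED_COLORS[best_label]
```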

    MULTI-MODAL DRIVING DANGER PREDICTION SYSTEM FOR AUTOMOBILES
    Invention application (pending, published)

    Publication No.: WO2017177005A1

    Publication date: 2017-10-12

    Application No.: PCT/US2017/026362

    Filing date: 2017-04-06

    Abstract: A computer-implemented method for training a deep neural network to recognize traffic scenes (TSs) from multi-modal sensors and knowledge data is presented. The computer-implemented method includes receiving data from the multi-modal sensors and the knowledge data, and extracting feature maps from them by using a traffic participant (TP) extractor to generate a first set of data, a static objects extractor to generate a second set of data, and an additional information extractor. The computer-implemented method further includes training the deep neural network, with training data, to recognize the TSs from the viewpoint of a vehicle.

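    A minimal PyTorch sketch of the three-branch feature extraction described above: a traffic-participant extractor producing the first set of data, a static-objects extractor producing the second set, and an additional-information extractor, with the resulting feature maps fused for scene classification. All layer sizes and module names are assumptions for illustration, not the patent's architecture.

```python
import torch
import torch.nn as nn

def _branch() -> nn.Module:
    # Shared shape for the two image branches; entirely hypothetical.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(4), nn.Flatten())  # -> (B, 16 * 4 * 4)

class TrafficSceneNet(nn.Module):
    def __init__(self, num_scenes: int = 10):
        super().__init__()
        self.tp_extractor = _branch()        # first set: traffic participants
        self.static_extractor = _branch()    # second set: static objects
        self.extra_extractor = nn.Sequential(  # additional info (e.g. knowledge data)
            nn.Linear(32, 64), nn.ReLU())
        self.classifier = nn.Linear(16 * 4 * 4 * 2 + 64, num_scenes)

    def forward(self, tp_img, static_img, extra_vec):
        # Concatenate the three extracted feature sets, then classify the TS.
        feats = torch.cat([
            self.tp_extractor(tp_img),
            self.static_extractor(static_img),
            self.extra_extractor(extra_vec),
        ], dim=1)
        return self.classifier(feats)  # logits over traffic scenes
```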

    SPATIO-TEMPORAL INTERACTION NETWORK FOR LEARNING OBJECT INTERACTIONS

    Publication No.: WO2019013913A1

    Publication date: 2019-01-17

    Application No.: PCT/US2018/036814

    Filing date: 2018-06-11

    Abstract: Systems and methods for improving video understanding tasks based on higher-order object interactions (HOIs) between object features are provided. A plurality of frames of a video are obtained. A coarse-grained feature representation is generated by generating an image feature for each of a plurality of timesteps respectively corresponding to each of the frames and performing attention based on the image features. A fine-grained feature representation is generated by generating an object feature for each of the plurality of timesteps and generating the HOIs between the object features. The coarse-grained and fine-grained feature representations are concatenated to generate a concatenated feature representation.
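
    A rough PyTorch sketch of the two-stream representation in the abstract: attention over per-timestep image features yields the coarse-grained representation, a simple pairwise interaction over per-timestep object features stands in for the higher-order object interactions (HOIs), and the two are concatenated. Dimensions and module choices are assumptions.

```python
import torch
import torch.nn as nn

class CoarseFineVideo(nn.Module):
    def __init__(self, dim: int = 256):
        super().__init__()
        self.attn_score = nn.Linear(dim, 1)      # attention weights over timesteps
        self.interact = nn.Linear(2 * dim, dim)  # pairwise object interaction

    def forward(self, img_feats, obj_feats):
        # img_feats: (T, dim) one image feature per frame/timestep
        # obj_feats: (T, N, dim) N object features per timestep
        w = torch.softmax(self.attn_score(img_feats), dim=0)  # (T, 1)
        coarse = (w * img_feats).sum(dim=0)                   # coarse-grained, (dim,)

        # All ordered object pairs per timestep, a stand-in for HOIs.
        T, N, d = obj_feats.shape
        a = obj_feats.unsqueeze(2).expand(T, N, N, d)
        b = obj_feats.unsqueeze(1).expand(T, N, N, d)
        pairs = torch.relu(self.interact(torch.cat([a, b], dim=-1)))
        fine = pairs.mean(dim=(0, 1, 2))                      # fine-grained, (dim,)

        return torch.cat([coarse, fine], dim=0)               # concatenated, (2 * dim,)
```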

    HIERARCHICAL WORD EMBEDDING SYSTEM
    Invention application

    Publication No.: WO2022216935A1

    Publication date: 2022-10-13

    Application No.: PCT/US2022/023840

    Filing date: 2022-04-07

    Abstract: Systems and methods for matching job descriptions with job applicants are provided. The method includes allocating each of one or more job applicants' curricula vitae (CVs) into sections 320; applying max-pooled word embedding 330 to each section of the applicants' CVs; using concatenated max-pooling and average-pooling 340 to compose the section embeddings into an applicant's CV representation; allocating each of one or more job position descriptions into specified sections 220; applying max-pooled word embedding 230 to each section of the job position descriptions; using concatenated max-pooling and average-pooling 240 to compose the section embeddings into a job representation; calculating a cosine similarity 250, 350 between each of the job representations and each of the CV representations to perform job-to-applicant matching; and presenting an ordered list of the one or more job applicants 360 or an ordered list of the one or more job position descriptions 260 to a user.
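
    A minimal NumPy sketch of that matching pipeline: max-pooled word embeddings per section, concatenated max- and average-pooling over sections, and cosine similarity for ranking. The embed_word lookup is a hypothetical stand-in for whatever word-embedding table the system actually uses.

```python
import numpy as np

RNG = np.random.default_rng(0)
VOCAB = {}  # hypothetical embedding table, filled lazily for the demo

def embed_word(word: str, dim: int = 64) -> np.ndarray:
    if word not in VOCAB:
        VOCAB[word] = RNG.standard_normal(dim)
    return VOCAB[word]

def section_embedding(words):
    """Max-pooled word embedding for one CV or job-description section."""
    return np.max([embed_word(w) for w in words], axis=0)

def document_embedding(sections):
    """Concatenate max- and average-pooling over the section embeddings."""
    sec = np.stack([section_embedding(s) for s in sections])
    return np.concatenate([sec.max(axis=0), sec.mean(axis=0)])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Usage: score one hypothetical CV against one hypothetical job posting.
cv = document_embedding([["python", "pytorch"], ["msc", "statistics"]])
job = document_embedding([["python", "developer"], ["bsc", "required"]])
print("job-to-applicant score:", cosine(cv, job))
```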

    EFFICIENT AND FINE-GRAINED VIDEO RETRIEVAL
    Invention application

    Publication No.: WO2020197853A1

    Publication date: 2020-10-01

    Application No.: PCT/US2020/023136

    Filing date: 2020-03-17

    Abstract: A computer-implemented method for efficient and fine-grained video retrieval is presented. The method includes temporally localizing a candidate clip (114) in a video stream (105) based on a natural language query (112); encoding a state, via a state processing module (120), into a joint visual and linguistic representation; feeding the joint visual and linguistic representation into a policy learning module (150), wherein the policy learning module employs a deep learning network to selectively extract features for select frames for video-text analysis and includes a fully connected linear layer (152) and a long short-term memory (LSTM) (154); outputting a value function (156) from the LSTM; generating an action policy based on the encoded state, wherein the action policy is a probabilistic distribution over a plurality of possible actions given the encoded state; and rewarding policy actions that return clips matching the natural language query.
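
    A minimal PyTorch sketch of the policy learning module (150) as the abstract outlines it: a fully connected linear layer feeding an LSTM, with heads producing the action policy (a distribution over possible actions given the encoded state) and the value function. The layer sizes and action count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PolicyModule(nn.Module):
    def __init__(self, state_dim: int = 512, hidden: int = 256, num_actions: int = 4):
        super().__init__()
        self.fc = nn.Linear(state_dim, hidden)   # fully connected linear layer (152)
        self.lstm = nn.LSTMCell(hidden, hidden)  # recurrent core (154)
        self.policy_head = nn.Linear(hidden, num_actions)
        self.value_head = nn.Linear(hidden, 1)   # value function (156)

    def forward(self, state, hc=None):
        # state: (B, state_dim) joint visual/linguistic encoding of clip + query
        x = torch.relu(self.fc(state))
        h, c = self.lstm(x, hc)
        policy = torch.softmax(self.policy_head(h), dim=-1)  # action distribution
        value = self.value_head(h)                           # value estimate
        return policy, value, (h, c)
```

    In an actor-critic setup, the returned policy would be sampled for the next frame-selection action and the value estimate used as the baseline for the query-matching reward.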
