-
Publication Number: US20240087151A1
Publication Date: 2024-03-14
Application Number: US17903712
Filing Date: 2022-09-06
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Vitor GUIZILINI
CPC classification number: G06T7/55 , B60W50/06 , B60W2420/42 , G06T2207/10016 , G06T2207/20081 , G06T2207/20084 , G06T2207/30252
Abstract: A method for controlling a vehicle in an environment includes generating, via a cross-attention model, a cross-attention cost volume based on a current image of the environment and a previous image of the environment in a sequence of images. The method also includes generating combined features by combining cost volume features of the cross-attention cost volume with single-frame features associated with the current image. The single-frame features may be generated via a single-frame encoding model. The method further includes generating a depth estimate of the current image based on the combined features. The method still further includes controlling an action of the vehicle based on the depth estimate.
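A minimal numpy sketch of the two steps the abstract names: a dot-product cross-attention cost volume between current- and previous-frame features, followed by concatenation with the single-frame features. The function names and the choice of concatenation as the combining operation are illustrative assumptions, not details from the patent.

```python
import numpy as np

def cross_attention_cost_volume(curr_feats, prev_feats):
    """Softmax-normalized dot-product similarity between every
    current-frame feature and every previous-frame feature.
    curr_feats, prev_feats: (N, D) arrays of per-pixel features."""
    scores = curr_feats @ prev_feats.T / np.sqrt(curr_feats.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=1, keepdims=True)

def combine_features(cost_volume_feats, single_frame_feats):
    """Combine cost-volume features with single-frame features
    by channel-wise concatenation (one plausible choice)."""
    return np.concatenate([cost_volume_feats, single_frame_feats], axis=1)

rng = np.random.default_rng(0)
curr = rng.standard_normal((4, 8))   # 4 pixels, 8-dim features
prev = rng.standard_normal((4, 8))
cv = cross_attention_cost_volume(curr, prev)   # (4, 4)
combined = combine_features(cv, curr)          # (4, 12)
```

In the full method, a decoder would regress the depth estimate from `combined`.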
-
Publication Number: US20230351767A1
Publication Date: 2023-11-02
Application Number: US17732421
Filing Date: 2022-04-28
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Arjun BHARGAVA , Chao FANG , Charles Christopher OCHOA , Kun-Hsin CHEN , Kuan-Hui LEE , Vitor GUIZILINI
CPC classification number: G06V20/58 , B60W60/001 , G06V20/49 , B60W2420/42 , B60W2420/52
Abstract: A method for generating a dense light detection and ranging (LiDAR) representation by a vision system includes receiving, at a sparse depth network, one or more sparse representations of an environment. The method also includes generating a depth estimate of the environment depicted in an image captured by an image capturing sensor. The method further includes generating, via the sparse depth network, one or more sparse depth estimates based on receiving the one or more sparse representations. The method also includes fusing the depth estimate and the one or more sparse depth estimates to generate a dense depth estimate. The method further includes generating the dense LiDAR representation based on the dense depth estimate and controlling an action of a vehicle based on identifying a three-dimensional object in the dense LiDAR representation.
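The fusion step can be sketched in a few lines of numpy: where a sparse measurement exists, blend it with the dense monocular estimate; elsewhere keep the monocular estimate. The blending weight and the validity convention (zero means "no measurement") are assumptions for illustration, not the patent's actual fusion scheme.

```python
import numpy as np

def fuse_depth(mono_depth, sparse_depth, sparse_weight=0.8):
    """Fuse a dense monocular depth map with a sparse depth map.
    Pixels with a sparse measurement (value > 0) are blended toward
    the measurement; all other pixels keep the monocular estimate."""
    valid = sparse_depth > 0
    fused = mono_depth.copy()
    fused[valid] = (sparse_weight * sparse_depth[valid]
                    + (1.0 - sparse_weight) * mono_depth[valid])
    return fused

mono = np.full((3, 3), 10.0)   # dense estimate: 10 m everywhere
sparse = np.zeros((3, 3))
sparse[1, 1] = 5.0             # a single LiDAR return at 5 m
dense = fuse_depth(mono, sparse)
```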
-
Publication Number: US20230177850A1
Publication Date: 2023-06-08
Application Number: US17543144
Filing Date: 2021-12-06
Applicant: TOYOTA RESEARCH INSTITUTE, INC. , THE BOARD OF TRUSTEES OF THE LELAND STANFORD JUNIOR UNIVERSITY
Inventor: Rares Andrei AMBRUS , Or LITANY , Vitor GUIZILINI , Leonidas GUIBAS , Adrien David GAIDON , Jie LI
CPC classification number: G06K9/00201 , G06K9/00791 , G06T7/20 , G06N3/08 , G06T2207/30241
Abstract: A method for 3D object detection is described. The method includes predicting, using a trained monocular depth network, an estimated monocular input depth map of a monocular image of a video stream and an estimated depth uncertainty map associated with the estimated monocular input depth map. The method also includes feeding back a depth uncertainty regression loss associated with the estimated monocular input depth map during training of the trained monocular depth network to update the estimated monocular input depth map. The method further includes detecting 3D objects from a 3D point cloud computed from the estimated monocular input depth map based on seed positions selected from the 3D point cloud and the estimated depth uncertainty map. The method also includes selecting 3D bounding boxes of the 3D objects detected from the 3D point cloud based on the seed positions and an aggregated depth uncertainty.
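One common form of an uncertainty-weighted depth regression loss is the heteroscedastic (Gaussian negative log-likelihood style) loss, where residuals are down-weighted in high-variance regions and large variances are penalized. This is a standard formulation offered as an assumption about the kind of loss meant, not the patent's exact expression.

```python
import numpy as np

def depth_uncertainty_loss(pred_depth, gt_depth, log_var):
    """Uncertainty-weighted regression loss: squared residuals are
    scaled by exp(-log_var), and log_var itself is penalized so the
    network cannot trivially predict infinite uncertainty."""
    residual = (pred_depth - gt_depth) ** 2
    return np.mean(np.exp(-log_var) * residual + log_var)

pred = np.array([1.0, 2.0])
gt = np.array([1.0, 3.0])
log_var = np.zeros(2)                         # unit variance everywhere
loss = depth_uncertainty_loss(pred, gt, log_var)  # reduces to MSE here
```

With zero log-variance the loss collapses to plain mean squared error, which makes the weighting behavior easy to sanity-check.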
-
Publication Number: US20210397855A1
Publication Date: 2021-12-23
Application Number: US16909907
Filing Date: 2020-06-23
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Vitor GUIZILINI , Adrien David GAIDON
Abstract: A method includes capturing a two-dimensional (2D) image of an environment adjacent to an ego vehicle, the environment including at least a dynamic object and a static object. The method also includes generating, via a depth estimation network, a depth map of the environment based on the 2D image, an accuracy of a depth estimate for the dynamic object in the depth map being greater than an accuracy of a depth estimate for the static object in the depth map. The method further includes generating a three-dimensional (3D) estimate of the environment based on the depth map and identifying a location of the dynamic object in the 3D estimate. The method additionally includes controlling an action of the ego vehicle based on the identified location.
-
Publication Number: US20210237764A1
Publication Date: 2021-08-05
Application Number: US17093360
Filing Date: 2020-11-09
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Jiexiong TANG , Rares A. AMBRUS , Vitor GUIZILINI , Sudeep PILLAI , Hanme KIM , Adrien David GAIDON
Abstract: A method for learning depth-aware keypoints and associated descriptors from monocular video for ego-motion estimation is described. The method includes training a keypoint network and a depth network to learn depth-aware keypoints and the associated descriptors. The training is based on a target image and a context image from successive images of the monocular video. The method also includes lifting 2D keypoints from the target image to learn 3D keypoints based on a learned depth map from the depth network. The method further includes estimating ego-motion from the target image to the context image based on the learned 3D keypoints.
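Lifting 2D keypoints to 3D with a learned depth map is standard pinhole-camera unprojection. A minimal sketch, assuming known intrinsics `K` and integer pixel coordinates (the real method operates on network outputs, not a flat depth map):

```python
import numpy as np

def lift_keypoints(keypoints_2d, depth_map, K):
    """Lift pixel keypoints (u, v) to 3D camera coordinates using the
    per-pixel depth and the pinhole intrinsics K."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    pts3d = []
    for u, v in keypoints_2d:
        z = depth_map[v, u]
        pts3d.append([(u - cx) * z / fx, (v - cy) * z / fy, z])
    return np.array(pts3d)

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
depth = np.full((480, 640), 2.0)   # flat 2 m depth for the sketch
pts = lift_keypoints([(320, 240), (420, 240)], depth, K)
```

The principal-point keypoint lands on the optical axis at (0, 0, 2); ego-motion can then be estimated from correspondences between such 3D keypoints across frames.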
-
Publication Number: US20240046655A1
Publication Date: 2024-02-08
Application Number: US18489687
Filing Date: 2023-10-18
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Jiexiong TANG , Rares Andrei AMBRUS , Vitor GUIZILINI , Adrien David GAIDON
CPC classification number: G06V20/56 , G05D1/0246 , G05D1/0221 , G06V10/751
Abstract: A method for keypoint matching performed by a semantically aware keypoint matching model includes generating a semantically segmented image from an image captured by a sensor of an agent, the semantically segmented image associating a respective semantic label with each pixel of a group of pixels associated with the image. The method also includes generating a set of augmented keypoint descriptors by augmenting, for each keypoint of a set of keypoints associated with the image, a keypoint descriptor with semantic information associated with one or more pixels, of the semantically segmented image, corresponding to the keypoint. The method further includes controlling an action of the agent in accordance with identifying a target image having one or more first augmented keypoint descriptors that match one or more second augmented keypoint descriptors of the set of augmented keypoint descriptors.
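One simple way to augment a descriptor with semantic information, offered as an illustrative assumption (the patent does not specify the encoding), is to append a one-hot class vector for the keypoint's pixel label:

```python
import numpy as np

def augment_descriptors(descriptors, labels, num_classes):
    """Append a one-hot semantic label to each keypoint descriptor.
    descriptors: (N, D) array; labels: (N,) integer class ids."""
    one_hot = np.eye(num_classes)[labels]
    return np.concatenate([descriptors, one_hot], axis=1)

desc = np.random.default_rng(1).standard_normal((3, 16))
labels = np.array([0, 2, 2])   # hypothetical ids, e.g. road, car, car
aug = augment_descriptors(desc, labels, num_classes=4)
```

Matching the augmented descriptors (e.g. by nearest neighbor) then favors correspondences that agree both in appearance and in semantic class.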
-
Publication Number: US20230342960A1
Publication Date: 2023-10-26
Application Number: US18344700
Filing Date: 2023-06-29
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Jiexiong TANG , Rares Andrei AMBRUS , Vitor GUIZILINI , Adrien David GAIDON
CPC classification number: G06T7/55 , G06T3/0093 , G06T3/4007 , G06T7/70 , G06T2207/30244
Abstract: A method for depth estimation performed by a depth estimation system associated with an agent includes determining a first depth of a first image and a second depth of a second image, the first image and the second image being captured by a sensor associated with the agent. The method also includes generating a first 3D image of the first image based on the first depth, a first pose associated with the sensor, and the second image. The method further includes generating a warped depth image based on transforming the first depth in accordance with the first pose. The method also includes updating the first pose based on a second pose associated with the warped depth image and the second depth, and updating the first 3D image based on the updated first pose. The method further includes controlling an action of the agent based on the updated first 3D image.
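The "warped depth image" step amounts to unprojecting the depth map to a point cloud, transforming it by the pose, and reading back the transformed per-pixel z values. A sketch under that interpretation (a full implementation would also reproject the points to new pixel locations):

```python
import numpy as np

def warp_depth(depth, K, pose):
    """Unproject a depth map to 3D with intrinsics K, transform the
    points by a 4x4 pose matrix, and return the transformed z values."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]
    x = (u - K[0, 2]) * depth / K[0, 0]
    y = (v - K[1, 2]) * depth / K[1, 1]
    pts = np.stack([x, y, depth, np.ones_like(depth)], axis=-1)  # (h, w, 4)
    warped = pts @ pose.T
    return warped[..., 2]

K = np.array([[100.0, 0.0, 2.0],
              [0.0, 100.0, 2.0],
              [0.0, 0.0, 1.0]])
pose = np.eye(4)
pose[2, 3] = 0.5                       # translate 0.5 m along z
wd = warp_depth(np.full((4, 4), 3.0), K, pose)
```

Comparing `wd` against the second image's depth gives the residual used to update the pose.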
-
Publication Number: US20220301212A1
Publication Date: 2022-09-22
Application Number: US17385358
Filing Date: 2021-07-26
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Vitor GUIZILINI , Rares Andrei AMBRUS , Adrien David GAIDON , Igor VASILJEVIC , Gregory SHAKHNAROVICH
Abstract: A method for self-supervised depth and ego-motion estimation is described. The method includes determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes generating a self-occlusion mask by manually segmenting self-occluded areas of images captured by the multi-camera rig of the ego vehicle. The method further includes multiplying the multi-camera photometric loss with the self-occlusion mask to form a self-occlusion masked photometric loss. The method also includes training a depth estimation model and an ego-motion estimation model according to the self-occlusion masked photometric loss. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the depth estimation model and the ego-motion estimation model.
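The masking step is a straightforward element-wise multiply followed by averaging over the valid pixels. A minimal sketch with an L1 photometric error (the patent does not fix the photometric error's exact form):

```python
import numpy as np

def masked_photometric_loss(target, warped, mask):
    """Per-pixel L1 photometric loss, zeroed where the self-occlusion
    mask is 0 and averaged over the remaining valid pixels."""
    per_pixel = np.abs(target - warped) * mask
    return per_pixel.sum() / np.maximum(mask.sum(), 1)

target = np.ones((2, 2))
warped = np.zeros((2, 2))
mask = np.array([[1.0, 1.0],
                 [0.0, 1.0]])   # one self-occluded pixel excluded
loss = masked_photometric_loss(target, warped, mask)
```

Excluding self-occluded pixels (e.g. the ego vehicle's own body in a camera's field of view) keeps irrecoverable regions from dominating the training signal.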
-
Publication Number: US20220301206A1
Publication Date: 2022-09-22
Application Number: US17377684
Filing Date: 2021-07-16
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Vitor GUIZILINI , Rares Andrei AMBRUS , Adrien David GAIDON , Igor VASILJEVIC , Gregory SHAKHNAROVICH
Abstract: A method for multi-camera monocular depth estimation using pose averaging is described. The method includes determining a multi-camera photometric loss associated with a multi-camera rig of an ego vehicle. The method also includes determining a multi-camera pose consistency constraint (PCC) loss associated with the multi-camera rig of the ego vehicle. The method further includes adjusting the multi-camera photometric loss according to the multi-camera PCC loss to form a multi-camera PCC photometric loss. The method also includes training a multi-camera depth estimation model and an ego-motion estimation model according to the multi-camera PCC photometric loss. The method further includes predicting a 360° point cloud of a scene surrounding the ego vehicle according to the trained multi-camera depth estimation model and the ego-motion estimation model.
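A pose consistency constraint can be sketched, for the translation component only, as penalizing each camera's estimated ego-motion for deviating from the rig-wide average (rotations would need a manifold-aware average; this simplification is an assumption of the sketch):

```python
import numpy as np

def pose_consistency_loss(translations):
    """Mean L2 deviation of each camera's ego-motion translation
    from the average translation across the rig."""
    mean_t = translations.mean(axis=0)
    return np.mean(np.linalg.norm(translations - mean_t, axis=1))

t = np.array([[1.0, 0.0, 0.0],    # per-camera translation estimates
              [1.2, 0.0, 0.0],
              [0.8, 0.0, 0.0]])
loss = pose_consistency_loss(t)
```

Since all cameras are rigidly attached to one vehicle, their ego-motion estimates should agree; the PCC loss turns that rigidity into a training signal.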
-
Publication Number: US20220026918A1
Publication Date: 2022-01-27
Application Number: US16937470
Filing Date: 2020-07-23
Applicant: TOYOTA RESEARCH INSTITUTE, INC.
Inventor: Vitor GUIZILINI , Jie LI , Rares A. AMBRUS , Sudeep PILLAI , Adrien GAIDON
Abstract: A method for controlling an ego agent includes capturing a two-dimensional (2D) image of an environment adjacent to the ego agent. The method also includes generating a semantically segmented image of the environment based on the 2D image. The method further includes generating a depth map of the environment based on the semantically segmented image. The method additionally includes generating a three-dimensional (3D) estimate of the environment based on the depth map. The method also includes identifying a location of an object in the 3D estimate and controlling an action of the ego agent based on the identified location.
-