-
Publication number: US20240169563A1
Publication date: 2024-05-23
Application number: US18509627
Filing date: 2023-11-15
Applicant: NVIDIA Corporation
Inventor: Bowen Wen, Jonathan Tremblay, Valts Blukis, Jan Kautz, Stanley Thomas Birchfield
CPC classification number: G06T7/248, G06T7/11, G06T7/70, G06T17/00, G06T19/006, G06T2207/10016, G06T2207/10024, G06T2207/10028, G06T2207/20072, G06T2207/20084, G06T2207/30252
Abstract: Apparatuses, systems, and techniques for constructing a data structure to store a shape of an object based at least in part on a portion of multiple images, and obtaining poses of the object by tracking a pose of the object through the multiple images based at least in part on the data structure. Optionally, the poses may be used to generate a plan for a path of a device to travel, generate a rendering of at least a portion of a Mixed Reality (“MR”) display to be viewed by a user, and/or the like.
-
Publication number: US11715251B2
Publication date: 2023-08-01
Application number: US17507620
Filing date: 2021-10-21
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay, Aayush Prakash, Mark A. Brophy, Varun Jampani, Cem Anil, Stanley Thomas Birchfield, Thang Hong To, David Jesus Acuna Marrero
IPC: G06T15/00, G06T15/04, G06T15/50, G06T15/20, G06F18/214, G06F18/211, G06V10/774, G06V10/82, G06N3/04, G06N3/084
CPC classification number: G06T15/00, G06F18/211, G06F18/2148, G06T15/04, G06T15/20, G06T15/50, G06V10/7747, G06V10/82, G06N3/04, G06N3/084, G06T2210/12, G06V2201/07
Abstract: Training deep neural networks requires a large amount of labeled training data. Conventionally, labeled training data is generated by gathering real images that are manually labelled, which is very time-consuming. Instead of manually labelling a training dataset, a domain randomization technique is used to generate training data that is automatically labeled. The generated training data may be used to train neural networks for object detection and segmentation (labelling) tasks. In an embodiment, the generated training data includes synthetic input images generated by rendering three-dimensional (3D) objects of interest in a 3D scene. In an embodiment, the generated training data includes synthetic input images generated by rendering 3D objects of interest on a 2D background image. The 3D objects of interest are objects that a neural network is trained to detect and/or label.
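Illustrative note (not part of the patent record): a minimal sketch of why labels come for free when the scene is generated rather than photographed. Each object is given a random size and position on a 2D background, and the sampled placement is itself the bounding-box annotation. Rendering is stubbed out entirely; `Box`, `sample_scene`, the object names, and all parameter ranges are hypothetical and not taken from the patent.

```python
import random
from dataclasses import dataclass


@dataclass
class Box:
    label: str
    x: int
    y: int
    w: int
    h: int


def sample_scene(labels, width, height, rng, min_size=16, max_size=64):
    """Randomly place one box per object label; the placement doubles as the annotation."""
    boxes = []
    for label in labels:
        # Randomize the apparent size of the object (hypothetical range).
        w = rng.randint(min_size, max_size)
        h = rng.randint(min_size, max_size)
        # Randomize the position so the whole box stays inside the background.
        x = rng.randint(0, width - w)
        y = rng.randint(0, height - h)
        boxes.append(Box(label, x, y, w, h))
    return boxes


rng = random.Random(0)  # fixed seed so the sampled scenes are reproducible
dataset = [sample_scene(["car", "cone"], 640, 480, rng) for _ in range(3)]
```

A real pipeline would also randomize lighting, textures, and distractor objects and would render actual pixels; this sketch covers only the placement-equals-label idea.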
-
Publication number: US20210390653A1
Publication date: 2021-12-16
Application number: US17458221
Filing date: 2021-08-26
Applicant: Nvidia Corporation
Inventor: Jonathan Tremblay, Stan Birchfield, Stephen Tyree, Thang To, Jan Kautz, Artem Molchanov
Abstract: Various embodiments enable a robot, or other autonomous or semi-autonomous device or system, to receive data involving the performance of a task in the physical world. The data can be provided as input to a perception network to infer a set of percepts about the task, which can correspond to relationships between objects observed during the performance. The percepts can be provided as input to a plan generation network, which can infer a set of actions as part of a plan. Each action can correspond to one of the observed relationships. The plan can be reviewed and any corrections made, either manually or through another demonstration of the task. Once the plan is verified as correct, the plan (and any related data) can be provided as input to an execution network that can infer instructions to cause the robot, and/or another robot, to perform the task.
-
Publication number: US11074717B2
Publication date: 2021-07-27
Application number: US16405662
Filing date: 2019-05-07
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay, Thang Hong To, Stanley Thomas Birchfield
Abstract: An object detection neural network receives an input image including an object and generates belief maps for vertices of a bounding volume that encloses the object. The belief maps are used, along with three-dimensional (3D) coordinates defining the bounding volume, to compute the pose of the object in 3D space during post-processing. When multiple objects are present in the image, the object detection neural network may also generate vector fields for the vertices. A vector field comprises vectors pointing from the vertex to a centroid of the object enclosed by the bounding volume defined by the vertex. The object detection neural network may be trained using images of computer-generated objects rendered in 3D scenes (e.g., photorealistic synthetic data). Automatically labelled training datasets may be easily constructed using the photorealistic synthetic data. The object detection neural network may be trained for object detection using only the photorealistic synthetic data.
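Illustrative note (not part of the patent record): a minimal sketch of the post-processing idea described above, under stated assumptions. `find_peak` takes the strongest response in one belief map as the 2D location of a bounding-volume vertex, and `assign_to_centroid` uses the vector predicted at that vertex to decide which object centroid it belongs to when several objects are present. Both function names, the threshold, and the angle-based matching rule are hypothetical; the abstract does not specify them.

```python
import math


def find_peak(belief_map, threshold=0.1):
    """Return (x, y) of the strongest response in a 2D belief map, or None if too weak."""
    best_value, best_xy = threshold, None
    for y, row in enumerate(belief_map):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_xy = value, (x, y)
    return best_xy


def assign_to_centroid(vertex_xy, direction, centroids):
    """Return the index of the centroid whose bearing best matches the predicted vector."""
    def angle_error(centroid):
        dx = centroid[0] - vertex_xy[0]
        dy = centroid[1] - vertex_xy[1]
        norm = math.hypot(dx, dy) * math.hypot(direction[0], direction[1])
        if norm == 0:
            return math.pi  # degenerate case: treat as the worst possible match
        cos_angle = (dx * direction[0] + dy * direction[1]) / norm
        return math.acos(max(-1.0, min(1.0, cos_angle)))

    return min(range(len(centroids)), key=lambda i: angle_error(centroids[i]))


belief = [[0.0, 0.1, 0.0],
          [0.2, 0.9, 0.3],
          [0.0, 0.1, 0.0]]
vertex = find_peak(belief)
owner = assign_to_centroid(vertex, (1.0, 0.0), [(10, 1), (1, 10)])
```

Once each vertex is matched to an object, the 2D vertex locations and the known 3D coordinates of the bounding volume could be passed to a pose solver; a perspective-n-point solver is one common choice, though the abstract does not name one.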
-
Publication number: US20200311855A1
Publication date: 2020-10-01
Application number: US16902097
Filing date: 2020-06-15
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay, Stephen Walter Tyree, Stanley Thomas Birchfield
Abstract: Pose estimation generally refers to a computer vision technique that determines the pose of some object, usually with respect to a particular camera. Pose estimation has many applications, but is particularly useful in the context of robotic manipulation systems. To date, robotic manipulation systems have required a camera to be installed on the robot itself (i.e., a camera-in-hand) for capturing images of the object and/or a camera external to the robot for capturing images of the object. Unfortunately, the camera-in-hand has a limited field of view for capturing objects, whereas the external camera, which may have a greater field of view, requires costly calibration each time the camera is even slightly moved. Similar issues apply when estimating the pose of any object with respect to another object (which may or may not be moving). The present disclosure avoids these issues and provides object-to-object pose estimation from a single image.
-
Publication number: US20200252600A1
Publication date: 2020-08-06
Application number: US16780738
Filing date: 2020-02-03
Applicant: NVIDIA Corporation
Inventor: Hung-Yu Tseng, Shalini De Mello, Jonathan Tremblay, Sifei Liu, Jan Kautz, Stanley Thomas Birchfield
IPC: H04N13/282, H04N13/268, G06N3/08, G06K9/62
Abstract: When an image is projected from 3D, the viewpoint of objects in the image, relative to the camera, must be determined. Since the image itself will not have sufficient information to determine the viewpoint of the various objects in the image, techniques to estimate the viewpoint must be employed. To date, neural networks have been used to infer such viewpoint estimates on an object category basis, but must first be trained with numerous examples that have been manually created. The present disclosure provides a neural network that is trained to learn, from just a few example images, a unique viewpoint estimation network capable of inferring viewpoint estimations for a new object category.
-
Publication number: US11941899B2
Publication date: 2024-03-26
Application number: US17331451
Filing date: 2021-05-26
Applicant: NVIDIA Corporation
Inventor: Jonathan Tremblay, Fabio Tozeto Ramos, Yuke Zhu, Anima Anandkumar, Guanya Shi
IPC: B25J9/16, B25J13/08, B25J19/02, G05B13/02, G06F18/214, G06K9/00, G06N3/04, G06N3/045, G06T7/73, G06V10/75, G06V20/64
CPC classification number: G06V20/653, G06F18/2148, G06N3/045, G06V10/751
Abstract: Apparatuses, systems, and techniques generate poses of an object based on image data of the object obtained from a first viewpoint of the object and a second viewpoint of the object. The poses can be evaluated to determine a portion of the image data usable by an estimator to generate a pose of the object.
-
Publication number: US20240095077A1
Publication date: 2024-03-21
Application number: US18122594
Filing date: 2023-03-16
Applicant: NVIDIA Corporation
Inventor: Ishika Singh, Arsalan Mousavian, Ankit Goyal, Danfei Xu, Jonathan Tremblay, Dieter Fox, Animesh Garg, Valts Blukis
CPC classification number: G06F9/5027, G06N20/00
Abstract: Apparatuses, systems, and techniques to generate a prompt for one or more machine learning processes. In at least one embodiment, the machine learning process(es) generate(s) a plan to perform a task (identified in the prompt) that is to be performed by an agent (real world or virtual).
-
Publication number: US11417063B2
Publication date: 2022-08-16
Application number: US17181946
Filing date: 2021-02-22
Applicant: NVIDIA Corporation
Inventor: Yunzhi Lin, Jonathan Tremblay, Stephen Walter Tyree, Stanley Thomas Birchfield
Abstract: One or more images (e.g., images taken from one or more cameras) may be received, where each of the one or more images may depict a two-dimensional (2D) view of a three-dimensional (3D) scene. Additionally, the one or more images may be utilized to determine a three-dimensional (3D) representation of a scene. This representation may help an entity navigate an environment represented by the 3D scene.
-
Publication number: US20220068024A1
Publication date: 2022-03-03
Application number: US17181946
Filing date: 2021-02-22
Applicant: NVIDIA Corporation
Inventor: Yunzhi Lin, Jonathan Tremblay, Stephen Walter Tyree, Stanley Thomas Birchfield
Abstract: One or more images (e.g., images taken from one or more cameras) may be received, where each of the one or more images may depict a two-dimensional (2D) view of a three-dimensional (3D) scene. Additionally, the one or more images may be utilized to determine a three-dimensional (3D) representation of a scene. This representation may help an entity navigate an environment represented by the 3D scene.