Neural network object pose determination
Abstract:
A camera is positioned to obtain an image of an object. The image is input to a neural network that outputs object parameters and a three-dimensional (3D) bounding box for the object relative to a pixel coordinate system. A center of a bottom face of the 3D bounding box is then determined in pixel coordinates. The bottom face of the 3D bounding box is located in a ground plane in the image. Based on calibration parameters for the camera that transform pixel coordinates into real-world coordinates, a) a distance from the center of the bottom face of the 3D bounding box to the camera relative to a real-world coordinate system and b) an angle between a line extending from the camera to the center of the bottom face of the 3D bounding box and an optical axis of the camera are determined. The calibration parameters include a camera height relative to the ground plane, a camera focal distance, and a camera tilt relative to the ground plane. A six degree-of-freedom (6DoF) pose for the object is determined based on the object parameters, the distance, and the angle.
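The distance and angle determination described above can be illustrated with a minimal sketch. It is not the claimed method; it assumes an ideal pinhole camera with no lens distortion, a focal distance expressed in pixels, a principal point (cx, cy), a tilt measured as a downward pitch of the optical axis from horizontal, and a flat ground plane. The function name, parameter names, and the choice to report distance along the ground plane are hypothetical and were chosen for illustration only.

```python
import math


def ground_point_range_and_angle(u, v, f_px, cx, cy, cam_height, tilt_rad):
    """Estimate range and off-axis angle for a pixel lying on the ground plane.

    Assumptions (not taken from the source): ideal pinhole model, image
    v axis pointing down, tilt_rad is the downward pitch of the optical
    axis from horizontal, cam_height is measured above a flat ground plane.
    """
    # Normalized ray direction in the camera frame (x right, y down, z forward).
    x = (u - cx) / f_px
    y = (v - cy) / f_px

    # Rotate the ray by the camera tilt to get its downward and
    # horizontal-forward components in the real-world frame.
    down = y * math.cos(tilt_rad) + math.sin(tilt_rad)
    forward = math.cos(tilt_rad) - y * math.sin(tilt_rad)
    if down <= 0.0:
        raise ValueError("pixel ray does not intersect the ground plane")

    # Scale the ray so that it drops exactly cam_height to reach the ground.
    scale = cam_height / down
    lateral = scale * x
    ahead = scale * forward

    # Distance measured along the ground plane from the point below the camera.
    ground_distance = math.hypot(lateral, ahead)

    # Angle between the ray to the ground point and the optical axis.
    angle = math.atan2(math.hypot(x, y), 1.0)
    return ground_distance, angle


# Hypothetical calibration: 1000 px focal distance, 1280x720 image,
# camera 1.5 m above the ground, optical axis horizontal.
distance_m, angle_rad = ground_point_range_and_angle(
    u=640, v=860, f_px=1000.0, cx=640, cy=360, cam_height=1.5, tilt_rad=0.0
)
```

In this sketch the pixel passed in would be the center of the bottom face of the 3D bounding box. A straight-line camera-to-point distance, rather than the ground-plane distance, could be obtained by multiplying `scale` by the length of the normalized ray.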