Abstract:
Techniques and systems are provided for generating a background picture. The background picture can be used for coding one or more pictures. For example, a method of generating a background picture includes generating a long-term background model for one or more pixels of a background picture. The long-term background model includes a statistical model for detecting long-term motion of the one or more pixels in a sequence of pictures. The method further includes generating a short-term background model for the one or more pixels of the background picture. The short-term background model detects short-term motion of the one or more pixels between two or more pictures. The method further includes determining a value for the one or more pixels of the background picture using the long-term background model and the short-term background model.
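The combination of a long-term statistical model and a short-term frame-difference check described above can be sketched as follows. This is an illustrative toy implementation, not the patented method: the class name, the running Gaussian (per-pixel mean/variance) used as the long-term model, and all parameter values (`alpha`, `k`, `short_thresh`) are assumptions chosen for the sketch.

```python
import numpy as np

class BackgroundModel:
    """Toy two-model background estimator (all parameters hypothetical).

    Long-term model: running per-pixel mean/variance (a simple Gaussian
    statistical model) over the sequence. Short-term model: absolute
    difference between consecutive frames. The background picture is
    updated from the current frame only where neither model flags motion.
    """

    def __init__(self, first_frame, alpha=0.02, k=2.5, short_thresh=10.0):
        f = first_frame.astype(np.float64)
        self.mean = f.copy()
        self.var = np.full_like(f, 15.0 ** 2)  # initial variance guess
        self.prev = f.copy()
        self.background = f.copy()
        self.alpha, self.k, self.short_thresh = alpha, k, short_thresh

    def update(self, frame):
        f = frame.astype(np.float64)
        # Long-term motion: deviation beyond k standard deviations of the model.
        long_motion = np.abs(f - self.mean) > self.k * np.sqrt(self.var)
        # Short-term motion: large change between the two most recent frames.
        short_motion = np.abs(f - self.prev) > self.short_thresh
        still = ~(long_motion | short_motion)
        # Update the statistical model and the background picture at still pixels.
        d = f - self.mean
        self.mean[still] += self.alpha * d[still]
        self.var[still] = (1 - self.alpha) * self.var[still] + self.alpha * d[still] ** 2
        self.background[still] = self.mean[still]
        self.prev = f
        return self.background
```

A pixel that suddenly changes value is flagged by both models and leaves the background picture untouched, while stable pixels converge to the long-term mean.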
Abstract:
Disruptions in the continuity of image frames output from an image capture device, caused by switching from one image sensor of the device to another, may be reduced or eliminated by controlling the timing of image sensor switching according to a predefined image sensor configuration. Operation of a multi-sensor image device according to the predefined image sensor configuration may include an appropriate selection of a source for image adjustment during a zoom level transition. The predefined image sensor configuration may define transition parameters for particular zoom ranges of the image capture device.
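A predefined configuration mapping zoom ranges to transition parameters could look like the following minimal sketch. The table contents, sensor names ("wide", "tele"), zoom boundaries, and the idea of pairing each range with an adjustment source are all assumptions made for illustration; the abstract does not specify any particular values.

```python
# Hypothetical sensor-transition table: each entry maps a zoom range to the
# sensor that should be active and the source used for image adjustment
# during a zoom level transition.
SENSOR_CONFIG = [
    # (zoom_min, zoom_max, active_sensor, adjustment_source)
    (1.0, 2.0, "wide", "wide"),
    (2.0, 3.0, "wide", "tele"),  # transition band: adjust toward the tele sensor
    (3.0, 8.0, "tele", "tele"),
]

def select_sensor(zoom):
    """Return (active_sensor, adjustment_source) for a given zoom level."""
    for lo, hi, sensor, source in SENSOR_CONFIG:
        if lo <= zoom < hi:
            return sensor, source
    # Beyond the table: stay on the last configured sensor.
    return SENSOR_CONFIG[-1][2], SENSOR_CONFIG[-1][3]
```

Driving sensor selection from a static table like this makes switch timing deterministic for any zoom trajectory.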
Abstract:
A method performed by an electronic device is described. The method includes obtaining a first frame of a scene. The method also includes performing object recognition of at least one object within a first bounding region of the first frame. The method further includes performing object tracking of the at least one object within the first bounding region of the first frame. The method additionally includes determining a second bounding region of a second frame based on the object tracking. The second frame is subsequent to the first frame. The method also includes determining whether the second bounding region is valid based on a predetermined object model.
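Validating a tracked bounding region against a predetermined object model might be sketched as below. The use of a normalized intensity histogram as the object model and histogram intersection as the similarity measure are assumptions for illustration (the abstract does not say what the object model contains), as are the function names and the threshold value.

```python
import numpy as np

def region_histogram(frame, box, bins=8):
    """Normalized intensity histogram of the pixels inside box = (x, y, w, h)."""
    x, y, w, h = box
    patch = frame[y:y + h, x:x + w]
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def is_region_valid(frame, box, object_model, threshold=0.5):
    """Compare the tracked region to the predetermined model
    (histogram intersection); declare the region valid above a threshold."""
    hist = region_histogram(frame, box)
    similarity = np.minimum(hist, object_model).sum()
    return similarity >= threshold
```

When validation fails, a tracker would typically fall back to re-running object recognition rather than propagating a drifted bounding region.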
Abstract:
Artefacts in a sequence of image frames may be reduced or eliminated through modification of an input image frame to match another image frame in the sequence, such as by geometrically warping the input image frame to generate a corrected image frame whose field of view matches that of another frame in the sequence. The warping may be performed based on a model generated from data regarding the multi-sensor image capture device. The disparity between image frames may be modeled based on image captures from first and second image sensors for scenes at varying depths. The model may be used to predict disparity values for captured images, and those predicted disparity values may be used to reduce artefacts resulting from image sensor switching. The predicted disparity values may be used in image conditions that result in erroneous actual disparity values.
Abstract:
Techniques and systems are provided for processing video data. For example, techniques and systems are provided for performing content-adaptive blob filtering. A number of blobs generated for a video frame is determined. A size of a first blob from the blobs is determined, the first blob including pixels of at least a portion of a first foreground object in the video frame. The first blob is filtered from the blobs when the size of the first blob is less than a size threshold. The size threshold is determined based on the number of blobs generated for the video frame.
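The content-adaptive part, a size threshold that depends on how many blobs the frame produced, can be sketched as below. The particular mapping (a linear increase of the threshold with the blob count, so noisy frames are filtered more aggressively) and the `base`/`scale` constants are assumptions; the abstract only states that the threshold is determined from the number of blobs.

```python
def adaptive_size_threshold(num_blobs, base=100, scale=10):
    """Hypothetical mapping: more blobs in a frame -> higher size threshold."""
    return base + scale * num_blobs

def filter_blobs(blob_sizes):
    """blob_sizes: pixel counts of the blobs generated for one frame.
    Drop every blob smaller than the content-adaptive threshold."""
    threshold = adaptive_size_threshold(len(blob_sizes))
    return [size for size in blob_sizes if size >= threshold]
```

With three blobs the threshold becomes 130, so a 50-pixel blob is filtered while 300- and 500-pixel blobs survive.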
Abstract:
A method performed by an electronic device is described. The method includes obtaining a combined image. The combined image includes a combination of images captured from one or more image sensors. The method also includes obtaining depth information. The depth information is based on a distance measurement between a depth sensor and at least one object in the combined image. The method further includes adjusting a combined image visualization based on the depth information.
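One way such a depth-driven adjustment could work is sketched below. Mapping the nearest measured object distance to a clamped rendering parameter (here called a projection radius, as in a surround-view visualization) is purely an illustrative assumption; the abstract does not specify which aspect of the visualization is adjusted.

```python
def adjust_visualization(depth_samples, near=1.0, far=10.0):
    """Map the nearest measured object distance to a rendering parameter
    (a hypothetical projection radius clamped to [near, far]), so the
    combined image visualization adapts to nearby objects."""
    nearest = min(depth_samples)
    return max(near, min(far, nearest))
```

The clamp keeps the visualization stable when the depth sensor reports implausibly near or far distances.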
Abstract:
This disclosure provides devices, methods, computer-readable media, and means for spatial alignment transform. In some aspects, a device may perform operations including capturing a first image of a scene using a first sensor at a first zoom ratio and capturing a second image of the scene using a second sensor at a second zoom ratio, the first image having a different field-of-view (FOV) than the second image. The operations may further include determining a translation matrix based on one or more spatial misalignments between the first image and the second image, determining a confidence associated with the translation matrix, and, in response to the confidence being greater than a confidence threshold, determining a weighting factor based on the first zoom ratio, the second zoom ratio, and a current zoom ratio of the device, applying the weighting factor to the translation matrix, and warping the first image to the second image using the weighted translation matrix.
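The confidence gate and zoom-dependent weighting can be sketched as follows. The sketch simplifies the translation matrix to a 2-D translation applied with a circular shift, and the linear weighting (0 at the first sensor's zoom ratio, 1 at the second's) is an assumed form; function names and the confidence threshold are likewise illustrative.

```python
import numpy as np

def zoom_weight(z1, z2, z):
    """Hypothetical weighting factor: 0 at the first zoom ratio, 1 at the
    second, linear in between, clamped to [0, 1]."""
    return float(np.clip((z - z1) / (z2 - z1), 0.0, 1.0))

def apply_alignment(image, translation, confidence, z1, z2, z, conf_thresh=0.5):
    """Warp (here: shift) the first image toward the second using the
    weighted translation, only when the estimate is confident enough."""
    if confidence <= conf_thresh:
        return image  # low confidence: leave the image unwarped
    w = zoom_weight(z1, z2, z)
    dx, dy = (w * np.asarray(translation, dtype=np.float64)).round().astype(int)
    return np.roll(np.roll(image, dy, axis=0), dx, axis=1)
```

Because the weight grows with the current zoom ratio, the correction fades in gradually across the zoom transition instead of jumping when the sensors switch.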
Abstract:
A system and method of object detection are disclosed. In a particular implementation, a method of processing an image includes receiving, at a processor, image data associated with an image of a scene. The scene includes a road region. The method further includes detecting the road region based on the image data and determining a subset of the image data. The subset excludes at least a portion of the image data corresponding to the road region. The method further includes performing an object detection operation on the subset of the image data to detect an object. The object detection operation performed on the subset of the image data therefore excludes the portion of the image data corresponding to the road region.
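Excluding the road region before detection can be sketched as below. The road detector here is a deliberate toy (it simply marks the bottom half of the frame, standing in for a real segmentation step), and the choice to zero out excluded pixels is one of several possible ways to form the subset; both are assumptions for illustration.

```python
import numpy as np

def detect_road_mask(image):
    """Toy road detector: mark the bottom half of the frame as road
    (a placeholder for a real road-segmentation step)."""
    mask = np.zeros(image.shape[:2], dtype=bool)
    mask[image.shape[0] // 2:, :] = True
    return mask

def non_road_subset(image, road_mask):
    """Subset of the image data that excludes road pixels; object detection
    then runs only on this subset."""
    subset = image.copy()
    subset[road_mask] = 0  # excluded pixels carry no data for the detector
    return subset
```

Shrinking the detector's input this way reduces compute and avoids spurious detections on road texture.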
Abstract:
A computing device configured for providing an interface is described. The computing device includes a processor and instructions stored in memory. The computing device projects an image from a projector. The computing device also captures, using a camera, an image including the projected image. The camera operates in a visible spectrum. The computing device calibrates itself, detects a hand, and tracks the hand based on a tracking pattern in a search space. The computing device also performs an operation.
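Tracking a pattern within a search space centered on the previous position can be sketched as a minimal sum-of-absolute-differences search. The SAD matching criterion, the square search window, and the function name are assumptions for this sketch; the abstract does not specify the matching method.

```python
import numpy as np

def track_pattern(frame, pattern, prev_xy, search_radius=5):
    """Find the best match for `pattern` inside a search space centered on
    the previous hand position, by minimizing the sum of absolute
    differences (SAD). Returns the best (x, y) top-left position."""
    ph, pw = pattern.shape
    x0, y0 = prev_xy
    best, best_xy = np.inf, prev_xy
    for y in range(max(0, y0 - search_radius),
                   min(frame.shape[0] - ph, y0 + search_radius) + 1):
        for x in range(max(0, x0 - search_radius),
                       min(frame.shape[1] - pw, x0 + search_radius) + 1):
            sad = np.abs(frame[y:y + ph, x:x + pw].astype(float) - pattern).sum()
            if sad < best:
                best, best_xy = sad, (x, y)
    return best_xy
```

Restricting the search to a small window around the last known position is what keeps per-frame tracking cheap compared to re-detecting the hand in the whole image.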