Abstract:
An electronic device is described. The electronic device includes a processor. The processor is configured to obtain a plurality of images. The processor is also configured to obtain global motion information indicating global motion between at least two of the plurality of images. The processor is further configured to obtain object tracking information indicating motion of a tracked object between the at least two of the plurality of images. The processor is additionally configured to perform automatic zoom based on the global motion information and the object tracking information. Performing automatic zoom produces a zoom region including the tracked object. The processor is configured to determine a motion response speed for the zoom region based on a location of the tracked object within the zoom region.
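The dependence of motion response speed on the tracked object's location can be illustrated with a minimal sketch. The abstract does not state the mapping, so the linear ramp below is an assumption, as are the names `Box`, `motion_response_speed`, `step_zoom_region` and the `slow`/`fast` limits: the zoom region follows slowly while the object sits near its center and faster as the object nears its border.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float  # left edge, in pixels
    y: float  # top edge, in pixels
    w: float  # width, in pixels
    h: float  # height, in pixels

    def center(self):
        return (self.x + self.w / 2.0, self.y + self.h / 2.0)


def motion_response_speed(zoom, obj, slow=0.1, fast=0.9):
    """Return a speed in [slow, fast] from the object's place in the zoom region.

    Assumed mapping (not from the abstract): a linear ramp on the normalized
    offset, which is 0 at the region center and 1 at (or beyond) its border.
    """
    zcx, zcy = zoom.center()
    ocx, ocy = obj.center()
    dx = abs(ocx - zcx) / (zoom.w / 2.0)
    dy = abs(ocy - zcy) / (zoom.h / 2.0)
    offset = min(1.0, max(dx, dy))
    return slow + (fast - slow) * offset


def step_zoom_region(zoom, obj):
    """Move the zoom region toward the object by the fraction given by the speed."""
    speed = motion_response_speed(zoom, obj)
    zcx, zcy = zoom.center()
    ocx, ocy = obj.center()
    new_cx = zcx + speed * (ocx - zcx)
    new_cy = zcy + speed * (ocy - zcy)
    return Box(new_cx - zoom.w / 2.0, new_cy - zoom.h / 2.0, zoom.w, zoom.h)
```

A centered object leaves the region almost still, which suppresses jitter, while an object close to the border pulls the region along quickly enough to stay inside it.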
Abstract:
A particular method includes determining, based on data received from at least one motion sensor, a movement of a mobile device from a first position to a second position. The method also includes computing a three-dimensional (3D) model of an object based on a first image of the object corresponding to a first view of the object from the first position of the mobile device, a second image of the object corresponding to a second view of the object from the second position of the mobile device, and the movement of the mobile device.
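How two views and a sensed movement constrain 3D structure can be sketched for the simplest case. The example assumes a pinhole camera with a known focal length in pixels and a purely lateral movement whose length (`baseline_m`) comes from the motion sensors. The function name, parameters and this restricted geometry are illustrative assumptions, not the method claimed in the abstract.

```python
def triangulate_lateral(x1, y1, x2, focal_px, baseline_m, cx=0.0, cy=0.0):
    """Recover a 3D point (in the first camera's frame) from two views.

    Assumes the device moved sideways (to the right) by baseline_m between
    the two images without rotating, so the same point appears at column x1
    in the first image and at column x2 in the second, with x1 > x2.
    """
    disparity = x1 - x2
    if disparity <= 0:
        raise ValueError("point must shift left between views (x1 > x2)")
    z = focal_px * baseline_m / disparity   # depth from similar triangles
    x = (x1 - cx) * z / focal_px            # back-project through the pinhole
    y = (y1 - cy) * z / focal_px
    return (x, y, z)
```

Repeating this for many matched points gives a point cloud from which a model could be built. A general movement with rotation would require full two-view triangulation instead.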
Abstract:
A method for three-dimensional face generation is described. An inverse depth map is calculated based on a depth map and an inverted first matrix. The inverted first matrix is generated from two images in which pixels are aligned vertically and differ horizontally. The inverse depth map is normalized to correct for distortions in the depth map caused by image rectification. A three-dimensional face model is generated based on the inverse depth map and one of the two images.
Abstract:
Apparatus and methods for facial detection are disclosed. A plurality of images of an observed face is received for identification. Based at least on two or more selected images of the plurality of images, a template of the observed face is generated. In some embodiments, the template is a subspace generated based on feature vectors of the plurality of received images. A database of identities and corresponding facial data of known persons is searched based at least on the template of the observed face and the facial data of the known persons. One or more identities of the known persons are selected based at least on the search.
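The template-and-search flow can be sketched as follows. For brevity the template here is the mean of the per-image feature vectors and matching uses cosine similarity. Both are assumptions made for illustration (the abstract mentions a subspace template in some embodiments, which is not shown), and `mean_template`, `cosine_similarity` and `search_identities` are hypothetical names.

```python
import math


def mean_template(feature_vectors):
    """Build one template from the feature vectors of several images."""
    n = len(feature_vectors)
    return [sum(column) / n for column in zip(*feature_vectors)]


def cosine_similarity(a, b):
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_identities(template, database, top_k=1):
    """Return the top_k identities whose stored facial data best match the template.

    database maps an identity to a stored feature vector (assumed layout).
    """
    ranked = sorted(
        database.items(),
        key=lambda item: cosine_similarity(template, item[1]),
        reverse=True,
    )
    return [identity for identity, _ in ranked[:top_k]]
```

Averaging over several images makes the template less sensitive to pose or lighting in any single image than matching one image at a time.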
Abstract:
A three-dimensional (3D) mixed-reality system combines a real 3D image or video, captured by a 3D camera for example, with a virtual 3D image rendered by a computer or other machine to produce a 3D mixed-reality image or video. A 3D camera can acquire two separate images (a left and a right) of a common scene, and superimpose the two separate images to create a real image with a 3D depth effect. The 3D mixed-reality system can determine a distance to a zero disparity plane for the real 3D image, determine one or more parameters for a projection matrix based on the distance to the zero disparity plane, render a virtual 3D object based on the projection matrix, and combine the real image and the virtual 3D object to generate a mixed-reality 3D image.
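One common way to make a projection matrix depend on the zero disparity plane distance is an off-axis (asymmetric) frustum per eye. The sketch below assumes that technique, an OpenGL-style camera looking down the negative z axis, and hypothetical names and parameters (`stereo_frustum`, `eye_sep`, `zdp_dist`). The abstract does not say which parameters the system actually derives.

```python
import math


def stereo_frustum(eye, fov_y_deg, aspect, near, far, eye_sep, zdp_dist):
    """Frustum bounds for one eye; eye is -1 for left and +1 for right."""
    top = near * math.tan(math.radians(fov_y_deg) / 2.0)
    half_w = top * aspect
    # Shift the frustum so both views coincide on the plane at zdp_dist.
    shift = eye * (eye_sep / 2.0) * near / zdp_dist
    return (-half_w - shift, half_w - shift, -top, top, near, far)


def frustum_matrix(l, r, b, t, n, f):
    """Standard perspective frustum matrix (same layout as glFrustum)."""
    return [
        [2 * n / (r - l), 0.0, (r + l) / (r - l), 0.0],
        [0.0, 2 * n / (t - b), (t + b) / (t - b), 0.0],
        [0.0, 0.0, -(f + n) / (f - n), -2 * f * n / (f - n)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def projected_x(point, eye, fov_y_deg, aspect, near, far, eye_sep, zdp_dist):
    """Normalized horizontal screen coordinate of a world point for one eye."""
    bounds = stereo_frustum(eye, fov_y_deg, aspect, near, far, eye_sep, zdp_dist)
    m = frustum_matrix(*bounds)
    # Express the point relative to the eye, which sits at x = eye * eye_sep / 2.
    v = [point[0] - eye * eye_sep / 2.0, point[1], point[2], 1.0]
    clip_x = sum(m[0][i] * v[i] for i in range(4))
    clip_w = sum(m[3][i] * v[i] for i in range(4))
    return clip_x / clip_w
```

With these matrices a virtual point on the zero disparity plane lands at the same screen position for both eyes, while nearer points separate horizontally, so the rendered object's depth cue can be matched to that of the real stereo pair.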
Abstract:
Present embodiments contemplate systems, apparatus, and methods to improve feature generation for object recognition. Particularly, present embodiments contemplate excluding and/or modifying portions of images corresponding to dispersed pixel distributions. By excluding and/or modifying these regions within the feature generation process, fewer unfavorable features are generated and computation resources may be more efficiently employed.
Abstract:
A method and system are described that combine voice recognition engines and resolve any differences between the results of the individual engines. A speaker-independent (SI) Hidden Markov Model (HMM) engine, a speaker-independent Dynamic Time Warping (DTW-SI) engine, and a speaker-dependent Dynamic Time Warping (DTW-SD) engine are combined. Combining and resolving the results of these engines yields a system with better recognition accuracy and lower rejection rates than a system that uses the results of only one engine.
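The idea of resolving disagreements between engines can be illustrated with a deliberately simple rule. The majority vote below is an assumption chosen for brevity, not the resolution logic of the described system, and `combine_engines` and the engine keys are hypothetical.

```python
from collections import Counter


def combine_engines(hypotheses):
    """Resolve per-engine results into one word, or None to reject.

    hypotheses maps an engine name (for example "hmm", "dtw_si", "dtw_sd")
    to the word it recognized, or to None if that engine rejected the input.
    Assumed rule: accept a word only when at least two engines agree on it.
    """
    votes = Counter(word for word in hypotheses.values() if word is not None)
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= 2 else None
```

Under this rule a single engine's rejection or error does not decide the outcome on its own, provided the other two engines agree.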