Abstract:
Video and corresponding metadata are accessed. Events of interest within the video are identified based on the corresponding metadata, and best scenes are identified based on the identified events of interest. A video summary can be generated including one or more of the identified best scenes. The video summary can be generated using a video summary template with slots corresponding to video clips selected from among sets of candidate video clips. Best scenes can also be identified by receiving an indication of an event of interest within the video from a user during capture of the video. Metadata patterns representing activities identified within video clips can also be identified within other videos, and those videos can subsequently be associated with the identified activities.
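A minimal sketch of the metadata-driven selection described above, assuming metadata reduces to (timestamp, interest-score) pairs; the threshold, padding, and all function names here are illustrative assumptions, not the patented method:

```python
# Hypothetical metadata: (timestamp_seconds, interest_score) pairs.
from dataclasses import dataclass

@dataclass
class Scene:
    start: float  # seconds
    end: float    # seconds
    score: float  # interest score derived from metadata

def identify_best_scenes(metadata, threshold=0.8, pad=2.0):
    """Pad a scene around each metadata sample whose score exceeds the
    threshold (standing in for an 'event of interest')."""
    return [Scene(max(0.0, t - pad), t + pad, s)
            for t, s in metadata if s >= threshold]

def fill_template(scenes, num_slots):
    """Fill a summary template's slots with the highest-scoring scenes."""
    return sorted(scenes, key=lambda s: s.score, reverse=True)[:num_slots]

metadata = [(12.0, 0.9), (47.5, 0.6), (88.2, 0.95)]
summary = fill_template(identify_best_scenes(metadata), num_slots=2)
print([(round(s.start, 1), round(s.end, 1)) for s in summary])
```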
Abstract:
In a video capture system, a virtual lens is simulated when applying a crop or zoom effect to an input video. An input video frame is received from the input video that has a first field of view and an input lens distortion caused by a lens used to capture the input video frame. A selection of a sub-frame representing a portion of the input video frame is obtained that has a second field of view smaller than the first field of view. The sub-frame is processed to remap the input lens distortion to a desired lens distortion in the sub-frame. The processed sub-frame is then output.
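As a rough illustration of the remapping step, the sketch below uses a one-parameter radial model r' = r(1 + k·r²) with a small-distortion approximation and nearest-neighbor sampling; the coefficients and function name are assumptions, not the patented virtual-lens pipeline:

```python
import numpy as np

def remap_subframe(subframe, k_in=-0.2, k_out=0.0):
    """For each output pixel, approximate the net effect of undoing the
    desired (output) distortion and re-applying the input lens distortion,
    then sample the source pixel (nearest neighbor)."""
    h, w = subframe.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    # Normalize coordinates to [-1, 1] about the sub-frame center.
    nx, ny = (xs - w / 2) / (w / 2), (ys - h / 2) / (h / 2)
    r2 = nx ** 2 + ny ** 2
    # Small-distortion approximation of the combined radial remap.
    scale = (1 + k_in * r2) / (1 + k_out * r2)
    sx = np.clip(nx * scale * (w / 2) + w / 2, 0, w - 1).astype(int)
    sy = np.clip(ny * scale * (h / 2) + h / 2, 0, h - 1).astype(int)
    return subframe[sy, sx]

frame = np.random.randint(0, 256, (240, 320, 3), dtype=np.uint8)
corrected = remap_subframe(frame)  # same shape, remapped distortion
```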
Abstract:
A spherical content capture system captures spherical video and audio content. In one embodiment, captured metadata or video/audio processing is used to identify content relevant to a particular user based on time and location information. The platform can then generate an output video from one or more shared spherical content files relevant to the user. The output video may include a non-spherical reduced field of view such as those commonly associated with conventional camera systems. In particular, relevant sub-frames having a reduced field of view may be extracted from each frame of spherical video to generate an output video that tracks a particular individual or object of interest. For each sub-frame, a corresponding portion of an audio track is generated that includes a directional audio signal whose directionality is based on the selected sub-frame.
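The sub-frame extraction and directional-audio pairing might be sketched as below, assuming equirectangular spherical frames and a simple stereo pan as a stand-in for a true directional audio signal; every parameter here is illustrative:

```python
import numpy as np

def extract_subframe(equirect, yaw_deg, fov_deg=90):
    """Crop a reduced-field-of-view strip of the equirectangular frame
    centered on the target's azimuth (yaw), wrapping around 360 degrees."""
    h, w = equirect.shape[:2]
    center = int((yaw_deg % 360) / 360 * w)
    half = int(fov_deg / 360 * w / 2)
    cols = [(center + dx) % w for dx in range(-half, half)]
    return equirect[:, cols]

def directional_audio(mono, yaw_deg):
    """Pan a mono track toward the sub-frame's azimuth: a crude stand-in
    for deriving a directional audio signal from the selected view."""
    pan = np.sin(np.radians(yaw_deg))  # -1 = full left, +1 = full right
    return np.stack([mono * (1 - pan) / 2, mono * (1 + pan) / 2], axis=-1)

sphere = np.zeros((512, 1024, 3), dtype=np.uint8)
view = extract_subframe(sphere, yaw_deg=210)  # tracks a subject at 210 degrees
stereo = directional_audio(np.zeros(48000), yaw_deg=210)
```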
Abstract:
Systems and methods are provided that capture and process frames of frame data. An image sensor captures frames of frame data representative of light incident upon the image sensor using a rolling shutter and outputs the frames of frame data. The image sensor captures at least one of the frames over a frame capture interval and then waits over a blanking interval before capturing another frame. A buffer receives and stores the frames output by the image sensor. An image signal processor retrieves the frames from the buffer and processes the frames over successive frame processing intervals to generate a video having a time interval per frame greater than the frame capture interval. At least one of the successive frame processing intervals is greater than the frame capture interval and is less than or equal to a sum of the frame capture interval and the blanking interval.
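The relationship among the capture, blanking, and processing intervals can be illustrated with a toy simulation; the millisecond values are invented for the example, and the assert simply encodes the constraint stated above:

```python
from collections import deque

CAPTURE_MS, BLANK_MS, PROCESS_MS = 8.0, 4.0, 10.0
# The claimed relationship: capture < processing <= capture + blanking.
assert CAPTURE_MS < PROCESS_MS <= CAPTURE_MS + BLANK_MS

buffer, t_sensor, t_isp = deque(), 0.0, 0.0
for frame in range(10):
    t_sensor += CAPTURE_MS            # sensor finishes capturing a frame
    buffer.append((frame, t_sensor))  # frame lands in the buffer
    t_sensor += BLANK_MS              # sensor idles over the blanking interval
    # The ISP drains one frame per processing interval; because that
    # interval never exceeds capture + blanking, it keeps pace with the
    # sensor and buffer occupancy stays bounded.
    _, ready = buffer.popleft()
    t_isp = max(t_isp, ready) + PROCESS_MS

# The resulting video's time per frame equals the processing interval,
# which is greater than the capture interval, as the abstract states.
print(f"{PROCESS_MS} ms per frame > {CAPTURE_MS} ms capture interval")
```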
Abstract:
Systems and methods are disclosed that capture and compress frames of pixel data. In an implementation, an image sensor chip is configured to convert light into pixel data and to generate compressed pixel data at a variable compression rate, in part by applying a transform to pixel data associated with a pixel category from a plurality of pixel categories. The variable compression rate is kept within the available bandwidth of an output bus configured to output the compressed pixel data.
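A highly simplified sketch of category-dependent transform coding under a rate cap follows; the blockwise real FFT, the two pixel categories, and the bus budget are all stand-ins for whatever the sensor chip actually implements:

```python
import numpy as np

BUS_BYTES_PER_FRAME = 20_000  # hypothetical per-frame output-bus budget

def compress_frame(pixels, categories, block=8, step=None):
    """Blockwise transform coding: each block gets a real FFT (a stand-in
    transform) quantized with a step chosen by the block's dominant pixel
    category; steps coarsen until the size estimate fits the bus budget,
    which is what makes the compression rate variable."""
    step = dict(step or {0: 2.0, 1: 8.0})
    h, w = pixels.shape
    while True:
        tiles, total = [], 0
        for y in range(0, h, block):
            for x in range(0, w, block):
                tile = pixels[y:y + block, x:x + block].astype(np.float32)
                cat = int(round(categories[y:y + block, x:x + block].mean()))
                q = np.round(np.fft.rfft2(tile).real / step[cat]).astype(np.int16)
                tiles.append(q)
                total += 2 * int(np.count_nonzero(q))  # bytes, pre-entropy-coding
        if total <= BUS_BYTES_PER_FRAME:
            return tiles
        step = {k: v * 2 for k, v in step.items()}  # coarsen and retry

frame = np.random.randint(0, 256, (128, 128))
cats = (frame > 128).astype(int)  # hypothetical two-category pixel labeling
payload = compress_frame(frame, cats)
```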
Abstract:
A spherical content capture system captures spherical video content. A spherical video sharing platform enables users to share the captured spherical content and enables users to access spherical content shared by other users. In one embodiment, captured metadata provides proximity information indicating which cameras were in proximity to a target device during a particular time frame. The platform can then generate an output video from spherical video captured from those cameras. The output video may include a non-spherical reduced field of view such as those commonly associated with conventional camera systems. In particular, relevant sub-frames having a reduced field of view may be extracted from frames of one or more spherical videos to generate an output video that tracks a particular individual or object of interest.
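Proximity filtering of the kind described might look like the following sketch, assuming per-camera metadata records of (camera_id, timestamp, latitude, longitude); the haversine distance and the 100 m radius are illustrative choices:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    r = 6_371_000
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def cameras_near_target(records, target, t0, t1, radius_m=100.0):
    """Return ids of cameras with a metadata sample inside [t0, t1] lying
    within radius_m of the target device's position."""
    return {cam for cam, t, lat, lon in records
            if t0 <= t <= t1 and haversine_m(lat, lon, *target) <= radius_m}

records = [("camA", 5.0, 37.7750, -122.4194), ("camB", 5.0, 37.8000, -122.5000)]
print(cameras_near_target(records, target=(37.7749, -122.4194), t0=0, t1=10))
```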
Abstract:
Video clips and images captured by a device (e.g., a camera) are associated with one or more synchronization labels, such as synchronization device labels and synchronization time labels, determined by the device. Synchronization device labels identify devices that are synchronized with one another. Synchronization time labels indicate the relative timing between the synchronized devices. When a device is powered on, it transmits a synchronization signal; another device that receives this signal transmits a synchronization signal of its own in response. The two devices each calculate a synchronization device label and a synchronization time label from the exchanged signals and associate these labels with the video frames and images they capture. Video clips and images can subsequently be aligned using the associated synchronization device labels and synchronization time labels.
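The two-way exchange resembles an NTP-style offset estimate, sketched below under that assumption; the label format and the timestamps are hypothetical:

```python
def sync_labels(t_send_a, t_recv_b, t_send_b, t_recv_a, id_a, id_b):
    """Device A sends at t_send_a (A's clock); B receives at t_recv_b and
    replies at t_send_b (B's clock); A receives the reply at t_recv_a.
    The symmetric estimate of B's clock offset relative to A's serves as
    the synchronization time label; the paired ids form the device label."""
    offset = ((t_recv_b - t_send_a) + (t_send_b - t_recv_a)) / 2
    device_label = f"{min(id_a, id_b)}|{max(id_a, id_b)}"
    return device_label, offset

label, offset = sync_labels(100.0, 160.5, 161.0, 101.2, "camA", "camB")
# Tagging each captured frame with (label, local_time) plus the offset puts
# both devices' clips on a common timeline for later alignment.
print(label, offset)
```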
Abstract:
A system and method enable encoding, decoding, and manipulation of digital video with substantially less processing load than would otherwise be required. In particular, one disclosed method is directed to generating a compressed video data structure that is selectively decodable to a plurality of resolutions, including the full resolution of the uncompressed stream. The desired number of data components and the content of the data components that make up the compressed video data, which determine the available video resolutions, are variable based upon the processing carried out and the resources available to decode and process the data components. During decoding, efficiency is substantially improved because only the data components necessary to generate a desired resolution are decoded. In variations, both temporal and spatial decoding are utilized to reduce frame rates and hence further reduce processor load. The system and method are particularly useful for real-time video editing applications.
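A toy analogue of a selectively decodable structure is sketched below, using an average/difference pyramid in place of the actual codec; it assumes frame dimensions divisible by 2^levels, and all names are illustrative:

```python
import numpy as np

def encode(frame, levels=3):
    """Store a low-resolution base plus one detail component per level,
    so lower resolutions are decodable without the finer components."""
    components, cur = [], frame.astype(np.float32)
    for _ in range(levels):
        low = cur[::2, ::2]                           # 2x downsample
        detail = cur - np.kron(low, np.ones((2, 2)))  # what upsampling loses
        components.append(detail)                     # finest detail first
        cur = low
    return cur, components

def decode(base, components, target_level=0):
    """Decode only the components needed for the target resolution;
    target_level=0 reconstructs full resolution exactly."""
    cur = base
    for detail in reversed(components[target_level:]):
        cur = np.kron(cur, np.ones((2, 2))) + detail
    return cur

frame = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
base, comps = encode(frame)
half_res = decode(base, comps, target_level=1)  # 32x32; finest detail skipped
assert np.allclose(decode(base, comps), frame)  # full resolution is lossless
```

Skipping the finest components both halves the output resolution and avoids decoding them at all, which is the source of the reduced processing load.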
Abstract:
Video and corresponding metadata are accessed. Events of interest within the video are identified based on the corresponding metadata, and best scenes are identified based on the identified events of interest. In one example, best scenes are identified based on motion values associated with frames or portions of frames of a video. Motion values are determined for each frame, and portions of the video containing the frames with the most motion are identified as best scenes. Best scenes may also be identified based on the motion profile of a video, which is a measure of global or local motion within frames throughout the video. For example, best scenes are identified from portions of the video exhibiting steady global motion. A video summary can be generated including one or more of the identified best scenes.
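A minimal sketch of motion-value scoring, using mean absolute frame difference as the per-frame motion value and a low-variance window as a crude proxy for steady global motion; both choices are assumptions, not the patent's definitions:

```python
import numpy as np

def motion_values(frames):
    """Per-frame motion value: mean absolute difference to the prior frame."""
    return [float(np.abs(b.astype(int) - a.astype(int)).mean())
            for a, b in zip(frames[:-1], frames[1:])]

def best_scene(values, window=30):
    """Pick the window with the highest mean motion and the lowest jitter,
    a crude proxy for a scene with steady global motion."""
    best_start, best_score = 0, float("-inf")
    for i in range(max(1, len(values) - window)):
        seg = np.asarray(values[i:i + window])
        score = seg.mean() - seg.std()  # reward motion, penalize unsteadiness
        if score > best_score:
            best_start, best_score = i, score
    return best_start, best_start + window  # frame range of the best scene

frames = [np.random.randint(0, 256, (120, 160), dtype=np.uint8) for _ in range(90)]
print(best_scene(motion_values(frames)))
```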