Abstract:
Examples include determining how to manage storage of a video clip generated from recorded video based upon a sensor event. Managing storage of the video clip may include determining whether to save or delete the video clip based on an imprint associated with an object that indicates whether the object is included in the video clip.
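The save-or-delete decision above can be sketched as a simple membership check. The function and argument names below are illustrative assumptions, not the patent's vocabulary; the "imprint" is modeled as an object identifier matched against the objects detected in the clip.

```python
def manage_clip_storage(clip_objects, imprint):
    """Decide whether to keep a sensor-triggered video clip.

    clip_objects: set of object identifiers detected in the clip
    imprint: identifier of the object of interest
    (both names are assumptions made for illustration)
    """
    # Save the clip only if the imprinted object appears in it.
    return "save" if imprint in clip_objects else "delete"
```

For example, a clip containing a person would be retained when the imprint designates that person, and discarded otherwise.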
Abstract:
An apparatus for video summarization using semantic information is described herein. The apparatus includes a controller, a scoring mechanism, and a summarizer. The controller is to segment an incoming video stream into a plurality of activity segments, wherein each frame is associated with an activity. The scoring mechanism is to calculate a score for each frame of each activity, wherein the score is based on a plurality of objects in each frame. The summarizer is to summarize the activity segments based on the score for each frame.
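A minimal sketch of the score-then-summarize pipeline, assuming the score of a frame is simply the number of objects it contains (the abstract says the score is "based on" the objects; this particular scoring rule is an assumption for illustration):

```python
def summarize(frames, top_k=2):
    """Select the highest-scoring frames as the summary.

    frames: list of (activity_label, objects_in_frame) tuples,
            one per frame, in temporal order (an illustrative format)
    top_k:  number of frames to keep in the summary
    """
    # Score each frame by how many objects it contains.
    scored = [(len(objs), i) for i, (activity, objs) in enumerate(frames)]
    # Keep the indices of the top_k highest-scoring frames,
    # returned in their original temporal order.
    keep = sorted(i for _, i in sorted(scored, reverse=True)[:top_k])
    return keep
```

A real summarizer would also respect the activity-segment boundaries; this sketch only shows the per-frame scoring and selection step.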
Abstract:
Generally discussed herein are systems and apparatuses for gesture-based augmented reality. Also discussed herein are methods of using the systems and apparatuses. According to an example, a method may include: detecting, in image data, an object and a gesture; in response to detecting the object in the image data, providing data indicative of the detected object; in response to detecting the gesture in the image data, providing data indicative of the detected gesture; and modifying the image data using the data indicative of the detected object and the data indicative of the detected gesture.
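The detect-then-modify flow can be sketched as below. The gesture vocabulary ("tap", "swipe") and the dict-based stand-in for pixel data are assumptions made purely for illustration; the abstract does not name specific gestures or modifications.

```python
def modify_image(image, detections, gestures):
    """Apply a gesture-driven modification to each detected object.

    image:      dict of object_id -> rendering state (a stand-in for
                pixel data; an illustrative simplification)
    detections: object ids detected in the image data
    gestures:   gesture names detected in the image data
    """
    out = dict(image)
    for obj in detections:
        if "swipe" in gestures:
            out.pop(obj, None)        # swipe removes the object
        elif "tap" in gestures:
            out[obj] = "highlighted"  # tap annotates the object
    return out
```

The key point mirrored from the abstract is that the modification consumes both outputs: data indicative of the object and data indicative of the gesture.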
Abstract:
An apparatus comprising a memory to store an observed trajectory of a pedestrian, the observed trajectory comprising a plurality of observed locations of the pedestrian over a first plurality of timesteps; and a processor to generate a predicted trajectory of the pedestrian, the predicted trajectory comprising a plurality of predicted locations of the pedestrian over the first plurality of timesteps and over a second plurality of timesteps occurring after the first plurality of timesteps; determine a likelihood of the predicted trajectory based on a comparison of the plurality of predicted locations of the pedestrian over the first plurality of timesteps and the plurality of observed locations of the pedestrian over the first plurality of timesteps; and responsive to the determined likelihood of the predicted trajectory, provide information associated with the predicted trajectory to a vehicle to warn the vehicle of a potential collision with the pedestrian.
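The likelihood step above scores how well a predicted trajectory reproduces the locations actually observed over the overlapping timesteps. A minimal sketch, assuming a Gaussian error model over point-wise distances (the abstract does not fix a particular likelihood function, and the threshold below is an illustrative assumption):

```python
import math

def trajectory_likelihood(predicted, observed, sigma=1.0):
    """Likelihood of a predicted trajectory given observed locations.

    predicted, observed: lists of (x, y) positions over the same
    (first) plurality of timesteps. Gaussian error model is an
    assumption for illustration.
    """
    sq_err = sum((px - ox) ** 2 + (py - oy) ** 2
                 for (px, py), (ox, oy) in zip(predicted, observed))
    return math.exp(-sq_err / (2 * sigma ** 2))

def maybe_warn(likelihood, threshold=0.5):
    # Forward the trajectory to the vehicle only when it is plausible
    # enough to warrant a collision warning.
    return likelihood >= threshold
```

A prediction that matches the observations exactly gets likelihood 1.0; predictions that diverge from the observed positions decay toward zero and are not forwarded.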
Abstract:
In one embodiment, an apparatus comprises a memory and a processor. The memory is to store visual data associated with a visual representation captured by one or more sensors. The processor is to: obtain the visual data associated with the visual representation captured by the one or more sensors, wherein the visual data comprises uncompressed visual data or compressed visual data; process the visual data using a convolutional neural network (CNN), wherein the CNN comprises a plurality of layers, wherein the plurality of layers comprises a plurality of filters, and wherein the plurality of filters comprises one or more pixel-domain filters to perform processing associated with uncompressed data and one or more compressed-domain filters to perform processing associated with compressed data; and classify the visual data based on an output of the CNN.
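The mixed-domain CNN can be sketched as a dispatcher that routes the input through pixel-domain filters or compressed-domain filters before a shared classification head. The filters below are plain callables standing in for convolutional layers, which is an illustrative simplification of the layered CNN the abstract describes:

```python
def classify(visual_data, is_compressed,
             pixel_filters, compressed_filters, head):
    """Classify visual data with domain-appropriate filters.

    pixel_filters / compressed_filters: lists of callables standing in
    for CNN layers (an assumption for illustration)
    head: final classifier applied to the filtered features
    """
    # Choose the filter chain that matches the input's domain.
    filters = compressed_filters if is_compressed else pixel_filters
    features = visual_data
    for f in filters:
        features = f(features)
    return head(features)
```

In a real network the two filter chains would be learned layers within one CNN; the sketch only shows the routing between the pixel and compressed domains.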
Abstract:
Vehicle navigation control systems in autonomous driving rely on accurate predictions of objects within the vicinity of the vehicle to control the vehicle safely through its surrounding environment. Accordingly, this disclosure provides methods and devices that implement mechanisms for obtaining contextual variables of the vehicle's environment for use in determining the accuracy of predictions of objects within the vehicle's environment.
Abstract:
In one embodiment, an apparatus comprises a storage device and a processor. The storage device may store a plurality of compressed images comprising one or more compressed master images and one or more compressed slave images. The processor may: identify an uncompressed image; access context information associated with the uncompressed image and the one or more compressed master images; determine, based on the context information, whether the uncompressed image is associated with a corresponding master image; upon a determination that the uncompressed image is associated with the corresponding master image, compress the uncompressed image into a corresponding compressed image with reference to the corresponding master image; upon a determination that the uncompressed image is not associated with the corresponding master image, compress the uncompressed image into the corresponding compressed image without reference to the one or more compressed master images; and store the corresponding compressed image on the storage device.
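The master/slave decision logic can be sketched as follows. The callables and the way "context" is derived are illustrative stand-ins (the abstract leaves the context-matching criterion and the codecs unspecified):

```python
def compress_image(image, masters, context_of, compress_ref, compress_solo):
    """Compress an image with or without reference to a master image.

    masters:       stored master images
    context_of:    maps an image to its context information
    compress_ref:  compresses an image with reference to a master
    compress_solo: compresses an image standalone
    (all callables are illustrative assumptions)
    """
    ctx = context_of(image)
    for master in masters:
        if context_of(master) == ctx:
            # Matching context: compress as a slave referencing the master.
            return compress_ref(image, master)
    # No matching master: compress without reference.
    return compress_solo(image)
```

The benefit mirrored from the abstract is that images sharing a context (e.g., similar scenes) can be stored as small deltas against a master rather than as full standalone compressions.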
Abstract:
In one embodiment, an apparatus comprises processing circuitry to: receive, via a communication interface, a compressed video stream captured by a camera, wherein the compressed video stream comprises: a first compressed frame; and a second compressed frame, wherein the second compressed frame is compressed based at least in part on the first compressed frame, and wherein the second compressed frame comprises a plurality of motion vectors; decompress the first compressed frame into a first decompressed frame; perform pixel-domain object detection to detect an object at a first position in the first decompressed frame; and perform compressed-domain object detection to detect the object at a second position in the second compressed frame, wherein the object is detected at the second position in the second compressed frame based on: the first position of the object in the first decompressed frame; and the plurality of motion vectors from the second compressed frame.
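The compressed-domain step above tracks the object without fully decompressing the second frame, by reusing its motion vectors. A minimal sketch, assuming the object's new position is its old position shifted by the average of the motion vectors of the blocks it covered (a common simplification; the abstract leaves the exact aggregation open):

```python
def propagate_position(position, motion_vectors):
    """Estimate an object's position in a compressed frame.

    position:       (x, y) of the object in the previous decompressed frame
    motion_vectors: list of (dx, dy) vectors from the compressed frame's
                    blocks overlapping the object (illustrative input format)
    """
    if not motion_vectors:
        return position  # no motion information: assume the object is static
    # Shift by the mean motion of the object's blocks.
    dx = sum(v[0] for v in motion_vectors) / len(motion_vectors)
    dy = sum(v[1] for v in motion_vectors) / len(motion_vectors)
    return (position[0] + dx, position[1] + dy)
```

This lets the pipeline run full pixel-domain detection only on key frames and cheaply propagate detections through inter-coded frames.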