Abstract:
Aspects of the present disclosure provide techniques for reducing latency and improving image quality of a viewport extracted from multi-directional video communications. According to such techniques, first streams of coded video data are received from a source. The first streams include coded data for each of a plurality of tiles representing a multi-directional video, where each tile corresponds to a predetermined spatial region of the multi-directional video, and at least one tile of the plurality of tiles in the first streams contains a current viewport location at a receiver. The techniques include decoding the first streams and displaying the tile containing the current viewport location. When the viewport location at the receiver changes to include a new tile of the plurality of tiles, the techniques include retrieving and decoding first streams for the new tile, displaying the decoded content for the changed viewport location, and transmitting the changed viewport location to the source.
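The receiver-side flow described above can be summarized in a short sketch. This is a minimal illustration only: the Tile type, the tile dimensions, and the fetch_tile_stream, decode_and_display, and notify_viewport_location interfaces are hypothetical placeholders, not an actual streaming API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tile:
    col: int
    row: int

TILE_W, TILE_H = 960, 540  # assumed tile dimensions in pixels

def tiles_for_viewport(x, y, w, h):
    """Return the set of tiles that a viewport rectangle overlaps."""
    cols = range(x // TILE_W, (x + w - 1) // TILE_W + 1)
    rows = range(y // TILE_H, (y + h - 1) // TILE_H + 1)
    return {Tile(c, r) for c in cols for r in rows}

class ViewportReceiver:
    def __init__(self, source, decoder):
        self.source = source    # hypothetical source interface
        self.decoder = decoder  # hypothetical decoder interface
        self.active = set()     # tiles currently streamed and decoded

    def on_viewport_change(self, x, y, w, h):
        needed = tiles_for_viewport(x, y, w, h)
        for tile in needed - self.active:
            # retrieve and decode first streams for each newly covered tile
            stream = self.source.fetch_tile_stream(tile)
            self.decoder.decode_and_display(stream)
        self.active = needed
        # transmit the changed viewport location back to the source
        self.source.notify_viewport_location(x, y, w, h)
```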
Abstract:
Techniques are disclosed for coding high dynamic range (HDR) data. According to such techniques, HDR data may be converted to a domain of uniform luminance data. The uniform domain data may be coded by motion compensated predictive coding. The HDR data also may be coded by motion compensated predictive coding, using a coding parameter that is derived from a counterpart coding parameter of the coding of the uniform domain data. In another technique, HDR data may be coded using coding parameters that are derived from HDR-domain processing, but distortion measurements may be performed in a uniform domain.
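As a rough sketch of the two techniques, the snippet below uses a simple power-law stand-in for a real uniform-luminance transfer function; to_uniform, derive_hdr_qp, and the QP offset rule are assumptions for illustration, not the disclosed parameter derivation.

```python
import numpy as np

def to_uniform(hdr, peak=10000.0):
    """Map linear HDR luminance to a roughly uniform scale (power-law
    stand-in for a real perceptual transfer function)."""
    return np.power(np.clip(hdr / peak, 0.0, 1.0), 1.0 / 2.4)

def derive_hdr_qp(uniform_qp, block_mean_luma):
    """Hypothetical rule: derive the HDR-domain quantization parameter
    from its uniform-domain counterpart, offset by block brightness."""
    offset = int(round(6 * np.log2(max(block_mean_luma, 1e-4))))
    return int(np.clip(uniform_qp + offset, 0, 51))

def uniform_domain_distortion(ref_hdr, rec_hdr):
    """Second technique: coding runs in the HDR domain, but distortion
    (here, MSE) is measured in the uniform domain."""
    diff = to_uniform(ref_hdr) - to_uniform(rec_hdr)
    return float(np.mean(diff * diff))
```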
Abstract:
Techniques are disclosed for managing display of content from multi-view video data. According to these techniques, an object may be identified from content of the multi-view video. The object's location may be tracked across a sequence of multi-view video. The technique may extract a subset of video that is contained within a view window that is shifted in an image space of the multi-view video in correspondence with the tracked object's location. These techniques may be implemented either in an image source device or an image sink device.
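A minimal sketch of the window extraction, assuming an equirectangular image space with horizontal wrap-around; the class names and smoothing constant are illustrative only.

```python
import numpy as np

def extract_view_window(frame, cx, cy, win_w, win_h):
    """Crop a win_w x win_h window centered at (cx, cy), wrapping
    horizontally as in an equirectangular image space."""
    h, w = frame.shape[:2]
    x0 = int(cx - win_w // 2) % w
    y0 = int(np.clip(cy - win_h // 2, 0, h - win_h))
    cols = [(x0 + i) % w for i in range(win_w)]  # wrap-around columns
    return frame[y0:y0 + win_h, cols]

class TrackedViewWindow:
    """Shift the view window in correspondence with a tracked object,
    smoothing its location so the window does not jitter."""
    def __init__(self, win_w, win_h, alpha=0.3):  # alpha: illustrative smoothing
        self.win_w, self.win_h, self.alpha = win_w, win_h, alpha
        self.cx = self.cy = None

    def update(self, frame, obj_x, obj_y):
        if self.cx is None:
            self.cx, self.cy = float(obj_x), float(obj_y)
        else:  # exponential smoothing of the tracked object's location
            self.cx += self.alpha * (obj_x - self.cx)
            self.cy += self.alpha * (obj_y - self.cy)
        return extract_view_window(frame, self.cx, self.cy,
                                   self.win_w, self.win_h)
```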
Abstract:
Chroma deblock filtering of reconstructed video samples may be performed to remove blockiness artifacts and reduce color artifacts without over-smoothing. In a first method, chroma deblocking may be performed for boundary samples of a smallest transform size, regardless of partitions and coding modes. In a second method, chroma deblocking may be performed when a boundary strength is greater than 0. In a third method, chroma deblocking may be performed regardless of boundary strengths. In a fourth method, the type of chroma deblocking to be performed may be signaled in a slice header by a flag. Furthermore, luma deblock filtering techniques may be applied to chroma deblock filtering.
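The four decision rules can be captured in one small dispatch; the constant MIN_TU_SIZE and the argument names are assumptions for illustration, not a real codec API.

```python
MIN_TU_SIZE = 4  # assumed smallest transform size, in samples

def should_deblock_chroma(method, boundary_pos, boundary_strength,
                          slice_header_flag=False):
    """Decide whether to filter a chroma block boundary under each of the
    four methods described above."""
    if method == 1:
        # filter boundaries aligned to the smallest transform size,
        # regardless of partitions and coding modes
        return boundary_pos % MIN_TU_SIZE == 0
    if method == 2:
        # filter only when the boundary strength is greater than 0
        return boundary_strength > 0
    if method == 3:
        # filter regardless of boundary strength
        return True
    if method == 4:
        # the slice header flag signals which filtering to perform
        return slice_header_flag
    raise ValueError(f"unknown method: {method}")
```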
Abstract:
During video coding, frame rate conversion (FRC) capabilities of a decoder may be estimated. Based on the estimated FRC capabilities, an encoder may select a frame rate for a video coding session and may alter a frame rate of source video to match the selected frame rate. Thereafter, the resultant video may be coded and output to a channel. By incorporating knowledge of a decoder's FRC capabilities as source video is being coded, an encoder may reduce the frame rate of source video opportunistically. Bandwidth that is conserved by avoiding coding of video data in excess of the selected frame rate may be directed to coding of the remaining video at a higher bitrate, which can lead to increased quality of the coding session as a whole.
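A back-of-the-envelope sketch of the rate selection and bit reallocation; the multiplicative FRC-capability model and the function names are assumptions for illustration.

```python
def select_coding_fps(source_fps, display_fps, decoder_frc_factor):
    """Code at a rate the decoder's FRC can upconvert back to the display
    rate (decoder_frc_factor is an assumed capability estimate)."""
    coded = max(display_fps / max(decoder_frc_factor, 1.0), 1.0)
    return min(coded, source_fps)

def bits_per_coded_frame(total_bps, coded_fps):
    """Bits conserved by dropping frames are spent on the frames that
    remain: the whole budget now covers fewer frames per second."""
    return total_bps / coded_fps

# Example: a 2x FRC-capable decoder and a 60 fps display allow coding at
# 30 fps, doubling the bit budget of each coded frame.
assert select_coding_fps(60.0, 60.0, 2.0) == 30.0
assert bits_per_coded_frame(6_000_000, 30.0) == 200_000.0
```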
Abstract:
A video coder defines multiple fidelity regions in different spatial areas of a video sequence, each of which may have different fidelity characteristics. The coder may code the regions' different representations in a common video sequence. Where prediction data crosses boundaries between the regions, interpolation may be performed to create like-kind representations between prediction data and the video content being coded.
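One way to picture the boundary interpolation, assuming fidelity differs only in sampling density; the nearest-neighbor resampling and the scale parameters are illustrative, not the disclosed conversion.

```python
import numpy as np

def to_like_kind(ref_block, src_scale, dst_scale):
    """Resample prediction data drawn from one fidelity region so that it
    matches the representation of the region being coded. src_scale and
    dst_scale are assumed sampling densities; nearest-neighbor is used
    for brevity."""
    if src_scale == dst_scale:
        return ref_block  # same fidelity: no conversion needed
    factor = dst_scale / src_scale
    h, w = ref_block.shape[:2]
    ys = (np.arange(int(h * factor)) / factor).astype(int)
    xs = (np.arange(int(w * factor)) / factor).astype(int)
    return ref_block[np.ix_(ys, xs)]
```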
Abstract:
Techniques for coding video data are described that maintain high precision coding for low motion video content. Such techniques include determining whether a source video sequence to be coded has low motion content. When the source video sequence contains low motion content, the video sequence may be coded as a plurality of coded frames using a chain of temporal prediction references among the coded frames. Thus, a single frame in the source video sequence is coded as a plurality of frames. Because the coded frames each represent identical content, the quality of coding should improve across the plurality of frames. Optionally, the disclosed techniques may increase the resolution at which video is coded to improve precision and coding quality.
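A toy numeric model of the chained coding: each pass codes the residual against the prior reconstruction with a progressively finer quantizer, so the error shrinks across the chain. The quantizer and step schedule are assumptions, not the disclosed coder.

```python
import numpy as np

def quantize(residual, step):
    # toy uniform quantizer, standing in for a real transform/quantize stage
    return np.round(residual / step) * step

def code_low_motion_frame(frame, n_passes=4, step=16.0):
    """Code one low-motion frame as a chain of temporally predicted coded
    frames; identical content lets each pass refine the last."""
    recon = np.zeros_like(frame, dtype=float)
    reconstructions = []
    for _ in range(n_passes):
        recon = recon + quantize(frame - recon, step)  # predict from prior pass
        reconstructions.append(recon.copy())
        step /= 2  # finer quantization: quality improves across the chain
    return reconstructions

frame = np.random.default_rng(0).uniform(0, 255, size=(8, 8))
errors = [float(np.abs(frame - r).max()) for r in code_low_motion_frame(frame)]
assert errors == sorted(errors, reverse=True)  # error never increases per pass
```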
Abstract:
Techniques are disclosed for overcoming communication lag between interactive operations among devices in a streaming session. According to the techniques, a first device streams video content to a second device, and an annotation is entered on a first frame being displayed at the second device, which is communicated back to the first device. Responsive to a communication that identifies the annotation, the first device may identify an element of video content from the first frame to which the annotation applies and determine whether the identified element is present in a second frame of video content currently displayed at the first device. If so, the first device may display the annotation with the second frame in a location where the identified element is present. If not, the first device may display the annotation via an alternate technique.
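The relocation decision reduces to a small check; the element dictionaries, identifiers, and the inset fallback are hypothetical names used only to make the flow concrete.

```python
def place_annotation(annotation, current_frame_elements):
    """annotation: {'element_id': ..., 'frame_id': ..., 'payload': ...};
    current_frame_elements: element_id -> bounding box in the frame now
    displayed at the first device. (Assumed data layout.)"""
    element_id = annotation["element_id"]
    if element_id in current_frame_elements:
        # identified element is present: display the annotation where the
        # element appears in the currently displayed frame
        return {"mode": "tracked",
                "box": current_frame_elements[element_id]}
    # element no longer visible: fall back to an alternate technique,
    # e.g., showing the annotated first frame as a picture-in-picture inset
    return {"mode": "inset", "frame_ref": annotation["frame_id"]}
```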
Abstract:
The invention is directed to an efficient way of encoding and decoding video. Embodiments include identifying different coding units that share a similar characteristic. The characteristic can be, for example: quantization values, modes, block sizes, color space, motion vectors, depth, facial and non-facial regions, and filter values. An encoder may then group the units together as a coherence group. An encoder may similarly create a table or other data structure of the coding units. An encoder may then extract the commonly repeating characteristic or attribute from the coding units. The encoder may transmit the coherence groups along with the data structure, as well as any coding units that were not part of a coherence group. The decoder may receive the data, store the shared characteristic locally in a cache for faster repeated decoding, and decode the coherence group together.
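A sketch of the grouping step, treating coding units as dictionaries keyed by a single shared attribute; the data layout is an assumption for illustration.

```python
from collections import defaultdict

def build_coherence_groups(coding_units, key="qp"):
    """Group coding units sharing an attribute value, factor the value out
    into a table sent once, and pass ungrouped units through unchanged.
    (coding_units as dicts is an assumed representation.)"""
    buckets = defaultdict(list)
    for cu in coding_units:
        buckets[cu[key]].append(cu)

    table, groups, ungrouped = [], [], []
    for value, cus in buckets.items():
        if len(cus) > 1:
            table.append({key: value})  # shared attribute, stored once
            groups.append({
                "table_index": len(table) - 1,
                "units": [{k: v for k, v in cu.items() if k != key}
                          for cu in cus],  # attribute stripped from each unit
            })
        else:
            ungrouped.extend(cus)  # units not part of any coherence group
    return table, groups, ungrouped
```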
Abstract:
Embodiments of the present disclosure provide systems and methods for perspective shifting in a video conferencing session. In one exemplary method, a video stream may be generated. A foreground element may be identified in a frame of the video stream and distinguished from a background element of the frame. Data may be received representing a viewing condition at a terminal that will display the generated video stream. The frame of the video stream may be modified based on the received data to shift the foreground element relative to the background element. The modified video stream may be displayed at the displaying terminal.
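A minimal sketch of the shift step, assuming the viewing condition is a horizontal head offset and using a simple linear parallax model; all names and the parallax constant are illustrative.

```python
import numpy as np

def perspective_shift(background, foreground, fg_mask, viewer_dx,
                      parallax=0.2):  # parallax gain: an assumed constant
    """Composite the foreground over the background, translated in
    proportion to the viewer's horizontal offset."""
    shift = int(round(viewer_dx * parallax))
    fg = np.roll(foreground, shift, axis=1)        # shift foreground only
    mask = np.roll(fg_mask, shift, axis=1).astype(bool)
    out = background.copy()
    out[mask] = fg[mask]                           # background stays fixed
    return out
```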