Abstract:
Frame packing techniques are disclosed for multi-directional images and video. According to an embodiment, a multi-directional source image is reformatted into a format in which image data from opposing fields of view are represented in respective regions of the packed image as flat image content. Image data from a multi-directional field of view of the source image between the opposing fields of view are represented in another region of the packed image as equirectangular image content. It is expected that use of the formatted frame will lead to coding efficiencies when the formatted image is processed by predictive video coding techniques and the like.
Abstract:
Techniques are described for implementing format configurations for multi-directional video and for switching between them. Source images may be assigned to formats that may change during a coding session. When a change occurs between formats, video coders and decoders may transform decoded reference frames from the first format to the second format. Thereafter, new frames in the second format may be coded or decoded predictively using the transformed reference frame(s) as source(s) of prediction. In this manner, video coders and decoders may use inter-coding techniques and achieve high efficiency in coding.
Abstract:
Techniques presented herein provide an improved relay user experience and improved management of scarce computing and network resources as the number of relay endpoints increases. A sourcing endpoint device may generate a media feed, such as a video and/or audio feed, representing a contribution from a conference participant. The sourcing endpoint device may generate a priority value for the media feed, and the priority value may be transmitted to other members of the relay along with the media feed. Priority values of the different relay participants may be used by other devices, for example, intermediate servers or receiving endpoint devices, to manage aspects of the relay. For example, a relay server may prune streams from select endpoint devices based on relative priority values received from those devices. Alternatively, receiving endpoint devices may alter presentation of received feeds based on their associated priority values.
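The server-side pruning step can be illustrated with a minimal sketch. The names `Feed`, `prune_feeds`, and `max_forwarded` are hypothetical, and keeping a fixed number of the highest-priority feeds is an assumption; the abstract says only that pruning is based on relative priority values.

```python
from dataclasses import dataclass


@dataclass
class Feed:
    endpoint_id: str  # hypothetical identifier of the sourcing endpoint
    priority: float   # priority value generated by the sourcing endpoint


def prune_feeds(feeds, max_forwarded):
    """Keep the max_forwarded highest-priority feeds and drop the rest.

    Assumption: a larger value means a higher priority.
    """
    ranked = sorted(feeds, key=lambda f: f.priority, reverse=True)
    return ranked[:max_forwarded]


feeds = [Feed("alice", 0.9), Feed("bob", 0.2), Feed("carol", 0.6)]
kept = prune_feeds(feeds, max_forwarded=2)
print([f.endpoint_id for f in kept])  # prints ['alice', 'carol']
```

A receiving endpoint could apply the same ranking to decide presentation, for example by giving the highest-ranked feeds larger display areas.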
Abstract:
A device implementing a system for audio-video conferencing using multiple stream identifiers includes a processor configured to receive, from a sending device, an indication of a first content stream and an associated first stream identifier, and an indication of a second content stream and an associated second stream identifier. The first content stream and the second content stream correspond to different bit rates of streaming content. The processor is configured to receive, from a receiving device, a request to subscribe to the second content stream, the request including the second stream identifier, and to receive, from the sending device, an indication that the second stream identifier has been associated with the first content stream. The processor is configured to forward, to the receiving device, the first content stream based on the request to subscribe to the second content stream and on the indication that the second stream identifier has been associated with the first content stream.
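The identifier re-association described above can be sketched as a small lookup table. The class `StreamRelay` and its method names are hypothetical; this shows only the bookkeeping, under the assumption that the server resolves each subscription through the most recent association it has received.

```python
class StreamRelay:
    """Hypothetical server-side table mapping stream identifiers to content streams."""

    def __init__(self):
        self.stream_for_id = {}  # stream identifier -> content stream name
        self.subscriptions = {}  # receiving device -> subscribed stream identifier

    def announce(self, stream_id, content_stream):
        # The sender indicates, or later re-associates, the stream an identifier names.
        self.stream_for_id[stream_id] = content_stream

    def subscribe(self, receiver, stream_id):
        self.subscriptions[receiver] = stream_id

    def stream_to_forward(self, receiver):
        # Resolve the subscribed identifier through the current association.
        return self.stream_for_id[self.subscriptions[receiver]]


relay = StreamRelay()
relay.announce("id1", "low_bitrate")   # first content stream
relay.announce("id2", "high_bitrate")  # second content stream
relay.subscribe("receiver", "id2")
relay.announce("id2", "low_bitrate")   # sender re-associates id2 with the first stream
print(relay.stream_to_forward("receiver"))  # prints low_bitrate
```

The receiver never changes its subscription; only the association held by the server changes.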
Abstract:
Chroma deblock filtering of reconstructed video samples may be performed to remove blockiness artifacts and reduce color artifacts without over-smoothing. In a first method, chroma deblocking may be performed for boundary samples of a smallest transform size, regardless of partitions and coding modes. In a second method, chroma deblocking may be performed when a boundary strength is greater than 0. In a third method, chroma deblocking may be performed regardless of boundary strengths. In a fourth method, the type of chroma deblocking to be performed may be signaled in a slice header by a flag. Furthermore, luma deblock filtering techniques may be applied to chroma deblock filtering.
Abstract:
Techniques are disclosed for coding video data predictively based on predictions made from spherical-domain projections of input pictures to be coded and reference pictures that are prediction candidates. Spherical projections of an input picture and the candidate reference pictures may be generated. Thereafter, a search may be conducted for a match between the spherical-domain representation of a pixel block to be coded and a spherical-domain representation of the reference picture. On a match, an offset may be determined between the spherical-domain representation of the pixel block and a matching portion of the reference picture in the spherical-domain representation. The spherical-domain offset may be transformed to a motion vector in a source-domain representation of the input picture, and the pixel block may be coded predictively with reference to a source-domain representation of the matching portion of the reference picture.
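The two coordinate conversions involved can be sketched as follows, assuming the source domain is an equirectangular picture. The abstract does not name the source projection, so this choice and the function names are assumptions. The sketch ignores wrap-around at the picture edges and does not perform the search itself.

```python
import math


def pixel_to_sphere(x, y, width, height):
    """Map an equirectangular pixel position to (longitude, latitude) in radians."""
    lon = (x / width) * 2.0 * math.pi - math.pi   # range -pi .. +pi
    lat = math.pi / 2.0 - (y / height) * math.pi  # range +pi/2 (top) .. -pi/2 (bottom)
    return lon, lat


def spherical_offset_to_motion_vector(d_lon, d_lat, width, height):
    """Convert a spherical-domain offset in radians to a source-domain motion vector in pixels."""
    dx = d_lon * width / (2.0 * math.pi)
    dy = -d_lat * height / math.pi  # latitude grows upward, rows grow downward
    return dx, dy


lon, lat = pixel_to_sphere(1920, 960, 3840, 1920)  # picture centre
dx, dy = spherical_offset_to_motion_vector(math.pi / 2.0, 0.0, 3840, 1920)
print(lon, lat, dx, dy)
```

For the equirectangular case the mapping is linear, so a quarter-turn in longitude corresponds to a quarter of the picture width. Other projections would need a position-dependent conversion.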
Abstract:
Techniques for coding video data estimate depths of different elements within video content and identify regions within the video content based on the estimated depths. One of the regions may be assigned as an area of interest. Thereafter, video content of a region that is not an area of interest may be masked out and the resultant video content obtained from the masking may be coded. The coded video content may be transmitted to a channel. These techniques permit a coding terminal to mask out captured video content prior to coding in order to support coding policies that account for privacy interests or video composition features during a video coding session.
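The masking step can be illustrated with a minimal sketch that operates on nested lists. It assumes a per-pixel depth estimate is already available and that the area of interest is a single depth range. The function name, the `near` and `far` bounds, and the `fill` value are hypothetical.

```python
def mask_by_depth(pixels, depths, near, far, fill=0):
    """Keep pixels whose estimated depth lies in [near, far]; replace the rest with fill."""
    return [
        [p if near <= d <= far else fill for p, d in zip(pixel_row, depth_row)]
        for pixel_row, depth_row in zip(pixels, depths)
    ]


pixels = [[10, 20], [30, 40]]
depths = [[1.0, 5.0], [0.8, 6.0]]  # estimated depth per pixel, in metres (assumed unit)
masked = mask_by_depth(pixels, depths, near=0.5, far=2.0)
print(masked)  # prints [[10, 0], [30, 0]]
```

The masked picture, not the captured one, would then be passed to the video coder.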
Abstract:
Techniques are disclosed for managing memory allocations when coding video data according to multiple codec configurations. According to these techniques, devices may negotiate parameters of a coding session that include parameters of a plurality of different codec configurations that may be used during the coding session. A device may estimate sizes of decoded picture buffers for each of the negotiated codec configurations and allocate in its memory a portion of memory sized according to a largest size of the estimated decoded picture buffers. Thereafter, the devices may exchange coded video data. The exchange may involve decoding coded data of reference pictures and storing the decoded reference pictures in the allocated memory. During the coding session, the devices may toggle among the different negotiated codec configurations. As they do, reallocations of memory may be avoided.
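The sizing rule can be sketched as a maximum over per-configuration estimates. The `CodecConfig` fields and the size formula (picture area times bytes per pixel times reference frame count) are simplifying assumptions; a real codec would derive its decoded picture buffer size from its own profile and level limits.

```python
from dataclasses import dataclass


@dataclass
class CodecConfig:
    name: str
    width: int
    height: int
    bytes_per_pixel: float  # e.g. 1.5 for 8-bit 4:2:0 video
    max_ref_frames: int


def dpb_size(cfg):
    """Estimate the decoded picture buffer size in bytes for one configuration."""
    return int(cfg.width * cfg.height * cfg.bytes_per_pixel) * cfg.max_ref_frames


def allocation_size(configs):
    """Size a single allocation to the largest estimated decoded picture buffer."""
    return max(dpb_size(c) for c in configs)


negotiated = [
    CodecConfig("config_a", 1920, 1080, 1.5, 4),
    CodecConfig("config_b", 1280, 720, 1.5, 6),
]
print(allocation_size(negotiated))  # prints 12441600
```

Because the allocation already covers the largest case, toggling between the negotiated configurations needs no reallocation.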
Abstract:
Techniques are disclosed for overcoming communication lag between interactive operations among devices in a streaming session. According to the techniques, a first device streams video content to a second device, and an annotation is entered on a first frame being displayed at the second device and communicated back to the first device. Responsive to a communication that identifies the annotation, the first device may identify an element of video content from the first frame to which the annotation applies and determine whether the identified element is present in a second frame of video content currently displayed at the first device. If so, the first device may display the annotation with the second frame in a location where the identified element is present. If not, the first device may display the annotation via an alternate technique.
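The placement decision at the first device can be sketched as a lookup. This assumes that elements carry stable identifiers across frames and that their positions in the currently displayed frame are known; the function name, the dictionary layout, and the "anchored" and "alternate" labels are hypothetical.

```python
def place_annotation(annotation, element_id, current_frame_elements):
    """Decide how to show an annotation that was made on an earlier frame.

    current_frame_elements maps element identifiers to (x, y) positions
    in the frame currently displayed at the first device.
    """
    if element_id in current_frame_elements:
        # The element is still visible: anchor the annotation at its new location.
        return {"mode": "anchored", "position": current_frame_elements[element_id], "text": annotation}
    # The element is no longer visible: fall back to an alternate presentation.
    return {"mode": "alternate", "position": None, "text": annotation}


current = {"ball": (120, 80)}
print(place_annotation("look here", "ball", current)["mode"])  # prints anchored
print(place_annotation("look here", "bird", current)["mode"])  # prints alternate
```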
Abstract:
Techniques are described for responding to changes in bandwidth that are available to transmit coded video data between an encoder and a decoder. When such changes in bandwidth occur, estimates may be derived of visual significance of coded video data that has not yet been transmitted and also video data that is next to be coded. These estimates may be compared to each other. When the estimated visual significance of the coded video data that has not yet been transmitted is greater than the estimated visual significance of the video data that is next to be coded, transmission of the coded video data that has not yet been transmitted may be prioritized over coding of the video data that is next to be coded. When the estimated visual significance of the video data that is next to be coded is greater than the estimated visual significance of the coded video data that has not yet been transmitted, coding of the video data that is next to be coded may be prioritized over transmission of the coded video data that has not yet been transmitted. Resources may be allocated to the prioritized coder operation.
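The comparison described above can be sketched as a small decision function. How visual significance is estimated is outside this sketch, so the two scores are taken as inputs. The function name and return labels are hypothetical, and the behaviour on a tie is an assumption because the abstract does not address equal estimates.

```python
def choose_operation(untransmitted_significance, next_to_code_significance):
    """Pick which coder operation to prioritize after a bandwidth change."""
    if untransmitted_significance > next_to_code_significance:
        return "transmit"  # send the already-coded data first
    if next_to_code_significance > untransmitted_significance:
        return "code"      # code the next video data first
    return "tie"           # assumption: equal estimates are left to the caller


print(choose_operation(0.8, 0.3))  # prints transmit
print(choose_operation(0.2, 0.7))  # prints code
```

Resources would then be allocated to whichever operation the function returns.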