Abstract:
A video processing method includes: receiving a first input frame with a 360-degree Virtual Reality (360 VR) projection format; applying first content-oriented rotation to the first input frame to generate a first content-rotated frame; encoding the first content-rotated frame to generate a first part of a bitstream, including generating a first reconstructed frame and storing a reference frame derived from the first reconstructed frame; receiving a second input frame with the 360 VR projection format; applying second content-oriented rotation to the second input frame to generate a second content-rotated frame; configuring content re-rotation according to the first content-oriented rotation and the second content-oriented rotation; applying the content re-rotation to the reference frame to generate a re-rotated reference frame; and encoding, by a video encoder, the second content-rotated frame to generate a second part of the bitstream, including using the re-rotated reference frame for predictive coding of the second content-rotated frame.
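As an illustration of how the content re-rotation can be configured from the two content-oriented rotations, the following minimal Python sketch reduces the general 360 VR case to a yaw-only rotation of an equirectangular (ERP) frame, where a rotation becomes a circular horizontal pixel shift; the frame size, angles, and function names are hypothetical and not taken from the disclosure.

```python
# A minimal sketch (not the patent's full method): for an ERP frame, a
# yaw-only content-oriented rotation is a circular horizontal shift, so
# the content re-rotation is simply the relative rotation yaw2 - yaw1.
import numpy as np

def yaw_to_shift(yaw_deg: float, width: int) -> int:
    """Convert a yaw rotation into a horizontal pixel shift for ERP."""
    return int(round(yaw_deg / 360.0 * width)) % width

def apply_yaw_rotation(frame: np.ndarray, yaw_deg: float) -> np.ndarray:
    """Apply a yaw-only content-oriented rotation to an ERP frame."""
    return np.roll(frame, yaw_to_shift(yaw_deg, frame.shape[1]), axis=1)

def re_rotate_reference(ref: np.ndarray, yaw1_deg: float, yaw2_deg: float) -> np.ndarray:
    """Re-rotate a reference frame stored with rotation yaw1 so that it
    matches the orientation of a frame rotated by yaw2."""
    return apply_yaw_rotation(ref, yaw2_deg - yaw1_deg)

# Usage: the reference was content-rotated by 30 degrees, the current
# frame by 90 degrees, so the reference is re-rotated by 60 degrees
# before serving as a prediction reference.
ref = np.zeros((4, 8), dtype=np.uint8)
re_rotated = re_rotate_reference(ref, 30.0, 90.0)
```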
Abstract:
An apparatus for dynamically adjusting video decoding complexity includes a decoding resolution control circuit and an adaptive spatial resolution decoder. The decoding resolution control circuit is arranged to dynamically determine whether at least one portion of multiple frames should be decoded in accordance with a specific resolution differing from an original resolution of the frames. In addition, the adaptive spatial resolution decoder is arranged to decode the frames according to whether the at least one portion of the frames should be decoded in accordance with the specific resolution. In particular, the apparatus further includes a system capability analyzing circuit arranged to analyze the system capability of at least a portion of the apparatus, in order to generate analysis results to be sent to the decoding resolution control circuit. An associated method is also provided.
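As a concrete illustration (not the disclosed circuit design), the following minimal Python sketch shows one way a decoding resolution control could map analysis results from a system capability analyzing circuit to a decoding resolution; the load thresholds, field names, and scale values are hypothetical.

```python
# A minimal sketch, assuming hypothetical names: a reduced decoding
# resolution is selected when the analyzed system capability (here, a
# CPU load estimate and memory headroom) is insufficient.
from dataclasses import dataclass

@dataclass
class SystemCapabilityReport:
    cpu_load: float          # fraction of the CPU budget in use
    memory_headroom_mb: int  # free memory available to the decoder

def select_decoding_scale(report: SystemCapabilityReport) -> float:
    """Return the resolution scale for decoding the next frames:
    1.0 keeps the original resolution, smaller values select a
    specific (reduced) decoding resolution."""
    if report.cpu_load > 0.9 or report.memory_headroom_mb < 64:
        return 0.5   # decode at half resolution to stay real-time
    if report.cpu_load > 0.75:
        return 0.75
    return 1.0       # enough capability: decode at full resolution

# Usage: a heavily loaded system triggers half-resolution decoding.
scale = select_decoding_scale(SystemCapabilityReport(cpu_load=0.95, memory_headroom_mb=32))
assert scale == 0.5
```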
Abstract:
For omnidirectional video such as 360-degree Virtual Reality (360VR) video, a video system that supports independent decoding of different views of the omnidirectional video is provided. A decoder for such a system can extract a specified part of a bitstream to decode a desired perspective/face/view of an omnidirectional image without decoding the entire image, while suffering minimal or no loss in coding efficiency.
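The following minimal Python sketch illustrates the extraction step under the simplifying assumption that the bitstream is organized as independently decodable units tagged with a face identifier; the CodedUnit structure and function names are hypothetical.

```python
# A minimal sketch of extracting the part of a bitstream needed for one
# view/face, assuming (hypothetically) that the bitstream consists of
# independently decodable units tagged with a face identifier.
from typing import List, NamedTuple

class CodedUnit(NamedTuple):
    face_id: int     # which perspective/face/view this unit belongs to
    payload: bytes   # independently decodable coded data

def extract_view(bitstream: List[CodedUnit], wanted_face: int) -> List[CodedUnit]:
    """Keep only the units required to decode the desired face, so the
    decoder never touches the rest of the omnidirectional image."""
    return [u for u in bitstream if u.face_id == wanted_face]

# Usage: decode only face 2 of a six-face (cubemap-style) image.
stream = [CodedUnit(face_id=i % 6, payload=b"") for i in range(12)]
front_only = extract_view(stream, wanted_face=2)
```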
Abstract:
A method and apparatus for three-dimensional video coding or multi-view video coding are disclosed. Embodiments according to the present invention derive a unified disparity vector from depth information for Inter mode and Skip/Direct mode. The unified disparity vector is derived from a subset of depth samples in an associated depth block corresponding to the current block using a unified derivation method. The unified derivation method is applied in Inter mode, Skip mode, or Direct mode when a disparity vector derived from depth data is required for encoding or decoding. The unified disparity vector can also be applied to derive a disparity vector for locating a corresponding block, and thus an inter-view motion vector candidate can be determined for Skip mode or Direct mode.
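The abstract does not specify which subset of depth samples is used, so the following Python sketch assumes a common choice, the four corner samples of the associated depth block, together with a linear depth-to-disparity conversion whose scale and offset stand in for camera parameters.

```python
# A minimal sketch of a unified derivation method that uses a subset of
# depth samples (here the four corners, one plausible subset) of the
# associated depth block; the depth-to-disparity model is hypothetical.
from typing import List

def unified_disparity_vector(depth_block: List[List[int]],
                             scale: int, offset: int, shift: int = 8) -> int:
    """Derive one horizontal disparity from the maximum of the four
    corner depth samples, shared by Inter mode and Skip/Direct mode."""
    h, w = len(depth_block), len(depth_block[0])
    corners = (depth_block[0][0], depth_block[0][w - 1],
               depth_block[h - 1][0], depth_block[h - 1][w - 1])
    d = max(corners)                       # subset reduced to a single value
    return (d * scale + offset) >> shift   # linear depth-to-disparity conversion

# Usage: the same routine serves Inter and Skip/Direct mode, e.g. to
# locate a corresponding block in the reference view.
dv_x = unified_disparity_vector([[10, 20], [30, 40]], scale=512, offset=128)
```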
Abstract:
A method and apparatus for three-dimensional video coding and multi-view video coding are disclosed. Embodiments according to the present invention derive a unified disparity vector (DV) based on neighboring blocks of the current block or depth information associated with the current block and locate a single corresponding block in a reference view according to the unified DV. An inter-view motion vector prediction (MVP) candidate is then derived for both list0 and list1 from the single corresponding block. List0 and list1 MVs of the inter-view MVP candidate are derived from the single corresponding block located according to the unified DV.
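A minimal Python sketch of the candidate derivation follows, assuming blocks are addressed by position and that the located corresponding block exposes its list0/list1 motion vectors; all names and the block-lookup mechanism are hypothetical.

```python
# A minimal sketch: one inter-view MVP candidate for both list0 and
# list1 is taken from the single corresponding block located by the
# unified DV. Block lookup and field names are illustrative only.
from typing import Dict, Optional, Tuple

MV = Tuple[int, int]

def inter_view_mvp(reference_view: Dict[Tuple[int, int], Dict[str, Optional[MV]]],
                   cur_pos: Tuple[int, int], unified_dv: MV) -> Dict[str, Optional[MV]]:
    """Shift the current block position by the unified DV, fetch the one
    corresponding block, and reuse its list0/list1 motion vectors."""
    corr_pos = (cur_pos[0] + unified_dv[0], cur_pos[1] + unified_dv[1])
    corr_block = reference_view.get(corr_pos, {"list0": None, "list1": None})
    return {"list0": corr_block["list0"], "list1": corr_block["list1"]}

# Usage: both lists of the MVP candidate come from the same block.
view0 = {(80, 64): {"list0": (3, -1), "list1": (0, 2)}}
cand = inter_view_mvp(view0, cur_pos=(16, 64), unified_dv=(64, 0))
assert cand == {"list0": (3, -1), "list1": (0, 2)}
```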
Abstract:
A method and apparatus for three-dimensional and scalable video coding are disclosed. Embodiments according to the present invention determine a motion information set associated with the video data, wherein at least part of the motion information set is made available or unavailable conditionally depending on the video data type. The video data type may correspond to depth data, texture data, a view associated with the video data in three-dimensional video coding, or a layer associated with the video data in scalable video coding. The motion information set is then provided for coding or decoding of the video data, other video data, or both. At least a flag may be used to indicate whether part of the motion information set is available or unavailable. Alternatively, a coding profile for the video data may be used to determine whether the motion information is available or not based on the video data type.
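A minimal Python sketch of conditional availability follows, assuming an illustrative rule in which depth data under a restricted coding profile loses part of its motion information; the data types, profile name, and flag are placeholders, not the patent's exact conditions.

```python
# A minimal sketch: part of a motion information set is made available
# or unavailable conditionally on the video data type, with a flag
# recording the outcome. The rule below is an illustrative assumption.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class MotionInfoSet:
    mv: Tuple[int, int]
    ref_idx: int
    inter_dir: int
    available: bool = True   # flag indicating (un)availability

def resolve_availability(info: MotionInfoSet, data_type: str, profile: str) -> MotionInfoSet:
    """Depth data under a restricted profile loses part of its motion
    information; texture data keeps the full set."""
    if data_type == "depth" and profile == "restricted":
        info.available = False   # e.g., do not reuse depth motion data
    return info

# Usage: the same motion info set is unavailable for depth coding under
# the restricted profile, but remains available for texture coding.
info = resolve_availability(MotionInfoSet(mv=(1, 0), ref_idx=0, inter_dir=1),
                            data_type="depth", profile="restricted")
assert not info.available
```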
Abstract:
A method for an improved binarization and entropy coding process for syntax related to depth coding is disclosed. In one embodiment, a first value associated with the current depth block is bypass coded, where the first value corresponds to the residual magnitude of a block coded by an Intra or Inter SDC mode, the delta DC magnitude of a block coded by a DMM mode, or a residual sign of a block coded by the Inter SDC mode. In another embodiment, a first bin of a binary codeword is coded using arithmetic coding and the remaining bins of the binary codeword are coded using bypass coding. The codeword corresponds to the residual magnitude of a block coded by the Intra or Inter SDC mode, or the delta DC magnitude of a block coded by the DMM mode.
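A minimal Python sketch of the second embodiment's bin split follows, assuming a truncated unary binarization of the magnitude: the first bin is routed to the context-based arithmetic coder and all remaining bins to the bypass coder. The routing labels stand in for real CABAC engines.

```python
# A minimal sketch: binarize a magnitude, then code bin 0 with the
# (context-adaptive) arithmetic coder and all remaining bins in bypass
# mode. Returning (bin, mode) pairs only records the routing decision.
from typing import List, Tuple

def binarize_unary(value: int, max_len: int) -> List[int]:
    """Truncated unary binarization: value 3 -> [1, 1, 1, 0]."""
    bins = [1] * value
    if value < max_len:
        bins.append(0)
    return bins

def code_magnitude(value: int, max_len: int = 8) -> List[Tuple[int, str]]:
    """Return (bin, coding mode) pairs: bin 0 context-coded, rest bypass."""
    bins = binarize_unary(value, max_len)
    return [(b, "context" if i == 0 else "bypass") for i, b in enumerate(bins)]

# Usage: a residual magnitude of 3 produces one context-coded bin and
# three bypass-coded bins.
assert code_magnitude(3) == [(1, "context"), (1, "bypass"), (1, "bypass"), (0, "bypass")]
```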
Abstract:
A video processing method includes receiving a reconstructed frame, and applying in-loop filtering, by at least one in-loop filter, to the reconstructed frame. The step of in-loop filtering includes performing a sample adaptive offset (SAO) filtering operation. The step of performing the SAO filtering operation includes keeping a value of a current pixel unchanged by blocking the SAO filtering operation of the current pixel included in the reconstructed frame from being applied across a virtual boundary defined in the reconstructed frame.
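A minimal Python sketch of the blocking behavior follows, assuming a vertical virtual boundary at a hypothetical column and a horizontal-class SAO edge offset; the offset values and category handling are simplified relative to a real SAO filter.

```python
# A minimal sketch: horizontal-class SAO edge offset (neighbors at x-1
# and x+1) that keeps a pixel unchanged whenever its filter support
# would cross the virtual boundary. Column position is hypothetical.
import numpy as np

def sao_horizontal_edge_offset(frame: np.ndarray, offsets: dict, vb_col: int) -> np.ndarray:
    """Apply SAO edge offsets, blocked from being applied across the
    virtual boundary located just before column vb_col."""
    out = frame.copy()
    h, w = frame.shape
    for y in range(h):
        for x in range(1, w - 1):
            # Blocking rule: the (x-1, x, x+1) support must not straddle
            # the boundary between columns vb_col-1 and vb_col.
            if x - 1 < vb_col <= x + 1:
                continue                       # current pixel value kept unchanged
            left, cur, right = int(frame[y, x - 1]), int(frame[y, x]), int(frame[y, x + 1])
            if cur < left and cur < right:     # local minimum -> category 1
                out[y, x] = np.clip(cur + offsets.get(1, 0), 0, 255)
            elif cur > left and cur > right:   # local maximum -> category 4
                out[y, x] = np.clip(cur + offsets.get(4, 0), 0, 255)
    return out

# Usage: pixels whose filter support crosses column 4 are left untouched.
rec = np.random.randint(0, 256, size=(8, 8), dtype=np.uint8)
filtered = sao_horizontal_edge_offset(rec, offsets={1: 2, 4: -2}, vb_col=4)
```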
Abstract:
A video decoding method includes decoding a part of a bitstream to generate a decoded frame. The decoded frame is a projection-based frame that comprises at least one projection face and at least one guard band packed in a projection layout. At least a portion of a 360-degree content of a sphere is mapped to the at least one projection face via projection. The decoded frame is in a 4:2:0 chroma format or a 4:2:2 chroma format, and a guard band size of each of the at least one guard band is equal to an even number of luma samples.
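A minimal Python sketch of checking this conformance rule follows; the motivation is that 4:2:0 and 4:2:2 both subsample chroma horizontally by a factor of two, so an even guard band size in luma samples maps to a whole number of chroma samples. Function and parameter names are hypothetical.

```python
# A minimal sketch: each guard band size, expressed in luma samples,
# must be even for the 4:2:0 and 4:2:2 chroma formats.
from typing import List

def check_guard_band_sizes(guard_band_sizes_luma: List[int], chroma_format: str) -> None:
    if chroma_format not in ("4:2:0", "4:2:2"):
        return  # no constraint enforced here for other formats
    for size in guard_band_sizes_luma:
        if size % 2 != 0:
            raise ValueError(f"guard band of {size} luma samples is not even")
        # an even luma size corresponds to exactly size // 2 chroma samples

# Usage: guard bands of 8 and 4 luma samples conform; 5 would raise.
check_guard_band_sizes([8, 4], "4:2:0")
```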
Abstract:
A video processing method includes a step of receiving a bitstream, and a step of decoding a part of the bitstream to generate a decoded frame, including parsing a plurality of syntax elements from the bitstream. The decoded frame is a projection-based frame that includes a plurality of projection faces packed at a plurality of face positions with different position indexes in a hemisphere cubemap projection layout. A portion of a 360-degree content of a sphere is mapped to the plurality of projection faces via hemisphere cubemap projection. Values of the plurality of syntax elements are indicative of face indexes of the plurality of projection faces packed at the plurality of face positions, respectively, and are constrained to meet a requirement of bitstream conformance.
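A minimal Python sketch of a bitstream-conformance check over the parsed face-index syntax elements follows; since the abstract does not spell out the exact requirement, the sketch assumes one plausible reading: each signaled face index is in range and the indexes are pairwise distinct across positions.

```python
# A minimal sketch: validate the face indexes parsed for the face
# positions of a hemisphere cubemap layout. The exact rule of bitstream
# conformance is not given in the abstract; this check is illustrative.
from typing import List

def check_face_index_conformance(face_indexes: List[int], num_faces: int) -> None:
    """face_indexes[k] is the face index signaled for position k."""
    if len(face_indexes) != num_faces:
        raise ValueError("one face index must be signaled per face position")
    if any(not 0 <= f < num_faces for f in face_indexes):
        raise ValueError("face index out of range")
    if len(set(face_indexes)) != num_faces:
        raise ValueError("face indexes at different positions must differ")

# Usage: a hypothetical hemisphere cubemap layout with five packed
# face positions (one full face plus four half faces).
check_face_index_conformance([0, 2, 1, 4, 3], num_faces=5)
```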