Abstract:
Method and apparatus for deriving a motion vector at a video decoder. A block-based motion vector may be produced at the video decoder by utilizing motion estimation among available pixels relative to blocks in one or more reference frames. The available pixels could be, for example, spatially neighboring blocks in the sequential scan coding order of a current frame, blocks in a previously decoded frame, or blocks in a downsampled frame in a lower pyramid when layered coding has been used.
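As a rough Python sketch of this idea (the function name, the SAD cost, and the full-search strategy are illustrative assumptions, not the claimed method), a decoder could match a patch of available, already-decoded pixels against a reference frame to derive a motion vector:

import numpy as np

def derive_mv(template, ref_frame, center, search_range=8):
    # Hypothetical decoder-side search: find the displacement whose
    # reference patch best matches a template of already-decoded pixels.
    th, tw = template.shape
    cy, cx = center
    best_sad, best_mv = np.inf, (0, 0)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = cy + dy, cx + dx
            if y < 0 or x < 0 or y + th > ref_frame.shape[0] or x + tw > ref_frame.shape[1]:
                continue
            sad = np.abs(ref_frame[y:y+th, x:x+tw].astype(np.int32)
                         - template.astype(np.int32)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv

Because the search uses only pixels both the encoder and the decoder already have, the resulting motion vector need not be transmitted.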
Abstract:
A video encoder may use an adaptive Wiener filter inside the core video encoding loop to improve coding efficiency. In one embodiment, the Wiener filter may be on the input to a motion estimation unit and, in another embodiment, it may be on the output of a motion compensation unit. The taps for the Wiener filter may be determined based on characteristics of at least a region of pixel intensities within a picture. Thus, the filtering may be adaptive in that it varies based on the type of video being processed.
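For intuition only, the following sketch derives Wiener filter taps from signal statistics by solving the normal equations; the 1-D formulation and all names are assumptions, and an actual encoder would estimate these statistics over a region of pixel intensities and adapt the taps accordingly:

import numpy as np

def wiener_taps(degraded, target, num_taps=5):
    # Illustrative 1-D Wiener derivation: solve R a = p, where R is the
    # autocorrelation of the degraded signal and p its cross-correlation
    # with the target (original) signal.
    x = degraded.astype(np.float64).ravel()
    d = target.astype(np.float64).ravel()
    r = np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(num_taps)])
    R = np.array([[r[abs(i - j)] for j in range(num_taps)] for i in range(num_taps)])
    p = np.array([np.dot(d[k:], x[:len(x) - k]) for k in range(num_taps)])
    return np.linalg.solve(R, p)  # least-squares-optimal taps

Because R and p depend on the statistics of the content being coded, the taps change with the type of video being processed, which is what makes the filtering adaptive.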
Abstract:
Reconstructed picture quality for a video codec system may be improved by categorizing reconstructed pixels into different histogram bins with histogram segmentation and then applying different filters on different bins. Histogram segmentation may be performed by evenly dividing the histogram into M bins or adaptively dividing it into N bins based on the histogram characteristics. Here, M and N may each be a predefined, fixed, non-negative integer or a value generated adaptively at the encoder side and signaled to the decoder in the coded bitstream.
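A minimal sketch of the evenly divided variant, assuming for illustration that each bin's filter reduces to a signaled per-bin offset (the names and the offset form are my assumptions):

import numpy as np

def filter_by_histogram_bins(recon, per_bin_offsets, num_bins=4):
    # Illustrative per-bin filtering: split the 8-bit intensity range evenly
    # into num_bins bins, then apply a bin-specific correction to the
    # pixels falling in each bin.
    out = recon.astype(np.int32)
    edges = np.linspace(0, 256, num_bins + 1)   # uniform split of the histogram
    bins = np.digitize(recon, edges[1:-1])      # bin index for every pixel
    for b in range(num_bins):
        out[bins == b] += per_bin_offsets[b]    # the "different filter" per bin
    return np.clip(out, 0, 255).astype(np.uint8)

The adaptive variant would instead place the bin edges according to the measured histogram, e.g. so that each bin holds a similar number of pixels.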
Abstract:
A three-dimensional (3D) video codec encodes multiple views of a 3D video, each including texture and depth components. The encoders of the codec encode video blocks of their respective views based on a set of prediction parameters, such as quad-tree split flags, prediction modes, partition sizes, motion fields, inter directions, reference indices, luma intra modes, and chroma intra modes. The prediction parameters may be inherited across different views and between the texture and depth components.
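To make the inheritance concrete, here is a hypothetical Python container for the listed parameters and a helper that copies them from a co-located block, overriding only what differs; all names are illustrative:

from dataclasses import dataclass, replace
from typing import Optional, Tuple

@dataclass
class PredictionParams:
    # Illustrative bundle of the prediction parameters named above.
    split_flags: Tuple[int, ...]
    pred_mode: str                       # e.g. "intra" or "inter"
    partition_size: str
    motion_field: Optional[Tuple[int, int]]
    inter_dir: Optional[int]
    ref_idx: Optional[int]
    luma_intra_mode: Optional[int]
    chroma_intra_mode: Optional[int]

def inherit_params(base: PredictionParams, **overrides) -> PredictionParams:
    # Hypothetical inheritance rule: reuse the co-located block's
    # parameters in another view or component, overriding only what differs.
    return replace(base, **overrides)

# e.g., a depth block in a dependent view reusing a texture block's parameters:
# depth_params = inherit_params(texture_params, luma_intra_mode=None)

Inheriting rather than re-signaling these parameters is what saves bits when the views and components are strongly correlated.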
Abstract:
Video compression encoding includes intra and inter prediction to reduce spatial and temporal redundancies in video. Prediction results, or residuals, represent the differences between original video pixel values and predicted pixel values. The prediction residuals may be transformed into frequency-domain coefficients, referred to as transform coefficients, which may then be quantized and entropy encoded. The transform coefficients can be sub-sampled prior to quantization to reduce their number; for example, sub-sampling may discard more high-frequency components than low-frequency components. Sub-sampling therefore reduces the number of transform coefficients that must be quantized, reduces quantization complexity, and correspondingly increases encoding throughput.
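A toy example of the sub-sampling step, assuming a 2-D DCT and a rule that keeps only a low-frequency corner of the coefficient block (the function and parameter names are assumptions):

import numpy as np
from scipy.fftpack import dct

def subsample_coeffs(block, keep=4):
    # Illustrative frequency-domain sub-sampling: transform an NxN residual
    # block, then retain only the keep x keep low-frequency corner,
    # discarding proportionally more high-frequency coefficients.
    coeffs = dct(dct(block.astype(np.float64).T, norm='ortho').T, norm='ortho')
    return coeffs[:keep, :keep]  # fewer coefficients to quantize and encode

For an 8x8 block with keep=4, only 16 of 64 coefficients reach the quantizer, a 4x reduction in the values that must be quantized and entropy coded.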
Abstract:
An example apparatus for enhancing video includes a decoder to decode a received 360-degree projection format video bitstream to generate a decoded 360-degree projection format video. The apparatus also includes a viewport generator to generate a viewport from the decoded 360-degree projection format video. The apparatus further includes a convolutional neural network (CNN)-based filter to remove an artifact from the viewport to generate an enhanced image. The apparatus further includes a displayer to send the enhanced image to a display.
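As a sketch of what such a CNN-based filter might look like (the layer count and widths are assumptions; PyTorch is used here for concreteness):

import torch
import torch.nn as nn

class ViewportFilter(nn.Module):
    # Hypothetical artifact-removal CNN: predicts a residual correction
    # for a decoded viewport image.
    def __init__(self, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, viewport):               # (N, 3, H, W), values in [0, 1]
        return viewport + self.body(viewport)  # residual correction

Applying the filter to the rendered viewport, rather than to the full projection-format frame, concentrates the enhancement on the pixels the viewer actually sees.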
Abstract:
Disclosed examples populate a supplemental enhancement information (SEI) message with a value corresponding to a number of subpictures of a video sequence; populate a level indicator in the SEI message; populate a subpicture identifier; populate the subpicture identifier in a slice header; and cause the SEI message and the slice header to be included in a video bitstream.
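The signaling flow could be sketched as follows; the field names and structures are illustrative stand-ins, not the normative SEI or slice-header syntax:

from dataclasses import dataclass, field
from typing import List

@dataclass
class SubpicLevelInfoSEI:
    # Illustrative stand-in for the SEI payload described above.
    num_subpics: int                          # number of subpictures
    level_idc: int                            # level indicator
    subpic_ids: List[int] = field(default_factory=list)

def build_signaling(sei: SubpicLevelInfoSEI, slice_subpic_id: int):
    # Sketch of the flow: the SEI carries the per-sequence subpicture info,
    # while each slice header repeats the identifier of its own subpicture.
    slice_header = {"subpic_id": slice_subpic_id}
    return sei, slice_header  # both would be serialized into the bitstream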
Abstract:
Embodiments described herein are generally directed to improvements addressing power, latency, bandwidth, and/or performance issues relating to GPU processing/caching. According to one embodiment, a system includes a producer intellectual property (IP) block (e.g., a media IP), a compute core (e.g., a GPU or an AI-specific core of the GPU), and a streaming buffer logically interposed between the producer IP and the compute core. The producer IP is operable to consume data from memory and output results to the streaming buffer. The compute core is operable to perform AI inference processing based on data consumed from the streaming buffer and to output the AI inference processing results to the memory.
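A minimal software analogy of the streaming-buffer arrangement, modeling it as a bounded queue between a producer thread and a consumer (the queue-based model and all names are assumptions):

import queue
import threading

stream_buf = queue.Queue(maxsize=8)  # the "streaming buffer" between the IPs

def infer(x):
    return x  # placeholder for the AI inference kernel

def producer_ip(frames):
    # Producer (e.g., media) IP: consumes data from memory and pushes
    # results into the streaming buffer instead of writing them back out.
    for f in frames:
        stream_buf.put(f)      # blocks when the buffer is full (backpressure)
    stream_buf.put(None)       # end-of-stream marker

def compute_core(results):
    # Compute core: consumes directly from the streaming buffer and writes
    # only the final inference results to memory.
    while (item := stream_buf.get()) is not None:
        results.append(infer(item))

results = []
t = threading.Thread(target=compute_core, args=(results,))
t.start()
producer_ip(range(4))
t.join()

The point of the buffer is that intermediate data never makes a round trip through memory, which is where the power, latency, and bandwidth savings come from.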