Abstract:
Techniques are described for stereoscopic view generation. A graphics processing unit (GPU) may combine attribute information for two or more corresponding vertices of corresponding primitives in different views. The GPU may process the combined attributed information to generate graphics data for the stereoscopic view.
Abstract:
In an example, a method for rendering graphics data includes rendering pixels of a first bin of a plurality of bins, wherein the pixels of the first bin are associated with a first portion of an image, and rendering, to the first bin, one or more pixels that are located outside the first portion of the image and associated with a second, different bin of the plurality of bins. The method also includes rendering the one or more pixels associated with the second bin to the second bin, such that the one or more pixels are rendered to both the first bin and the second bin.
Abstract:
The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may receive a plurality of workloads based on a workload order, each of the plurality of workloads being received in the workload order including at least a first workload and a second workload. The apparatus may also allocate one or more workloads of the plurality of workloads to one or more wave slots. Additionally, the apparatus may execute the one or more allocated workloads at the one or more wave slots, such that at least the first workload is executed at the first wave slot and the second workload is executed at the second wave slot. The apparatus may also allocate at least one other workload of the plurality of workloads to at least one previously-allocated wave slot of the one or more wave slots.
Abstract:
The present disclosure relates to methods and devices for graphics processing including an apparatus, e.g., a GPU. The apparatus may generate a table including a plurality of entries to store data associated with at least one of a constant value or an immediate value. The apparatus may also process, upon generating the table, first data including at least one of a constant value or an immediate value. Further, the apparatus may store, in the generated table, at least one of the constant value or the immediate value of the first data. The apparatus may also transmit, upon storing at least one of the constant value or the immediate value in the table, the table including the stored at least one of the constant value or the immediate value of the first data.
Abstract:
A graphics processing unit (GPU) utilizes block general purpose registers (bGPRs) to load multiple waves of samples for an instruction group into a processing pipeline and receive processed samples from the pipeline. The GPU acquires a credit for the bGPR for execution of the instruction group for a first wave using a persistent GPR and the bGPR. The GPU refunds the credit upon loading the first wave into the pipeline. The GPU executes a subsequent wave for the instruction group to load samples to the pipeline when at least one credit is available and the pipeline is processing the first wave. The GPU stores an indication of each wave that has been loaded into the pipeline in a queue. The GPU returns samples for a next wave in the queue from the pipeline to the bGPR for further processing when the physical slot of the bGPR is available.
Abstract:
This disclosure describes techniques for executing shader programs in a graphics processing unit (GPU). In some examples, the techniques for executing shader programs may include executing, with a shader unit of a graphics processor, a shader program that performs vertex shader processing and that generates multiple output vertices for each input vertex that is received by the shader program. In further examples, the techniques for executing shader programs may include executing a merged vertex/geometry shader program using a non-replicated mode of execution. The non-replicated mode of execution may involve assigning each of a plurality of primitives to one merged vertex/geometry shader program instance per primitive and causing each of the instances to output a plurality of vertices. In additional examples, the techniques for executing shader programs may include techniques for selecting one of a non-replicated mode and a replicated mode for executing a merged vertex/geometry shader program.
Abstract:
The present disclosure relates to methods and apparatus for graphics processing, e.g., a GPU. The apparatus may receive an image including a plurality of pixels associated with one or more workgroups and one or more pixel tiles, each of the workgroups and the pixel tiles including one or more pixels of the plurality of pixels. The apparatus may determine whether the one or more workgroups are misaligned with the one or more pixel tiles. The apparatus may determine a conversion order of the one or more workgroups when the one or more workgroups are misaligned with the one or more pixel tiles, the conversion order corresponding to a common multiple of one of the one or more workgroups and one of the one or more pixel tiles. The apparatus may convert each of the one or more workgroups based on the conversion order of the one or more workgroups.
Abstract:
This disclosure provides systems, devices, apparatus, and methods, including computer programs encoded on storage media, for fast incremental shared constants. In aspects, a CPU may determine/update shared constant data for a first draw call of a plurality of draw calls. The shared constant data, which may correspond to at least one shader, may be updated based on a draw call update for the first draw call. The CPU may communicate the updated shared constant data for the first draw call to a GPU. The GPU may receive, in at least one register, the updated shared constant data from the CPU and configure the at least one register based on the updated shared constant data corresponding to the draw call update of the first draw call of the plurality of draw calls.
Abstract:
Techniques are described with respect to preemption in which a graphics processing unit (GPU) may execute a first set of commands in response to receiving a draw call, the draw call defining a plurality of primitives that are to be rendered by the first set of commands, receive a preemption notification during execution of the first set of commands, and preempt the execution of the first set of commands, prior to completing the execution of the first set of commands to render the plurality of primitives of the draw call, for executing a second set of commands.
Abstract:
In an example, a method for rendering graphics data includes rendering pixels of a first bin of a plurality of bins, wherein the pixels of the first bin are associated with a first portion of an image, and rendering, to the first bin, one or more pixels that are located outside the first portion of the image and associated with a second, different bin of the plurality of bins. The method also includes rendering the one or more pixels associated with the second bin to the second bin, such that the one or more pixels are rendered to both the first bin and the second bin.