Abstract:
The introduction of an “out-of-memory” marker in the sorted tile geometry sequence for a tile may aid in handling out-of-memory frames. This marker allows hardware to continue rendering using the original data stream instead of the sorted data stream. This enables use of the original data stream allows the system to continue rendering without requiring any driver intervention. During the visibility generation/sorting phase, the number of memory pages required for storing the data for a rendering pass is continuously tracked. This tracking includes tracking the pages that are required even if the hardware had not run out-of-memory. This information can be monitored by a graphics driver and the driver can provide more memory pages for the system to work at full efficiency.
Abstract:
An apparatus and method to execute ray tracing instructions. For example, one embodiment of an apparatus comprises execution circuitry to execute a dequantize instruction to convert a plurality of quantized data values to a plurality of dequantized data values, the dequantize instruction including a first source operand to identify a plurality of packed quantized data values in a source register and a destination operand to identify a destination register in which to store a plurality of packed dequantized data values, wherein the execution circuitry is to convert each packed quantized data value in the source register to a floating point value, to multiply the floating point value by a first value to generate a first product and to add the first product to a second value to generate a dequantized data value, and to store the dequantized data value in a packed data element location in the destination register.
Abstract:
Apparatus and method for acceleration data structure refit. For example, one embodiment of an apparatus comprises: a ray generator to generate a plurality of rays in a first graphics scene; a hierarchical acceleration data structure generator to construct an acceleration data structure comprising a plurality of hierarchically arranged nodes including inner nodes and leaf nodes stored in a memory in a depth-first search (DFS) order; traversal hardware logic to traverse one or more of the rays through the acceleration data structure; intersection hardware logic to determine intersections between the one or more rays and one or more primitives within the hierarchical acceleration data structure; a node refit unit comprising circuitry and/or logic to read consecutively through at least the inner nodes in the memory in reverse DFS order to perform a bottom-up refit operation on the hierarchical acceleration data structure.
Abstract:
Examples described herein relate to a graphics processing apparatus that includes a memory device; and a central processing unit (CPU). In some examples, the CPU is configured to: execute a producer to issue graphics command application program interfaces (APIs); execute a driver to translate graphics command APIs into executable instructions; and based on an idle state of the producer, execute a command translation code segment of the producer to translate graphics command APIs into executable instructions. In some examples, the execution unit is coupled to the memory device, the execution unit to execute one or more of the executable instructions. In some examples, the producer includes multiple portions such as application code, graphics pipeline runtime code, and command translation code segment.
Abstract:
Embodiments described herein are generally directed to a local cache structure within a shared function of a 3D pipeline that facilitates efficient caching of resource state. In an example, the cache structure is maintained within a sub-core of a GPU. The local cache structure includes (i) an SC having entries each containing a state of a binded resource, and (ii) a DSAT having entries each containing an index into the SC. The DSAT is tagged by SBTO values representing addresses of entries of a binding table. A request, including information indicative of an SBTO pointing to an entry within the binding table, is received for a state of a particular binded resource being accessed by a shared function of the 3D pipeline. Based on the SBTO and during a single access to the cache structure, a determination is made regarding whether the state of the particular binded resource is present.
Abstract:
The systems, apparatuses and methods may provide a way to adaptively process and aggressively cull geometry data. Systems, apparatuses and methods may provide for processing, by a positional only shading pipeline (POSH), geometry data including surface triangles for a digital representation of a scene. More particularly, systems, apparatuses and methods may provide a way to identify surface triangles in one or more exclusion zones and non-exclusion zones, and cull surface triangles in one or more exclusion zones.
Abstract:
An apparatus to facilitate an update of shader data constants. The apparatus includes one or more processors to detect a change to one or more data constants in a shader program, generate a micro-code block including updated constants data during execution of the shader program and transmit the micro-code block to the shader program.
Abstract:
Systems, apparatuses and methods may provide for technology that determines a frame rate of video content, sets a blend amount parameter based on the frame rate, and temporally anti-aliases the video content based on the blend amount parameter. Additionally, the technology may detect a coarse pixel (CP) shading condition with respect to one or more frames in the video content and select, in response to the CP shading condition, a per frame jitter pattern that jitters across pixels, wherein the video content is temporally anti-aliased based on the per frame jitter pattern. The CP shading condition may also cause the technology to apply a gradient to a plurality of color planes on a per color plane basis and discard pixel level samples associated with a CP if all mip data corresponding to the CP is transparent or shadowed out.
Abstract:
An embodiment of a graphics command coordinator apparatus may include a commonality identifier to identify a commonality between a first graphics command corresponding to a first frame and a second graphics command corresponding to a second frame, a commonality analyzer communicatively coupled to the commonality identifier to determine if the first graphics command and the second graphics command can be processed together based on the commonality identified by the commonality identifier, and a commonality indicator communicatively coupled to the commonality analyzer to provide an indication that the first graphics command and the second graphics command are to be processed together. Other embodiments are disclosed and claimed.
Abstract:
Systems, apparatuses and methods may provide for technology that activates a first context on a graphics processor and detects a context switch condition with respect to the first context. Additionally, a second context may be activated, in response to the context switch condition, on the graphics processor while the first context is active on the graphics processor. In one example, activating the second context includes adding a group identifier to a plurality of threads corresponding to the second context and launching the plurality of threads with the group identifier on the graphics processor.