Cluster of scalar engines to accelerate intersection in leaf node

    公开(公告)号:US11263799B2

    公开(公告)日:2022-03-01

    申请号:US16910434

    申请日:2020-06-24

    Abstract: Cluster of acceleration engines to accelerate intersections. For example, one embodiment of an apparatus comprises: a set of graphics cores to execute a first set of instructions of a primary graphics thread; a scalar cluster comprising a plurality of scalar execution engines; and a communication fabric interconnecting the set of graphics cores and the scalar cluster; the set of graphics cores to offload execution of a second set of instructions associated with ray traversal and/or intersection operations to the scalar cluster; the scalar cluster comprising a plurality of local memories, each local memory associated with one of the scalar execution engines, wherein each local memory is to store a portion of a hierarchical acceleration data structure required by an associated scalar execution engine to execute one or more of the second set of instructions; the plurality of scalar execution engines to store results of the execution of the second set of instructions in a memory accessible by the set of graphics cores; wherein the set of graphics cores are to process the results within the primary graphics thread.

    Mutli-frame renderer
    46.
    发明授权

    公开(公告)号:US11132759B2

    公开(公告)日:2021-09-28

    申请号:US16237987

    申请日:2019-01-02

    Abstract: An embodiment of a graphics command coordinator apparatus may include a commonality identifier to identify a commonality between a first graphics command corresponding to a first frame and a second graphics command corresponding to a second frame, a commonality analyzer communicatively coupled to the commonality identifier to determine if the first graphics command and the second graphics command can be processed together based on the commonality identified by the commonality identifier, and a commonality indicator communicatively coupled to the commonality analyzer to provide an indication that the first graphics command and the second graphics command are to be processed together. Other embodiments are disclosed and claimed.

    APPARATUS AND METHOD FOR THROTTLING A RAY TRACING PIPELINE

    公开(公告)号:US20210295583A1

    公开(公告)日:2021-09-23

    申请号:US16820483

    申请日:2020-03-16

    Abstract: Apparatus and method for stack throttling. For example, one embodiment of an apparatus comprises: execution circuitry comprising a plurality of functional units to execute a plurality of ray shaders and generate a plurality of primary rays and a corresponding plurality of ray messages; a first in first out (FIFO) buffer to queue the ray messages generated by the EUs; a cache to store one or more of the plurality of primary rays; a memory-backed stack to store a first subset of the plurality of ray messages in a corresponding plurality of entries; memory-backed stack management circuitry to either store a second subset of the plurality of ray messages to the memory-backed stack, or to temporarily store the one or more the second subset of the plurality of ray messages to a memory subsystem based, at least in part, on a number of entries currently occupied by ray messages in the memory-backed stack; and ray traversal circuitry to read a next ray message from the memory-backed stack, retrieve a next primary ray identified by the ray message from the cache or a memory subsystem, and perform traversal operations on the next primary ray.

    Apparatus and method for asynchronous ray tracing

    公开(公告)号:US11087522B1

    公开(公告)日:2021-08-10

    申请号:US16819121

    申请日:2020-03-15

    Abstract: Apparatus and method for asynchronous ray tracing. For example, one embodiment of a processor comprises: a bounding volume hierarchy (BVH) generator to construct a BVH comprising a plurality of hierarchically arranged nodes including a root node, a plurality of internal nodes, and a plurality of leaf nodes comprising primitives, wherein each internal node comprises a child node to either the root node or another internal node and each leaf node comprises a child node to an internal node; a first storage bank to be arranged as a first plurality of entries; a second storage bank to be arranged as a second plurality of entries, wherein each entry of the first plurality of entries and the second plurality of entries is to store a ray to be traversed through the BVH; an allocator circuit to distribute an incoming ray to either the first storage bank or the second storage bank based on a relative numbers of rays currently stored in the first and second storage banks; and traversal circuitry to alternate between selecting a next ray from the first storage bank and the second storage bank, the traversal circuitry to traverse the next ray through the BVH by reading a next BVH node from a top of a BVH node stack and determining whether the next ray intersects the next BVH node.

Patent Agency Ranking