CLUSTER OF SCALAR ENGINES TO ACCELERATE INTERSECTION IN LEAF NODE

    Publication No.: US20220254090A1

    Publication date: 2022-08-11

    Application No.: US17677118

    Filing date: 2022-02-22

    Abstract: Cluster of acceleration engines to accelerate intersections. For example, one embodiment of an apparatus comprises: a set of graphics cores to execute a first set of instructions of a primary graphics thread; a scalar cluster comprising a plurality of scalar execution engines; and a communication fabric interconnecting the set of graphics cores and the scalar cluster; the set of graphics cores to offload execution of a second set of instructions associated with ray traversal and/or intersection operations to the scalar cluster; the scalar cluster comprising a plurality of local memories, each local memory associated with one of the scalar execution engines, wherein each local memory is to store a portion of a hierarchical acceleration data structure required by an associated scalar execution engine to execute one or more of the second set of instructions; the plurality of scalar execution engines to store results of the execution of the second set of instructions in a memory accessible by the set of graphics cores; wherein the set of graphics cores are to process the results within the primary graphics thread.

    METHOD AND APPARATUS FOR PARALLEL PIXEL SHADING
    Document type: invention application (legal status: in force)

    Publication No.: US20150348222A1

    Publication date: 2015-12-03

    Application No.: US14292064

    Filing date: 2014-05-30

    CPC classification number: G06T1/20

    Abstract: An apparatus and method for identifying sub-groups of execution resources for parallel pixel processing. For example, one embodiment of a method comprises: determining X and Y coordinates for a pixel block to be processed; performing a first set of one or more modulus operations using even bits from the X and Y coordinates to generate a first intermediate result; performing a second set of one or more modulus operations using odd bits from the X and Y coordinates to generate a second intermediate result; comparing the first intermediate result and the second intermediate result to generate a final result; and using the final result to select a first set of processing resources from a set of N processing resources for processing the pixel block.
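The abstract does not say which modulus is used or how the two intermediate results are combined, so the following is only a sketch under stated assumptions: N = 3 processing resources, the pixel block is mapped to resource (X + Y) mod 3, and the "comparison" step is modeled as a subtraction. It relies on the fact that 2^i mod 3 is 1 for even i and -1 for odd i, so a value mod 3 equals (count of set even bits minus count of set odd bits) mod 3. The function names and the 16-bit coordinate width are hypothetical.

```python
def popcount_bits(value: int, start: int, width: int = 16) -> int:
    """Count set bits of `value` at positions start, start+2, ... below `width`."""
    return sum((value >> i) & 1 for i in range(start, width, 2))


def select_resource_mod3(x: int, y: int, width: int = 16) -> int:
    """Pick one of 3 resources for block (x, y); assumes the mapping (x + y) mod 3."""
    # First intermediate result: even-position bits of X and Y, reduced mod 3.
    even = (popcount_bits(x, 0, width) + popcount_bits(y, 0, width)) % 3
    # Second intermediate result: odd-position bits of X and Y, reduced mod 3.
    odd = (popcount_bits(x, 1, width) + popcount_bits(y, 1, width)) % 3
    # Combine the two results (assumption: a subtraction, since 2**i % 3 is
    # 1 for even i and 2, i.e. -1, for odd i).
    return (even - odd) % 3
```

The point of the split is that neither intermediate result needs a general divider: each is a small bit count reduced mod 3, which is cheap to build in hardware.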

    APPARATUS AND METHOD FOR PERFORMING A STABLE AND SHORT LATENCY SORTING OPERATION

    Publication No.: US20240265487A1

    Publication date: 2024-08-08

    Application No.: US18433823

    Filing date: 2024-02-06

    Abstract: Apparatus and method for stable and short latency sorting. For example, one embodiment of a processor comprises: an input circuit to receive a set of N input values to be sorted into a sorted order; comparison circuitry to compare each input value with all other input values in parallel to generate at least N*(N−1)/2 comparison result values; matrix generation circuitry and/or logic to generate a result matrix having a row associated with each input value, a plurality of bits in each row comprising comparison result values indicating results of comparisons with other input values, wherein a first region of the result matrix is to store a first set of bits comprising the N*(N−1)/2 comparison result values and a second region of the result matrix, opposite the first region, is to store a second set of bits comprising an inverse of the N*(N−1)/2 comparison result values; a parallel adder circuit to perform parallel additions of the bits in each row to generate N unique result values; and sorting circuitry to index into the N unique result values to return the sorted order.
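The scheme in this abstract resembles a rank (counting) sort driven by a comparison matrix. The software sketch below is an illustration, not the claimed circuit: the loops stand in for comparisons and additions that the hardware performs in parallel, the function name `rank_sort` and its `key` parameter are hypothetical, and the tie-breaking convention (earlier input wins) is an assumption chosen to make the sort stable.

```python
def rank_sort(values, key=lambda v: v):
    """Stable sort via an N x N comparison matrix and per-row sums."""
    n = len(values)
    keys = [key(v) for v in values]
    matrix = [[0] * n for _ in range(n)]
    # First region (upper triangle): the N*(N-1)/2 pairwise comparisons.
    # A 1 at [i][j] means element j must precede element i.
    for i in range(n):
        for j in range(i + 1, n):
            bit = 1 if keys[i] > keys[j] else 0
            matrix[i][j] = bit
            # Second region (lower triangle): the inverse bit. On a tie the
            # later element j gets the 1, so equal keys keep input order.
            matrix[j][i] = 1 - bit
    # Row sums count how many elements precede each input: N unique ranks.
    ranks = [sum(row) for row in matrix]
    # Index by rank to produce the sorted order.
    out = [None] * n
    for i, rank in enumerate(ranks):
        out[rank] = values[i]
    return out, ranks
```

For example, `rank_sort([3, 1, 2, 1])` yields the ranks `[3, 0, 2, 1]`: the first `1` gets rank 0 and the second `1` gets rank 1, which is what makes the result stable.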

    DISTRIBUTED COMPRESSION/DECOMPRESSION SYSTEM
    Document type: invention publication

    Publication No.: US20230205704A1

    Publication date: 2023-06-29

    Application No.: US17561652

    Filing date: 2021-12-23

    CPC classification number: G06F12/0897 G06F2212/401

    Abstract: A graphics processor includes multiple levels of memory units, including a memory device and a cache device located near a graphics component. The graphics processor includes distributed compression/decompression, including a module between the cache device and the memory device. The module can perform compression of write data when the write data is moved from the cache device to the memory device, and perform decompression of read data when the read data is moved from the memory device to the cache device. The graphics processor can include a second level of cache with another compression module between the first level of cache and the second level of cache.
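As a rough software analogy for the module placed between the cache and the memory device, the sketch below compresses data when a line is evicted from the cache to memory and decompresses it when a line is brought back in. Everything here is an assumption for illustration: the class name, the FIFO eviction policy, and the use of Python's `zlib` in place of whatever codec the hardware uses.

```python
import zlib


class CompressedBackingStore:
    """Toy cache in front of a memory that only ever holds compressed data."""

    def __init__(self, cache_capacity=2):
        self.cache = {}   # address -> raw bytes (dict keeps insertion order)
        self.memory = {}  # address -> compressed bytes
        self.cache_capacity = cache_capacity

    def write(self, addr, data: bytes):
        # Writes land in the cache uncompressed.
        self.cache.pop(addr, None)
        self.cache[addr] = data
        self._evict_if_needed()

    def read(self, addr) -> bytes:
        if addr not in self.cache:
            # Cache miss: decompress on the way from memory to cache.
            self.cache[addr] = zlib.decompress(self.memory[addr])
            self._evict_if_needed()
        return self.cache[addr]

    def _evict_if_needed(self):
        # Compress on the way from cache to memory (oldest line first).
        while len(self.cache) > self.cache_capacity:
            victim = next(iter(self.cache))
            self.memory[victim] = zlib.compress(self.cache.pop(victim))
```

The graphics component only ever sees raw data; compression is confined to the cache/memory boundary, which is the property the abstract describes.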

    APPARATUS AND METHOD FOR RAY TRACING WITH GRID PRIMITIVES

    Publication No.: US20210407177A1

    Publication date: 2021-12-30

    Application No.: US17368335

    Filing date: 2021-07-06

    Abstract: Apparatus and method for ray tracing acceleration using a grid primitive. For example, one embodiment of an apparatus comprises: a grid primitive generator to generate a grid primitive comprising a plurality of adjacent interconnected primitives; a bitmask generator to generate a bitmask associated with the grid primitive, the bitmask comprising a plurality of bitmask values, each mask value associated with a primitive of the grid primitive; a ray tracing engine comprising traversal and intersection hardware logic to perform traversal and intersection operations in which rays are traversed through a hierarchical acceleration data structure and intersections between the rays and one or more of the adjacent interconnected primitives identified, wherein the ray tracing engine is to read the bitmask to determine a first set of primitives from the grid primitive on which to perform the traversal and intersection operations and a second set of primitives from the grid primitive on which the traversal and intersection operations will not be performed.
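The role of the bitmask can be illustrated with a minimal sketch. It assumes one bit per primitive, with bit i set meaning that primitive i of the grid is tested and bit i clear meaning it is skipped; the abstract does not fix the polarity or bit order, and the function name is hypothetical.

```python
def split_grid_by_bitmask(num_primitives: int, bitmask: int):
    """Split primitive indices into (to_test, to_skip) using one bit each."""
    to_test, to_skip = [], []
    for index in range(num_primitives):
        # Assumption: a set bit enables traversal/intersection for the primitive.
        if (bitmask >> index) & 1:
            to_test.append(index)
        else:
            to_skip.append(index)
    return to_test, to_skip
```

For a grid of 8 primitives and the mask `0b10110010`, primitives 1, 4, 5, and 7 would be passed to the intersection step and the other four would be skipped.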

    APPARATUS AND METHOD FOR PERFORMING A STABLE AND SHORT LATENCY SORTING OPERATION

    Publication No.: US20210295463A1

    Publication date: 2021-09-23

    Application No.: US16823741

    Filing date: 2020-03-19

    Abstract: Apparatus and method for stable and short latency sorting. For example, one embodiment of a processor comprises: an input circuit to receive a set of N input values to be sorted into a sorted order; comparison circuitry to compare each input value with all other input values in parallel to generate at least N*(N−1)/2 comparison result values; matrix generation circuitry and/or logic to generate a result matrix having a row associated with each input value, a plurality of bits in each row comprising comparison result values indicating results of comparisons with other input values, wherein a first region of the result matrix is to store a first set of bits comprising the N*(N−1)/2 comparison result values and a second region of the result matrix, opposite the first region, is to store a second set of bits comprising an inverse of the N*(N−1)/2 comparison result values; a parallel adder circuit to perform parallel additions of the bits in each row to generate N unique result values; and sorting circuitry to index into the N unique result values to return the sorted order.
