TASK EXECUTION IN A SIMD PROCESSING UNIT WITH PARALLEL GROUPS OF PROCESSING LANES

    Publication number: US20200265546A1

    Publication date: 2020-08-20

    Application number: US16867861

    Filing date: 2020-05-06

    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
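
The effect of validity-aware task assembly can be illustrated with a small sketch. This is not the patented implementation: the lane count, the block size of four work items, the layout (one block per lane, one work item position per cycle) and all function names are assumptions made for illustration. Under that layout, a cycle can be skipped only when the work item at that position is invalid in every lane, so grouping blocks with matching validity masks lines the invalid slots up in time.

```python
from collections import defaultdict

LANES = 4            # assumed number of processing lanes in the group
ITEMS_PER_BLOCK = 4  # assumed block size, e.g. a 2x2 pixel quad


def cycles_needed(task):
    """Count cycles a task occupies.

    `task` is a list of validity masks, one block per lane. Cycle i executes
    work item i of every block, so it is needed if any lane holds a valid item.
    """
    return sum(any(mask[i] for mask in task) for i in range(ITEMS_PER_BLOCK))


def assemble_in_order(blocks):
    """Baseline: pack blocks into tasks in arrival order, ignoring validity."""
    return [blocks[i:i + LANES] for i in range(0, len(blocks), LANES)]


def assemble_by_validity(blocks):
    """Pack blocks with identical validity masks into the same tasks."""
    by_mask = defaultdict(list)
    for mask in blocks:
        by_mask[mask].append(mask)
    tasks = []
    for group in by_mask.values():
        tasks.extend(group[i:i + LANES] for i in range(0, len(group), LANES))
    return tasks


def total_cycles(tasks):
    return sum(cycles_needed(task) for task in tasks)


# Eight blocks alternating between two complementary validity masks.
TOP = (True, True, False, False)
BOTTOM = (False, False, True, True)
blocks = [TOP, BOTTOM] * 4

print(total_cycles(assemble_in_order(blocks)))     # 8
print(total_cycles(assemble_by_validity(blocks)))  # 4
```

Grouping strictly by identical mask is the simplest possible policy; it can leave tasks partially filled, which a real scheduler would have to weigh against the cycles saved.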

    Allocation of tiles to processing engines in a graphics processing system

    Publication number: US10210651B2

    Publication date: 2019-02-19

    Application number: US16038564

    Filing date: 2018-07-18

    Abstract: A graphics processing system processes primitive fragments using a rendering space which is sub-divided into tiles. The graphics processing system comprises processing engines configured to apply texturing and/or shading to primitive fragments. The graphics processing system also comprises a cache system for storing graphics data for primitive fragments, the cache system including multiple cache subsystems. Each of the cache subsystems is coupled to a respective set of one or more processing engines. The graphics processing system also comprises a tile allocation unit which operates in one or more allocation modes to allocate tiles to processing engines. The allocation mode(s) include a spatial allocation mode in which groups of spatially adjacent tiles are allocated to the processing engines according to a spatial allocation scheme, which ensures that each of the groups of spatially adjacent tiles is allocated to a set of processing engines which are coupled to the same cache subsystem.
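
A spatial allocation mode of this kind might be sketched as follows. The 2x2 group size, the round-robin assignment of groups to cache subsystems, the engine names and the function `allocate_tiles` are all assumptions for illustration; the abstract only requires that each group of spatially adjacent tiles goes to engines sharing one cache subsystem.

```python
def allocate_tiles(tiles_x, tiles_y, engines_by_cache, group_w=2, group_h=2):
    """Map each tile (tx, ty) to a processing engine.

    `engines_by_cache[i]` lists the engines coupled to cache subsystem i.
    Every tile of a group lands on engines of the same cache subsystem.
    """
    allocation = {}
    n_caches = len(engines_by_cache)
    groups_per_row = -(-tiles_x // group_w)  # ceiling division
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            # Identify the group of adjacent tiles this tile belongs to.
            group_index = (ty // group_h) * groups_per_row + (tx // group_w)
            # Assumed policy: hand groups to cache subsystems round-robin.
            engines = engines_by_cache[group_index % n_caches]
            # Spread the tiles of one group over that subsystem's engines.
            local = (ty % group_h) * group_w + (tx % group_w)
            allocation[(tx, ty)] = engines[local % len(engines)]
    return allocation


engines_by_cache = [["PE0", "PE1"], ["PE2", "PE3"]]
allocation = allocate_tiles(4, 2, engines_by_cache)
print(allocation[(0, 0)], allocation[(1, 1)], allocation[(2, 0)])
```

Keeping adjacent tiles behind one cache subsystem means graphics data shared across tile boundaries is more likely to be fetched once rather than once per subsystem.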

    Method and System for Multisample Antialiasing
    Type: Invention application
    Status: Under examination (published)

    Publication number: US20160163099A1

    Publication date: 2016-06-09

    Application number: US15047466

    Filing date: 2016-02-18

    Abstract: A method and system for generating two or three dimensional computer graphics images using multisample antialiasing (MSAA) is provided, which enables memory bandwidth to be conserved. For each of one or more pixels it is determined whether all of a plurality of sample areas of that pixel are located within a particular primitive. For those pixels where it is determined that all the sample areas of that pixel are located within that primitive, a value is stored in a multisample memory for a smaller number of the sample areas of that pixel than the total number of the sample areas of that pixel and data is stored indicating that all the sample areas of that pixel are located within that primitive.
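
The storage saving can be shown with a minimal sketch. It assumes four samples per pixel and stores a single value for a fully covered pixel (the abstract only says "a smaller number" of values); the function names and the scalar colour are illustrative, not taken from the patent.

```python
SAMPLES_PER_PIXEL = 4  # assumed MSAA sample count


def store_pixel(coverage, colour, previous_samples):
    """Return (fully_covered_flag, stored_values) for one pixel.

    `coverage` holds one bool per sample area: True if the sample lies
    inside the primitive being rendered.
    """
    if all(coverage):
        # All sample areas are inside the primitive: one value plus a flag.
        return True, [colour]
    # Partially covered: keep one value per sample area.
    merged = [colour if hit else old
              for hit, old in zip(coverage, previous_samples)]
    return False, merged


def resolve(fully_covered, values):
    """Average the sample values to produce the final pixel value."""
    samples = values * SAMPLES_PER_PIXEL if fully_covered else values
    return sum(samples) / len(samples)


flag, stored = store_pixel([True] * 4, 0.5, [0.0] * 4)
print(flag, stored, resolve(flag, stored))  # True [0.5] 0.5
```

For interior pixels, which usually make up most of a primitive, this writes one value instead of four, which is where the memory bandwidth saving comes from.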

    Task Execution in a SIMD Processing Unit with Parallel Groups of Processing Lanes
    Type: Invention application
    Status: Granted

    Publication number: US20150170324A1

    Publication date: 2015-06-18

    Application number: US14573397

    Filing date: 2014-12-17

    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.

    TASK EXECUTION IN A SIMD PROCESSING UNIT WITH PARALLEL GROUPS OF PROCESSING LANES

    Publication number: US20250061536A1

    Publication date: 2025-02-20

    Application number: US18907801

    Filing date: 2024-10-07

    Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.

    MULTI-OUTPUT DECODER FOR TEXTURE DECOMPRESSION

    Publication number: US20240095975A1

    Publication date: 2024-03-21

    Application number: US18525715

    Filing date: 2023-11-30

    CPC classification number: G06T11/001 G06T9/00 G06T15/04 H04N19/436 H04N19/44

    Abstract: A decoder decodes a plurality of texels from a received block of texture data encoded according to the Adaptive Scalable Texture Compression (ASTC) format. A parameter decode unit decodes configuration data for the received block of texture data, a colour decode unit decodes colour endpoint data for the plurality of texels in dependence on the configuration data, a weight decode unit decodes interpolation weight data for each of the plurality of texels in dependence on the configuration data, and at least one interpolator unit calculates a colour value for each of the plurality of texels using the interpolation weight data for that texel and a pair of colour endpoints from the colour endpoint data. At least one of the parameter decode unit, colour decode unit and weight decode unit decodes intermediate data from the received block that is common to the decoding of a subset of texels of that block and uses that decoded data as part of the decoding of at least two of the plurality of texels.
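
The idea of decoding shared data once and reusing it for several texels can be sketched as below. This is a simplified illustration, not a bit-exact ASTC decoder: the block is a hypothetical dictionary of already-unpacked fields, `decode_shared` stands in for the parameter and colour endpoint decode, and the blend uses weights in the range 0 to 64 on 8-bit channels.

```python
def interpolate(endpoint0, endpoint1, weight):
    """Blend two colour endpoints per channel; weight runs from 0 to 64."""
    return tuple((a * (64 - weight) + b * weight + 32) >> 6
                 for a, b in zip(endpoint0, endpoint1))


def decode_shared(block):
    """Stand-in for the per-block decode whose result several texels share."""
    return block["endpoint0"], block["endpoint1"]


def decode_texels(block, texel_indices):
    """Decode several texels of one block in a single pass."""
    endpoint0, endpoint1 = decode_shared(block)  # done once, not per texel
    return [interpolate(endpoint0, endpoint1, block["weights"][i])
            for i in texel_indices]


# Hypothetical, already-unpacked block contents (RGBA endpoints, weights).
block = {
    "endpoint0": (0, 0, 0, 255),
    "endpoint1": (64, 128, 255, 255),
    "weights": [0, 32, 64, 16],
}
print(decode_texels(block, [0, 1, 2]))
```

Only the weight lookup and the interpolation run per texel; everything derived from the block header is computed once and reused.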

    Decoder unit for texture decompression

    Publication number: US11043020B2

    Publication date: 2021-06-22

    Application number: US16806178

    Filing date: 2020-03-02

    Abstract: A decoder unit is configured to decode a plurality of texels in accordance with a texel request, the plurality of texels being encoded across one or more blocks of encoded texture data each encoding a block of texels, and includes a first set of one or more decoders, each of the first set of decoders being configured to decode n texels from a single received block of encoded texture data; a second set of one or more decoders, each of the second set of decoders being configured to decode p texels from a single received block of encoded texture data, where p

    Synchronisation of execution threads on a multi-threaded processor

    Publication number: US10698690B2

    Publication date: 2020-06-30

    Application number: US16251620

    Filing date: 2019-01-18

    Inventor: Yoong Chert Foo

    Abstract: Method and apparatus are provided for synchronising execution of a plurality of threads on a multi-threaded processor. A program executed by a thread can have a number of synchronisation points corresponding to points where execution is to be synchronised with another thread. Execution of a thread is paused when it reaches a synchronisation point until at least one other thread with which it is intended to be synchronised reaches a corresponding synchronisation point. Execution is subsequently resumed. A control core maintains status data for threads and can cause a thread that is ready to run to use execution resources that were occupied by a thread that is waiting for a synchronisation event.
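
The scheduling behaviour described here can be mimicked with a toy simulation. Everything in it is an assumption for illustration: programs are lists of `("work", label)` and `("sync", point)` steps, a synchronisation point releases once every thread whose program contains it has arrived, and `simulate` plays the role of the control core by keeping a status per thread and giving execution slots only to threads that are ready.

```python
def simulate(programs, slots=1):
    """Run non-empty thread programs on `slots` execution slots.

    Returns the order in which work steps were executed.
    """
    pc = {t: 0 for t in programs}            # next step per thread
    status = {t: "ready" for t in programs}  # ready / waiting / done
    trace = []

    def advance(thread):
        pc[thread] += 1
        finished = pc[thread] == len(programs[thread])
        status[thread] = "done" if finished else "ready"

    while any(s == "ready" for s in status.values()):
        # Only ready threads get a slot; waiting threads free theirs.
        for t in [t for t in programs if status[t] == "ready"][:slots]:
            kind, value = programs[t][pc[t]]
            if kind == "sync":
                status[t] = "waiting"  # pause at the synchronisation point
            else:
                trace.append((t, value))
                advance(t)
        # Release a synchronisation point once all its threads have arrived.
        waiting = [t for t in programs if status[t] == "waiting"]
        for point in {programs[t][pc[t]][1] for t in waiting}:
            members = [t for t in programs if ("sync", point) in programs[t]]
            if all(status[m] == "waiting"
                   and programs[m][pc[m]] == ("sync", point)
                   for m in members):
                for m in members:
                    advance(m)
    return trace


programs = {
    "A": [("work", "a1"), ("sync", "s"), ("work", "a2")],
    "B": [("work", "b1"), ("work", "b2"), ("sync", "s"), ("work", "b3")],
}
print(simulate(programs, slots=1))
```

With a single slot, thread B runs `b1` and `b2` while A is paused at the synchronisation point, so the slot is not left idle during the wait.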
