LARGE NUMBER INTEGER ADDITION USING VECTOR ACCUMULATION

    公开(公告)号:US20240319964A1

    公开(公告)日:2024-09-26

    申请号:US18126107

    申请日:2023-03-24

    CPC classification number: G06F7/503

    Abstract: A processor includes one or more processor cores configured to perform accumulate top (ACCT) and accumulate bottom (ACCB) instructions. To perform such instructions, at least one processor core of the processor includes an ACCT data path that adds a first portion of a block of data to a first lane of a set of lanes of a top accumulator and adds a carry-out bit to a second lane of the set of lanes of the top accumulator. Further, the at least one processor core includes an ACCB data path that adds a second portion of the block of data to a first lane of a set of lanes of a bottom accumulator and adds a carry-out bit to a second lane of the set of lanes of the bottom accumulator.

    LATENCY REDUCTION FOR TRANSITIONS BETWEEN ACTIVE STATE AND SLEEP STATE OF AN INTEGRATED CIRCUIT

    公开(公告)号:US20240319781A1

    公开(公告)日:2024-09-26

    申请号:US18189993

    申请日:2023-03-24

    CPC classification number: G06F1/3275 G06F1/3228 G06F1/3287

    Abstract: An apparatus and method for efficient power management of multiple integrated circuits. In various implementations, a computing system includes an integrated circuit with a security processor. The security processor determines the integrated circuit transitions to an active state from a sleep state that is not intended to maintain configuration information to return to the active state without restarting an operating system. In the sleep state, multiple components of the integrated circuit have a power supply reference level turned off, which provides low power consumption for the integrated circuit. The security processor performs the bootup operation using information stored in persistent on-chip memory. By not using information stored in off-chip memory, the security processor reduces the latency of the transition. The persistent on-chip memory utilizes synchronous random-access memory that receives a standby power supply reference level that continually supplies a voltage magnitude by not being turned off.

    Multi-kernel wavefront scheduler
    96.
    发明授权

    公开(公告)号:US12099867B2

    公开(公告)日:2024-09-24

    申请号:US15993061

    申请日:2018-05-30

    CPC classification number: G06F9/4881

    Abstract: Systems, apparatuses, and methods for implementing a multi-kernel wavefront scheduler are disclosed. A system includes at least a parallel processor coupled to one or more memories, wherein the parallel processor includes a command processor and a plurality of compute units. The command processor launches multiple kernels for execution on the compute units. Each compute unit includes a multi-level scheduler for scheduling wavefronts from multiple kernels for execution on its execution units. A first level scheduler creates scheduling groups by grouping together wavefronts based on the priority of their kernels. Accordingly, wavefronts from kernels with the same priority are grouped together in the same scheduling group by the first level scheduler. Next, the first level scheduler selects, from a plurality of scheduling groups, the highest priority scheduling group for execution. Then, a second level scheduler schedules wavefronts for execution from the scheduling group selected by the first level scheduler.

    SOFTWARE-DEFINED COMPUTE UNIT RESOURCE ALLOCATION MODE

    公开(公告)号:US20240311199A1

    公开(公告)日:2024-09-19

    申请号:US18120646

    申请日:2023-03-13

    CPC classification number: G06F9/505 G06F9/522

    Abstract: A program code executing on a processing system includes one or more instructions each identifying a workload that includes a plurality of waves and each identifying resource allocations for the plurality of waves of the workgroup. In response to receiving an instruction identifying a workload and resource allocations for the plurality of waves of the workgroup, a processor allocates a first set of processing resources to a compute unit of the processor based on the resource allocations for the plurality of waves. The compute unit then performs operations for the workgroup using the allocated set of processing resources.

Patent Agency Ranking