Aggregation and Scheduling of Accelerator Executable Tasks

    Publication No.: US20240385872A1

    Publication date: 2024-11-21

    Application No.: US18198981

    Filing date: 2023-05-18

    Abstract: In accordance with the described techniques for aggregation and scheduling of accelerator executable tasks, an accelerator device includes a processing element array and a command processor to receive a plurality of fibers each including multiple tasks and dependencies between the multiple tasks. The command processor places a first fiber in a sleep pool based on a first task within the first fiber having an unresolved dependency, and the command processor further places a second fiber in a ready pool based on a second task within the second fiber having a resolved dependency. Based on the second fiber being in the ready pool, the command processor launches the second task to be executed by the processing element array.
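
The sleep-pool and ready-pool behavior described in this abstract can be sketched in a few lines of Python. This is a minimal software illustration, not the patented hardware design: the class names `Task`, `Fiber`, and `CommandProcessor`, the in-order execution of tasks within a fiber, and the assumption that a launched task completes immediately are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    deps: set = field(default_factory=set)  # names of tasks that must finish first


@dataclass
class Fiber:
    name: str
    tasks: list  # ordered list of Task objects


class CommandProcessor:
    """Hypothetical model: classifies fibers by the state of their next task."""

    def __init__(self):
        self.ready = []     # fibers whose next task has all dependencies resolved
        self.sleep = []     # fibers whose next task has an unresolved dependency
        self.done = set()   # names of completed tasks
        self.launched = []  # launch order, kept for inspection

    def _head_ready(self, fiber):
        return fiber.tasks[0].deps <= self.done

    def submit(self, fiber):
        pool = self.ready if self._head_ready(fiber) else self.sleep
        pool.append(fiber)

    def step(self):
        """Launch the next task of one ready fiber; return its name or None."""
        if not self.ready:
            return None
        fiber = self.ready.pop(0)
        task = fiber.tasks.pop(0)
        self.launched.append(task.name)
        self.done.add(task.name)  # assumption: the task completes immediately
        if fiber.tasks:
            self.submit(fiber)    # re-classify the fiber by its new next task
        # Wake any sleeping fiber whose dependency has just been resolved.
        still_asleep = []
        for f in self.sleep:
            (self.ready if self._head_ready(f) else still_asleep).append(f)
        self.sleep = still_asleep
        return task.name


def run_demo():
    cp = CommandProcessor()
    cp.submit(Fiber("f1", [Task("b", {"a"})]))             # goes to the sleep pool
    cp.submit(Fiber("f2", [Task("a"), Task("c", {"b"})]))  # goes to the ready pool
    while cp.step() is not None:
        pass
    return cp.launched
```

In `run_demo`, fiber `f1` sleeps until task `a` from `f2` completes, and `f2` in turn sleeps until `b` completes, so the pools exchange fibers as dependencies resolve.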

    SYSTEM AGNOSTIC AUTONOMOUS SYSTEM STATE MANAGEMENT

    Publication No.: US20240370077A1

    Publication date: 2024-11-07

    Application No.: US18312522

    Filing date: 2023-05-04

    Abstract: A computing device is provided which comprises memory and a processor in communication with the memory. The processor is configured to autonomously acquire input parameter values, comprising one of monitored device input parameter values from a component of the computing device and monitored user input parameter values. The processor is also configured to select, from a plurality of modes of operation, a mode of operation comprising parameter settings which are determined based on the acquired input parameter values, each of the plurality of modes of operation comprising different parameter settings configured to control the computing device to operate at a different level of performance. The processor is also configured to control operation of the computing device by tuning the parameter settings of the computing device according to the selected mode of operation comprising the determined parameter settings.
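
As a rough illustration of the acquire, select, and tune loop in this abstract, the following Python sketch maps monitored inputs to one of several modes and applies that mode's parameter settings. The mode names, the parameter names (`cpu_max_mhz`, `display_brightness_pct`), the input keys, and the selection rules are all invented for the example; the abstract does not specify any of them.

```python
# Hypothetical modes of operation; names and values are illustrative only.
MODES = {
    "power_saver": {"cpu_max_mhz": 1200, "display_brightness_pct": 40},
    "balanced":    {"cpu_max_mhz": 2400, "display_brightness_pct": 70},
    "performance": {"cpu_max_mhz": 3600, "display_brightness_pct": 100},
}


def select_mode(inputs):
    """Pick a mode name from monitored device and user inputs (assumed rules)."""
    if inputs.get("on_battery") and inputs.get("battery_pct", 100) < 20:
        return "power_saver"
    if inputs.get("cpu_utilization", 0.0) > 0.8:
        return "performance"
    return "balanced"


def apply_mode(settings, mode_name):
    """Return a copy of the device settings tuned to the selected mode."""
    tuned = dict(settings)
    tuned.update(MODES[mode_name])
    return tuned
```

For example, `apply_mode(current_settings, select_mode(inputs))` would overwrite only the parameters that the chosen mode defines and leave any other settings untouched.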

    Adaptive scheduling of memory and processing-in-memory requests

    Publication No.: US12131026B2

    Publication date: 2024-10-29

    Application No.: US18090916

    Filing date: 2022-12-29

    CPC classification number: G06F3/061 G06F3/0659 G06F3/0673

    Abstract: Adaptive scheduling of memory requests and processing-in-memory requests is described. In accordance with the described techniques, a memory controller receives a plurality of processing-in-memory requests and a plurality of non-processing-in-memory requests from a host. The memory controller schedules an order of execution for the plurality of processing-in-memory requests and the plurality of non-processing-in-memory requests based at least in part on a processing-in-memory request stall threshold and a non-processing-in-memory request stall threshold. In response to a system switching (e.g., from executing processing-in-memory requests to executing non-processing-in-memory requests or from executing non-processing-in-memory requests to executing processing-in-memory requests), the memory controller modifies the processing-in-memory request stall threshold and the non-processing-in-memory request stall threshold. The memory controller continues scheduling an order of execution for subsequent requests received from the host using the modified stall thresholds.
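
The threshold-driven switching described in this abstract might look like the sketch below. It is an assumption-laden toy model: the class `AdaptiveScheduler`, the per-request stall counter, and in particular the threshold update rule in `_switch` (lower the threshold of the kind that was kept waiting, raise the other) are placeholders, since the abstract does not say how the thresholds are modified.

```python
from collections import deque


class AdaptiveScheduler:
    """Toy model with two queues: 'pim' and 'mem' (non-PIM) requests."""

    def __init__(self, pim_stall_threshold, mem_stall_threshold):
        self.threshold = {"pim": pim_stall_threshold, "mem": mem_stall_threshold}
        self.queues = {"pim": deque(), "mem": deque()}
        self.mode = "mem"  # kind of request currently being executed
        self.stalled = 0   # how many times the waiting kind has been passed over

    def enqueue(self, kind, request):
        self.queues[kind].append(request)

    def _other(self):
        return "pim" if self.mode == "mem" else "mem"

    def _switch(self):
        old, new = self.mode, self._other()
        self.mode, self.stalled = new, 0
        # Placeholder adaptation rule (an assumption, not from the patent).
        self.threshold[new] = max(1, self.threshold[new] - 1)
        self.threshold[old] += 1

    def next_request(self):
        other = self._other()
        # Switch when the waiting kind has stalled long enough,
        # or when the current kind has nothing left to run.
        if self.queues[other] and (
            not self.queues[self.mode] or self.stalled >= self.threshold[other]
        ):
            self._switch()
        queue = self.queues[self.mode]
        if not queue:
            return None
        if self.queues[self._other()]:
            self.stalled += 1
        return queue.popleft()


def drain(scheduler):
    """Collect requests in the order the scheduler issues them."""
    order = []
    while (req := scheduler.next_request()) is not None:
        order.append(req)
    return order
```

With both thresholds set to 2, four queued non-PIM requests, and two queued PIM requests, the scheduler issues two non-PIM requests, switches to drain the PIM queue, and then switches back.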

    DYNAMIC ADJUSTMENT OF MEMORY OPERATING FREQUENCY TO AVOID RF INTERFERENCE WITH WIFI

    Publication No.: US20240334340A1

    Publication date: 2024-10-03

    Application No.: US18128805

    Filing date: 2023-03-30

    CPC classification number: H04W52/029 H04W52/0274

    Abstract: An apparatus and method for efficiently performing power management for increasing reliable wireless signal transfer performed by mobile computing devices. In various implementations, a computing system includes a network interface and multiple components for processing tasks. The network interface sends, to at least a given component of the multiple components, an indication specifying the corresponding operating frequency ranges used by one or more radio modules used for wireless communication with an access point. The given component determines whether an operating clock frequency of the given component overlaps any of the received operating frequency ranges and associated harmonic frequencies. If so, then the given component changes the operating clock frequency to a frequency that does not overlap any of the received operating frequency ranges and associated harmonic frequencies.
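
The overlap check and frequency change described in this abstract can be approximated as follows. The sketch makes one loudly stated assumption: it treats "associated harmonic frequencies" as integer multiples of the component clock and tests whether any of them lands inside a reported radio range. The function names, the `max_harmonic` limit, and the candidate list are hypothetical.

```python
def interferes(clock_mhz, radio_ranges_mhz, max_harmonic=8):
    """True if the clock or one of its first harmonics falls in a radio range.

    Assumption: harmonics are integer multiples of the clock frequency.
    """
    for k in range(1, max_harmonic + 1):
        harmonic = clock_mhz * k
        if any(lo <= harmonic <= hi for lo, hi in radio_ranges_mhz):
            return True
    return False


def pick_clock(current_mhz, candidates_mhz, radio_ranges_mhz, max_harmonic=8):
    """Keep the current clock if it is clean, else take the first clean candidate."""
    if not interferes(current_mhz, radio_ranges_mhz, max_harmonic):
        return current_mhz
    for candidate in candidates_mhz:
        if not interferes(candidate, radio_ranges_mhz, max_harmonic):
            return candidate
    return current_mhz  # no clean candidate available; leave the clock unchanged


# Example range: the 2.4 GHz Wi-Fi band, expressed in MHz.
WIFI_2G4 = [(2400, 2483.5)]
```

With this range, a 1200 MHz clock is flagged because its second harmonic is 2400 MHz, while 1250 MHz is accepted because none of its first eight harmonics falls inside the band.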

    WAVE LEVEL MATRIX MULTIPLY INSTRUCTIONS

    Publication No.: US20240329998A1

    Publication date: 2024-10-03

    Application No.: US18619392

    Filing date: 2024-03-28

    CPC classification number: G06F9/3802 G06F9/3001 G06F9/30098 G06F9/3867

    Abstract: An apparatus and method for efficiently processing multiplication and accumulate operations for matrices in applications. In various implementations, a computing system includes a parallel data processing circuit and a memory. The memory stores the instructions (or translated commands) of a parallel data application. The circuitry of the parallel data processing circuit performs a matrix multiplication operation using source operands accessed only once from a vector register file and multiple instantiations of a vector processing circuit capable of performing multiple matrix multiplication operations corresponding to multiple different types of instructions. The multiplier circuit and the adder circuit of the vector processing circuit perform each of the fused multiply add (FMA) operation and the dot product (inner product) operation without independent, dedicated execution pipelines with one execution pipeline for the FMA operation and the other separate execution pipeline for the dot product operation.
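
A loose software analogy for the shared multiplier and adder in this abstract is shown below: both the fused multiply-add and the dot product are built from the same two primitives rather than from separate code paths, and a matrix multiply-accumulate is built on top of them. This is only an analogy under stated assumptions; the function names are hypothetical, and plain Python arithmetic does not model hardware pipelines, register-file access, or fused rounding.

```python
def mul(a, b):
    """Stands in for the shared multiplier circuit."""
    return a * b


def add(a, b):
    """Stands in for the shared adder circuit."""
    return a + b


def fma(a, b, c):
    """Multiply-add, a * b + c, using the shared primitives."""
    return add(mul(a, b), c)


def dot(xs, ys, acc=0):
    """Dot product with an initial accumulator, using the same primitives."""
    for x, y in zip(xs, ys):
        acc = fma(x, y, acc)
    return acc


def matmul_accumulate(A, B, C):
    """Compute A @ B + C for lists of lists."""
    cols = list(zip(*B))  # transpose B once so each column is reused across rows
    return [
        [dot(row, col, C[i][j]) for j, col in enumerate(cols)]
        for i, row in enumerate(A)
    ]
```

Calling `matmul_accumulate([[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 1], [1, 1]])` exercises all three layers with small integer operands.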

    Global addressing for switch fabric

    Publication No.: US12105952B2

    Publication date: 2024-10-01

    Application No.: US17957469

    Filing date: 2022-09-30

    CPC classification number: G06F3/0607 G06F3/0629 G06F3/067

    Abstract: Systems, methods, and techniques are provided for a fabric addressable memory. A memory access request is received from a host computing device attached via one edge port of one or more interconnect switches, the memory access request directed to a destination segment of a physical fabric memory block that is allocated in local physical memory of the host computing device. The edge port accesses a stored mapping between segments of the physical fabric memory block and one or more destination port identifiers that are each associated with a respective edge port of the fabric addressable memory. The memory access request is routed by the one edge port to a destination edge port based on the stored mapping.
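
The segment-to-port lookup in this abstract can be pictured with a short sketch. It assumes fixed-size segments and a dictionary keyed by segment index; the `EdgePort` class, the `SEGMENT_SIZE` value, and the port identifiers are hypothetical and chosen only for illustration.

```python
SEGMENT_SIZE = 1 << 30  # assumed segment size of 1 GiB


class EdgePort:
    """Hypothetical edge port holding a mapping of segment index to port id."""

    def __init__(self, port_id, segment_map):
        self.port_id = port_id
        self.segment_map = dict(segment_map)

    def route(self, fabric_address):
        """Return (destination port id, offset within the destination segment)."""
        segment = fabric_address // SEGMENT_SIZE
        if segment not in self.segment_map:
            raise ValueError(f"no destination port for segment {segment}")
        return self.segment_map[segment], fabric_address % SEGMENT_SIZE
```

A port created as `EdgePort("port-0", {0: "port-A", 1: "port-B"})` would forward an access at address `SEGMENT_SIZE + 16` to `"port-B"` with offset 16.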
