LOW LATENCY REMOTING TO ACCELERATORS

    公开(公告)号:US20230048915A1

    公开(公告)日:2023-02-16

    申请号:US17968589

    申请日:2022-10-18

    Abstract: A method of offloading performance of a workload includes receiving, on a first computing system acting as an initiator, a first function call from a caller, the first function call to be executed by an accelerator on a second computing system acting as a target, the first computing system coupled to the second computing system by a network; determining a type of the first function call; and generating a list of parameter values of the first function call.

    Technologies for accelerator fabric protocol multipathing

    公开(公告)号:US11301407B2

    公开(公告)日:2022-04-12

    申请号:US16242928

    申请日:2019-01-08

    Abstract: Technologies for accessing pooled accelerator resources over a network fabric are disclosed. In disclosed embodiments, an application hosted by a computing platform accesses remote accelerator resources over a network fabric using protocol multipathing mechanisms. A communication session is established with the remote accelerator resources. The communication session comprises at least two connections. The at least two connections at least include a first connection having or utilizing a first transport layer and a second connection having or utilizing a second transport layer that is different than the first transport layer. Other embodiments may be disclosed and/or claimed.

    DISAGGREGATED COMPUTING FOR DISTRIBUTED CONFIDENTIAL COMPUTING ENVIRONMENT

    公开(公告)号:US20220100582A1

    公开(公告)日:2022-03-31

    申请号:US17531005

    申请日:2021-11-19

    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes a processor executing a trusted execution environment (TEE) comprising a field-programmable gate array (FPGA) driver to interface with an FPGA device that is remote to the apparatus; and a remote memory-mapped input/output (MMIO) driver to expose the FPGA device as a legacy device to the FPGA driver, wherein the processor to utilize the remote MMIO driver to: enumerate the FPGA device using FPGA enumeration data provided by a remote management controller of the FPGA device, the FPGA enumeration data comprising a configuration space and device details; load function drivers for the FPGA device in the TEE; create corresponding device files in the TEE based on the FPGA enumeration data; and handle remote MMIO reads and writes to the FPGA device via a network transport protocol.

    Technologies for providing adaptive power management in an accelerator sled

    公开(公告)号:US11269395B2

    公开(公告)日:2022-03-08

    申请号:US16394646

    申请日:2019-04-25

    Abstract: Technologies for providing adaptive power management in an accelerator sled include an accelerator sled having circuitry to determine, based on (i) a total power budget for the accelerator sled, (ii) service level agreement (SLA) data indicative of a target performance of a kernel, and (iii) profile data indicative of a performance of the kernel as a function of a power utilization of the kernel, a power utilization limit for the kernel to be executed by an accelerator device on the accelerator sled. Additionally, the circuitry is to allocate the determined power utilization limit to the kernel and execute the kernel under the allocated power utilization limit.

    Networked shuffle storage
    86.
    发明授权

    公开(公告)号:US11194522B2

    公开(公告)日:2021-12-07

    申请号:US16634436

    申请日:2017-08-16

    Abstract: Apparatuses for computing are disclosed herein. An apparatus may include a set of data reduction modules to perform data reduction operations on sets of (key, value) data pairs to reduce an amount of values associated with a shared key, wherein the (key, value) data pairs are stored in a plurality of queues located in a plurality of solid state drives remote from the apparatus. The apparatus may further include a memory access module, communicably coupled to the set of data reduction modules, to directly transfer individual ones of the sets of queued (key, value) data pairs from the plurality of remote solid state drives through remote random access of the solid state drives, via a network, without using intermediate staging storage. Other embodiments may be disclosed or claimed.

    DISAGGREGATED COMPUTING FOR DISTRIBUTED CONFIDENTIAL COMPUTING ENVIRONMENT

    公开(公告)号:US20210117246A1

    公开(公告)日:2021-04-22

    申请号:US17133066

    申请日:2020-12-23

    Abstract: An apparatus to facilitate disaggregated computing for a distributed confidential computing environment is disclosed. The apparatus includes one or more processors to facilitate receiving a manifest corresponding to graph nodes representing regions of memory of a remote client machine, the graph nodes corresponding to a command buffer and to associated data structures and kernels of the command buffer used to initialize a hardware accelerator and execute the kernels, and the manifest indicating a destination memory location of each of the graph nodes and dependencies of each of the graph nodes; identifying, based on the manifest, the command buffer and the associated data structures to copy to the host memory; identifying, based on the manifest, the kernels to copy to local memory of the hardware accelerator; and patching addresses in the command buffer copied to the host memory with updated addresses of corresponding locations in the host memory.

    TECHNOLOGIES FOR LOCKLESS, SCALABLE, AND ADAPTIVE STORAGE QUALITY OF SERVICE

    公开(公告)号:US20200371689A1

    公开(公告)日:2020-11-26

    申请号:US16416696

    申请日:2019-05-20

    Abstract: Technologies for quality of service (QoS) management include a computing device having a physical storage volume and multiple processor cores. A management thread reads I/O counters that are each associated with a logical volume and a processor core. The logical volumes are backed by the physical storage volume. The management thread configures stop bits as a function of the I/O counters and multiple QoS parameters. Each stop bit is associated with a logical volume and a processor core. The QoS parameters include minimum guaranteed bandwidth and optional maximum allowed bandwidth for each logical volume. A worker thread reads the stop bit associated with a logical volume and a processor core, accesses the logical volume if the stop bit is not set, and updates the I/O counter associated with the logical volume and the processor core in response to accessing the logical volume. Other embodiments are described and claimed.

    NETWORKED SHUFFLE STORAGE
    90.
    发明申请

    公开(公告)号:US20200210114A1

    公开(公告)日:2020-07-02

    申请号:US16634436

    申请日:2017-08-16

    Abstract: Apparatuses for computing are disclosed herein. An apparatus may include a set of data reduction modules to perform data reduction operations on sets of (key, value) data pairs to reduce an amount of values associated with a shared key, wherein the (key, value) data pairs are stored in a plurality of queues located in a plurality of solid state drives remote from the apparatus. The apparatus may further include a memory access module, communicably coupled to the set of data reduction modules, to directly transfer individual ones of the sets of queued (key, value) data pairs from the plurality of remote solid state drives through remote random access of the solid state drives, via a network, without using intermediate staging storage. Other embodiments may be disclosed or claimed.

Patent Agency Ranking