HIGHLY SCALABLE ACCELERATOR
    Published application

    Publication number: US20230251986A1

    Publication date: 2023-08-10

    Application number: US18296875

    Filing date: 2023-04-06

    CPC classification number: G06F13/364 G06F9/5027 G06F13/24

    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.
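
The queue-and-engine arrangement in this abstract can be illustrated with a minimal Python sketch. It is not the patented design: the descriptor fields, the two opcodes, and the round-robin dispatch policy are all assumptions chosen for illustration. The point it shows is that each descriptor is self-contained, so any engine can perform any request.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkDescriptor:
    """Self-contained request; all field names here are hypothetical."""
    client_id: int
    opcode: str          # "copy" or "fill" (illustrative opcodes)
    src: bytes
    fill_byte: int = 0


def run_engine(desc: WorkDescriptor) -> bytes:
    """Perform one request using only the descriptor's contents."""
    if desc.opcode == "copy":
        return bytes(desc.src)
    if desc.opcode == "fill":
        return bytes([desc.fill_byte]) * len(desc.src)
    raise ValueError(f"unknown opcode: {desc.opcode}")


def dispatch(work_queues, num_engines):
    """Drain the queues round-robin, handing descriptors to engines in turn.

    Returns (engine_id, client_id, result) tuples in dispatch order.
    The round-robin policy is an assumption, not taken from the abstract.
    """
    results = []
    engine = 0
    while any(work_queues):          # an empty deque is falsy
        for wq in work_queues:
            if wq:
                desc = wq.popleft()
                results.append((engine, desc.client_id, run_engine(desc)))
                engine = (engine + 1) % num_engines
    return results


wq0 = deque([WorkDescriptor(1, "copy", b"abc")])
wq1 = deque([WorkDescriptor(2, "fill", b"xyz", fill_byte=0x2A)])
completed = dispatch([wq0, wq1], num_engines=2)
```

Because no engine keeps per-client state, adding engines or queues in this sketch only changes the arguments to `dispatch`.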

    ARCHITECTURAL INTERFACES FOR GUEST SOFTWARE TO SUBMIT COMMANDS TO AN ADDRESS TRANSLATION CACHE IN XPUs

    Publication number: US20230013023A1

    Publication date: 2023-01-19

    Application number: US17951024

    Filing date: 2022-09-22

    Abstract: In one embodiment, an apparatus includes a processor comprising an address translation cache (ATC), a shared work queue (SWQ) associated with the ATC, and a port to couple to a host processor over a Peripheral Component Interconnect Express (PCIe)-based link. The apparatus also includes circuitry to receive address translation information from a memory management unit of the host processor that includes virtual memory address to physical memory address translations, store the address translation information in the ATC, receive an invalidation command from the host processor indicating an invalidation of address translation information stored in the ATC, modify the address translation information in the ATC based on the invalidation command, and store completion data in a memory location indicated by the invalidation command.
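
The store, invalidate, and completion steps in this abstract can be sketched as follows. This is a toy model under stated assumptions: the command layout (`first_vpn`, `num_pages`, `completion_addr`), the completion value `1`, and the dictionary standing in for host-visible memory are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvalidationCommand:
    """Hypothetical command layout: the range to drop and where to report."""
    first_vpn: int          # first virtual page number to invalidate
    num_pages: int
    completion_addr: int    # memory location for the completion data


class AddressTranslationCache:
    def __init__(self, memory):
        self.entries = {}       # virtual page number -> physical page number
        self.memory = memory    # dict standing in for host-visible memory

    def store(self, vpn, ppn):
        """Cache a translation received from the host MMU."""
        self.entries[vpn] = ppn

    def invalidate(self, cmd):
        """Drop the named translations, then write the completion data."""
        for vpn in range(cmd.first_vpn, cmd.first_vpn + cmd.num_pages):
            self.entries.pop(vpn, None)
        self.memory[cmd.completion_addr] = 1   # 1 = done (illustrative value)


host_memory = {}
atc = AddressTranslationCache(host_memory)
atc.store(0x10, 0xA0)
atc.store(0x11, 0xA1)
atc.store(0x20, 0xB0)
atc.invalidate(InvalidationCommand(first_vpn=0x10, num_pages=2, completion_addr=0x1000))
```

Writing the completion data only after the entries are removed lets the submitter poll one memory location to learn that the stale translations are gone.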

    Address translation for scalable linked devices

    Publication number: US10969992B2

    Publication date: 2021-04-06

    Application number: US16236473

    Filing date: 2018-12-29

    Abstract: Systems, methods, and devices can include a processing engine implemented at least partially in hardware, the processing engine to process memory transactions; a memory element to index physical address and virtual address translations; and a memory controller logic implemented at least partially in hardware, the memory controller logic to receive an index from the processing engine, the index corresponding to a physical address and a virtual address; identify a physical address based on the received index; and provide the physical address to the processing engine. The processing engine can use the physical address for memory transactions in response to a streaming workload job request.
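
A minimal sketch of the index-based lookup this abstract describes, assuming the memory element behaves like an append-only table. The class and method names are hypothetical. The engine holds only a small index and the table resolves it to a physical address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translation:
    virtual_addr: int
    physical_addr: int


class TranslationTable:
    """Stand-in for the memory element that indexes VA/PA pairs."""

    def __init__(self):
        self._entries = []

    def add(self, virtual_addr, physical_addr):
        """Record a translation and return the index that refers to it."""
        self._entries.append(Translation(virtual_addr, physical_addr))
        return len(self._entries) - 1

    def physical_for(self, index):
        """Resolve an index received from the processing engine."""
        return self._entries[index].physical_addr


table = TranslationTable()
idx = table.add(virtual_addr=0x7F00_0000, physical_addr=0x1_2000)
paddr = table.physical_for(idx)
```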

    Technologies for application validation in persistent memory systems
    Granted patent (in force)

    Publication number: US09535820B2

    Publication date: 2017-01-03

    Application number: US14670965

    Filing date: 2015-03-27

    CPC classification number: G06F11/3688 G06F11/3648

    Abstract: Technologies for software testing include a computing device having persistent memory that includes a platform simulator and an application or other code module to be tested. The computing device generates a checkpoint for the application at a test location using the platform simulator. The computing device executes the application from the test location to an end location and traces all writes to persistent memory using the platform simulator. The computing device generates permutations of persistent memory writes that are allowed by the hardware specification of the computing device simulated by the platform simulator. The computing device replays each permutation from the checkpoint, simulates a power failure, and then invokes a user-defined test function using the platform simulator. The computing device may test different permutations of memory writes until the application's use of persistent memory is validated. Other embodiments are described and claimed.
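
The checkpoint, permute, replay, and check loop in this abstract can be sketched in Python. Two simplifications are assumptions of this sketch and are not taken from the patent: writes may be reordered only within an "epoch" (the writes between two ordering fences), and a power failure is modeled by replaying only a prefix of each order. All names are hypothetical.

```python
from itertools import permutations, product


def allowed_orders(epochs):
    """Yield every write order that reorders writes only within an epoch."""
    for combo in product(*(permutations(epoch) for epoch in epochs)):
        yield [write for epoch in combo for write in epoch]


def validate(checkpoint, epochs, check):
    """Return the first (order, crash_point) that fails `check`, else None."""
    for order in allowed_orders(epochs):
        for crash_point in range(len(order) + 1):
            state = dict(checkpoint)                 # restore the checkpoint
            for addr, value in order[:crash_point]:  # power fails after this prefix
                state[addr] = value
            if not check(state):
                return order, crash_point
    return None


def consistent(state):
    """User-defined test: a set valid flag implies the data was written."""
    return state["valid"] == 0 or state["data"] == 7


start = {"data": 0, "valid": 0}
# No fence between the two writes: they share an epoch and may be reordered.
unfenced = validate(start, [[("data", 7), ("valid", 1)]], consistent)
# A fence between them: each write gets its own epoch.
fenced = validate(start, [[("data", 7)], [("valid", 1)]], consistent)
```

In this sketch the unfenced trace fails as soon as the flag can reach memory before the data, while the fenced trace passes every crash point.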

    PROCESS ADDRESS SPACE IDENTIFIER VIRTUALIZATION USING HARDWARE PAGING HINT

    Publication number: US20210271481A1

    Publication date: 2021-09-02

    Application number: US17253053

    Filing date: 2018-12-21

    Abstract: Process address space identifier (PASID) virtualization using a hardware paging hint is described. A processing device (100) comprises a processing core (110) and a translation circuit coupled to the processing core. The translation circuit is to: receive a workload instruction from a guest application being executed by the processing device, the workload instruction comprising an untranslated guest process address space identifier (gPASID), a workload for an input/output (I/O) target device, and an identifier of a submission register on the I/O target device (410); access a paging data structure (PDS) associated with the guest application to retrieve a page table entry corresponding to the gPASID and the identifier of the submission register (420); determine a value of an I/O hint bit of the page table entry corresponding to the gPASID and the identifier of the submission register (430); responsive to determining that the I/O hint bit is enabled, keep the untranslated gPASID in the workload instruction (440); and provide the workload instruction to a work queue of the I/O target device (450).
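
The decision flow in this abstract (steps 410 to 450) can be sketched as below. The abstract only says what happens when the I/O hint bit is enabled. The other branch here, which replaces the gPASID with a host PASID from a lookup table, is an assumption of this sketch, as are all class, field, and parameter names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PageTableEntry:
    io_hint: bool


@dataclass(frozen=True)
class WorkloadInstruction:
    pasid: int                 # arrives as an untranslated guest PASID
    submission_register: int
    workload: str


def submit(instr, paging_structure, host_pasid_table, work_queue):
    """Check the hint bit, optionally translate the PASID, then enqueue."""
    pte = paging_structure[(instr.pasid, instr.submission_register)]
    if not pte.io_hint:
        # Assumption, not stated in the abstract: without the hint,
        # the guest PASID is replaced by a host PASID before submission.
        instr = replace(instr, pasid=host_pasid_table[instr.pasid])
    work_queue.append(instr)
    return instr


paging = {(5, 0x40): PageTableEntry(io_hint=True),
          (6, 0x80): PageTableEntry(io_hint=False)}
host_pasids = {5: 105, 6: 106}
queue = []
kept = submit(WorkloadInstruction(5, 0x40, "copy"), paging, host_pasids, queue)
translated = submit(WorkloadInstruction(6, 0x80, "copy"), paging, host_pasids, queue)
```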

    Highly scalable accelerator
    Granted patent

    Publication number: US11106613B2

    Publication date: 2021-08-31

    Application number: US15940128

    Filing date: 2018-03-29

    Abstract: Embodiments of apparatuses, methods, and systems for highly scalable accelerators are described. In an embodiment, an apparatus includes an interface to receive a plurality of work requests from a plurality of clients and a plurality of engines to perform the plurality of work requests. The work requests are to be dispatched to the plurality of engines from a plurality of work queues. The work queues are to store a work descriptor per work request. Each work descriptor is to include all information needed to perform a corresponding work request.

    TECHNOLOGIES FOR OFFLOAD DEVICE FETCHING OF ADDRESS TRANSLATIONS

    Publication number: US20210149815A1

    Publication date: 2021-05-20

    Application number: US17129496

    Filing date: 2020-12-21

    Abstract: Techniques for offload device address translation fetching are disclosed. In the illustrative embodiment, a processor of a compute device sends a translation fetch descriptor to an offload device before sending a corresponding work descriptor to the offload device. The offload device can request translations for virtual memory addresses and cache the corresponding physical addresses for later use. While the offload device is fetching virtual address translations, the compute device can perform other tasks before sending the corresponding work descriptor, including operations that modify the contents of the memory addresses whose translations are being cached. Even if the offload device does not cache the translations, the fetching can warm up the cache in a translation lookaside buffer. Such an approach can reduce the latency overhead that the offload device may otherwise incur in sending memory address translation requests that would be required to execute the work descriptor.
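
The latency benefit described in this abstract can be illustrated by counting translation requests with and without a prior fetch descriptor. This is a toy model: the `OffloadDevice` class, its methods, and the dictionary standing in for the host's translation service are hypothetical names introduced for this sketch.

```python
class OffloadDevice:
    def __init__(self, page_table):
        self.page_table = page_table     # stand-in for the host translation service
        self.cache = {}                  # virtual address -> physical address
        self.translation_requests = 0    # round trips to the host

    def _translate(self, vaddr):
        if vaddr not in self.cache:
            self.translation_requests += 1
            self.cache[vaddr] = self.page_table[vaddr]
        return self.cache[vaddr]

    def fetch_translations(self, vaddrs):
        """Handle a translation fetch descriptor: warm the cache only."""
        for vaddr in vaddrs:
            self._translate(vaddr)

    def execute(self, vaddrs):
        """Handle a work descriptor; report how many translations it waited on."""
        before = self.translation_requests
        paddrs = [self._translate(vaddr) for vaddr in vaddrs]
        return paddrs, self.translation_requests - before


page_table = {0x1000: 0x9000, 0x2000: 0xA000}
buffers = [0x1000, 0x2000]

warm = OffloadDevice(page_table)
warm.fetch_translations(buffers)       # sent ahead of the work descriptor
_, warm_stalls = warm.execute(buffers)

cold = OffloadDevice(page_table)
_, cold_stalls = cold.execute(buffers)
```

In this model the prefetched device resolves both addresses from its cache, so the work descriptor waits on no translation requests, while the cold device waits on one per address.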