Accelerator for Gather-Update-Scatter Operations

    Publication number: US20180165381A1

    Publication date: 2018-06-14

    Application number: US15376172

    Filing date: 2016-12-12

    CPC classification number: G06F17/30982 G06F9/30036

    Abstract: A processor may include a gather-update-scatter accelerator, and circuitry to direct an instruction to the accelerator for execution. The instruction may include a search index, an operation to be performed, and a scalar data value. The accelerator may include a content-associative memory (CAM) storing multiple entries, each of which stores a respective index key and a data value associated with the index key. The accelerator may include: a CAM controller, including circuitry to select, based on the information in the instruction, one of the plurality of entries in the CAM on which to operate; an arithmetic logic unit (ALU), including circuitry to perform an arithmetic or logical operation on the selected entry, the operation being dependent on the information in the instruction; and circuitry to store a result of the operation in the selected entry in the CAM.
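
    Illustrative sketch (not from the patent): the behavior described in this abstract can be modeled in software as a small keyed store plus an ALU step. The class name `GusAccelerator`, the operation names, the capacity limit, and the handling of a missing key (allocate a new entry, or report failure when full) are all assumptions made for illustration; the abstract only states that an entry is selected, operated on, and written back.

```python
import operator


class GusAccelerator:
    """Toy software model of a CAM-backed gather-update-scatter unit (hypothetical)."""

    # Assumed set of ALU operations; the abstract only says "arithmetic or logical".
    OPS = {
        "add": operator.add,
        "min": min,
        "max": max,
        "and": operator.and_,
        "or": operator.or_,
    }

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = {}  # index key -> data value, standing in for CAM entries

    def execute(self, search_index, op, scalar):
        """Apply `op` with `scalar` to the entry whose key matches `search_index`."""
        alu = self.OPS[op]
        if search_index in self.entries:
            # Gather the stored value, update it in the ALU, scatter it back.
            self.entries[search_index] = alu(self.entries[search_index], scalar)
            return True
        if len(self.entries) >= self.capacity:
            return False  # assumption: a full CAM reports the miss to the caller
        self.entries[search_index] = scalar  # assumption: a miss allocates an entry
        return True


acc = GusAccelerator(capacity=2)
acc.execute(7, "add", 3)
acc.execute(7, "add", 4)
acc.execute(9, "max", 1)
acc.execute(9, "max", 5)
```

    In this model, repeated updates to the same key accumulate in a single entry, which is the access pattern a gather-update-scatter loop produces on sparse data.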

    Programmable memory prefetcher for prefetching multiple cache lines based on data in a prefetch engine control register

    Publication number: US10452551B2

    Publication date: 2019-10-22

    Application number: US15376242

    Filing date: 2016-12-12

    Abstract: A processor may include a programmable memory prefetcher that includes a programmable hardware prefetch engine and a prefetch engine control register. The programmable memory prefetcher may include circuitry and may be configured to receive, during execution of an application, a first instruction for configuring the prefetch engine for prefetching multiple cache lines to be accessed in the future, at predictable locations, by the application; to store, in the prefetch engine control register, dependent on information in the first instruction, data representing an amount of prefetching to be performed, and data representing a stride distance between consecutive cache lines to be prefetched; to receive a second instruction for prefetching a single cache line whose location is identified in the second instruction; and to initiate, in response to receiving the second instruction, prefetching of multiple cache lines by the prefetch engine, to be performed in parallel with execution of the application and in accordance with the data stored in the prefetch engine control register. The prefetch engine control register may store multiple entries, each including an identifier of a given operation to prefetch multiple cache lines. An instruction may also be received to disable prefetching of multiple cache lines. The multiple cache lines may be prefetched from a last-level cache (LLC) to a mid-level cache.
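
    Illustrative sketch (not from the patent): the two-instruction flow in this abstract can be modeled as a configure step that fills a control register, followed by a single-line prefetch request that expands into several line addresses. The 64-byte line size, the class and method names, expressing the stride in cache lines, and counting the requested line as the first of the prefetched lines are assumptions for illustration.

```python
CACHE_LINE_BYTES = 64  # assumption: the abstract does not state a line size


class PrefetchEngine:
    """Toy model of a prefetch engine driven by a control register (hypothetical)."""

    def __init__(self):
        self.enabled = False
        self.count = 0         # amount of prefetching, in cache lines
        self.stride_lines = 0  # distance between consecutive lines, in lines

    def configure(self, count, stride_lines):
        """Model of the first instruction: store amount and stride."""
        self.count = count
        self.stride_lines = stride_lines
        self.enabled = True

    def disable(self):
        """Model of the instruction that disables multi-line prefetching."""
        self.enabled = False

    def prefetch(self, address):
        """Model of the second instruction: return the line addresses to fetch."""
        base = address - (address % CACHE_LINE_BYTES)  # align to the line start
        if not self.enabled:
            return [base]  # only the single requested line
        step = self.stride_lines * CACHE_LINE_BYTES
        return [base + i * step for i in range(self.count)]


engine = PrefetchEngine()
engine.configure(count=4, stride_lines=2)
lines = engine.prefetch(0x1010)
```

    With these settings, one request for address 0x1010 expands to four line addresses spaced 128 bytes apart, starting at the aligned address 0x1000.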

    Apparatus and method for processing sparse data

    Publication number: US10437562B2

    Publication date: 2019-10-08

    Application number: US15394968

    Filing date: 2016-12-30

    Abstract: An apparatus and method are described for designing an accelerator for processing sparse data. For example, one embodiment comprises a machine-readable medium having program code stored thereon which, when executed by a processor, causes the processor to perform the operations of: analyzing input graph program code and parameters associated with a target accelerator in view of an accelerator architecture template; responsively mapping the parameters onto the architecture template to implement customizations to the accelerator architecture template; and generating a hardware description representation of the target accelerator based on the determined mapping of the parameters to apply to the accelerator architecture template.
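
    Illustrative sketch (not from the patent): the map-then-generate flow in this abstract can be pictured as filling a parameterized template and emitting hardware description text. Everything concrete here is hypothetical: the Verilog-style template, the parameter names `NUM_PES` and `VALUE_BITS`, the mapping rule, and the dictionaries standing in for the result of analyzing the graph program and the target accelerator. The analysis step itself is not modeled.

```python
from string import Template

# Hypothetical architecture template; module and parameter names are illustrative.
ARCH_TEMPLATE = Template(
    "module sparse_accel #(\n"
    "  parameter NUM_PES = $num_pes,\n"
    "  parameter VALUE_BITS = $value_bits\n"
    ") ();\n"
    "endmodule\n"
)


def map_parameters(program_info, target):
    """Pick template customizations from program properties and target limits."""
    return {
        # Assumed rule: use as many processing elements as the program can
        # keep busy, capped by what the target device can hold.
        "num_pes": min(program_info["parallel_edges"], target["max_pes"]),
        "value_bits": program_info["value_bits"],
    }


def generate_hdl(program_info, target):
    """Apply the mapped parameters to the template and return HDL text."""
    return ARCH_TEMPLATE.substitute(map_parameters(program_info, target))


hdl = generate_hdl(
    {"parallel_edges": 16, "value_bits": 32},  # stand-in for program analysis
    {"max_pes": 8},                            # stand-in for target parameters
)
```

    Keeping the mapping step separate from text generation mirrors the abstract's split between determining the customizations and producing the hardware description.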
