Data compression in processor caches
    Invention Grant (Granted)

    Publication number: US09251096B2

    Publication date: 2016-02-02

    Application number: US14036673

    Filing date: 2013-09-25

    CPC classification number: G06F12/126 G06F12/0895

    Abstract: In an embodiment, a processor includes a cache data array including a plurality of physical ways, each physical way to store a baseline way and a victim way; a cache tag array including a plurality of tag groups, each tag group associated with a particular physical way and including a first tag associated with the baseline way stored in the particular physical way, and a second tag associated with the victim way stored in the particular physical way; and cache control logic to: select a first baseline way based on a replacement policy, select a first victim way based on an available capacity of a first physical way including the first victim way, and move a first data element from the first baseline way to the first victim way. Other embodiments are described and claimed.
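    The selection logic in the abstract can be modeled in software. The sketch below is a minimal illustration, not the patented implementation: the class names, the fixed way capacity, and the "pick the way with the most spare capacity" tie-break are all assumptions made for the example.

```python
WAY_CAPACITY = 64  # bytes per physical way (assumed size for the sketch)

class PhysicalWay:
    """One physical way holding a baseline way plus an optional victim way."""
    def __init__(self):
        self.baseline = None  # (tag, data, size) of the baseline way
        self.victim = None    # (tag, data, size) of the victim way

    def available_capacity(self):
        used = (self.baseline[2] if self.baseline else 0) + \
               (self.victim[2] if self.victim else 0)
        return WAY_CAPACITY - used

def evict_to_victim(ways, baseline_index):
    """Move the baseline line at baseline_index into the victim slot of
    another physical way, chosen by available capacity (per the claim)."""
    tag, data, size = ways[baseline_index].baseline
    candidates = [w for w in ways
                  if w is not ways[baseline_index]
                  and w.victim is None
                  and w.available_capacity() >= size]
    if not candidates:
        return False  # no spare capacity: fall back to normal eviction (not shown)
    target = max(candidates, key=lambda w: w.available_capacity())
    target.victim = (tag, data, size)
    ways[baseline_index].baseline = None
    return True
```

    In hardware the second tag of each tag group makes the victim copy findable on lookup; the model above only shows the capacity-based placement step.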


    Segmented branch target buffer based on branch instruction type

    Publication number: US12190114B2

    Publication date: 2025-01-07

    Application number: US17130016

    Filing date: 2020-12-22

    Abstract: In one embodiment, a processor includes a branch predictor to predict whether a branch instruction is to be taken and a branch target buffer (BTB) coupled to the branch predictor. The branch target buffer may be segmented into a first cache portion and a second cache portion, where, in response to an indication that the branch is to be taken, the BTB is to access an entry in one of the first cache portion and the second cache portion based at least in part on a type of the branch instruction, an occurrence frequency of the branch instruction, and spatial information regarding a distance between a target address of a target of the branch instruction and an address of the branch instruction. Other embodiments are described and claimed.
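    A possible routing policy for the segmented BTB can be sketched as below. The thresholds, segment names, and the specific combination rule are invented for illustration; the abstract only says the choice depends on branch type, occurrence frequency, and the distance between branch and target addresses.

```python
NEAR_DISTANCE = 4096   # bytes; "near target" cutoff (assumed)
HOT_THRESHOLD = 16     # executions before a branch counts as hot (assumed)

def select_btb_segment(branch_type, frequency, branch_addr, target_addr):
    """Route a taken branch's BTB entry to one of two segments.

    Hot, direct, short-distance branches go to the first (small, fast)
    portion; everything else goes to the second (larger) portion.
    """
    near = abs(target_addr - branch_addr) <= NEAR_DISTANCE
    if branch_type == "direct" and frequency >= HOT_THRESHOLD and near:
        return "first"
    return "second"
```

    Storing near-target entries separately lets the first segment hold compact target offsets instead of full addresses, which is one plausible motivation for using spatial information in the split.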

    Dynamic shared cache partition for workload with large code footprint

    Publication number: US12066945B2

    Publication date: 2024-08-20

    Application number: US17130698

    Filing date: 2020-12-22

    Abstract: An embodiment of an integrated circuit may comprise a core, a first level core cache memory coupled to the core, a shared core cache memory coupled to the core, a first cache controller coupled to the core and communicatively coupled to the first level core cache memory, a second cache controller coupled to the core and communicatively coupled to the shared core cache memory, and circuitry coupled to the core and communicatively coupled to the first cache controller and the second cache controller to determine if a workload has a large code footprint, and, if so determined, partition N ways of the shared core cache memory into first and second chunks of ways with the first chunk of M ways reserved for code cache lines from the workload and the second chunk of N minus M ways reserved for data cache lines from the workload, where N and M are positive integer values and N minus M is greater than zero. Other embodiments are disclosed and claimed.
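    The partitioning arithmetic is straightforward to model. In this sketch the footprint threshold and the way-sizing rule are assumptions; the claim itself only requires that M ways go to code, N minus M ways go to data, and N minus M stays greater than zero.

```python
def partition_ways(n_ways, code_footprint_bytes, way_size_bytes,
                   large_threshold_bytes=2 * 1024 * 1024):
    """Return (code_way_indices, data_way_indices) for a shared cache.

    If the workload's code footprint is not "large", leave the cache
    unpartitioned. Otherwise reserve M ways for code (enough to cover the
    footprint) and N - M ways for data, keeping N - M > 0.
    """
    if code_footprint_bytes < large_threshold_bytes:
        return [], list(range(n_ways))  # unpartitioned: all ways shared
    # Ceiling division, capped so at least one way remains for data.
    m = min(n_ways - 1, -(-code_footprint_bytes // way_size_bytes))
    return list(range(m)), list(range(m, n_ways))
```

    For example, a 3 MiB code footprint on a 16-way cache with 512 KiB ways reserves 6 ways for code and leaves 10 for data.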

    Apparatus and method for hardware-based memoization of function calls to reduce instruction execution

    Publication number: US12020033B2

    Publication date: 2024-06-25

    Application number: US17133899

    Filing date: 2020-12-24

    CPC classification number: G06F9/3836 G06F9/223 G06F9/3838

    Abstract: Apparatus and method for memoizing repeat function calls are described herein. An apparatus embodiment includes: uop buffer circuitry to identify a function for memoization based on retiring micro-operations (uops) from a processing pipeline; memoization retirement circuitry to generate a signature of the function which includes input and output data of the function; a memoization data structure to store the signature; and predictor circuitry to detect an instance of the function to be executed by the processing pipeline and to responsively exclude a first subset of uops associated with the instance from execution when a confidence level associated with the function is above a threshold. One or more instructions that are data-dependent on execution of the instance are then provided with the output data of the function from the memoization data structure.
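    A software analogue of the described flow looks like this. The table layout, key format, and confidence threshold are invented for the sketch; in the patent these would be hardware structures fed by retiring uops, not a Python dictionary.

```python
CONFIDENCE_THRESHOLD = 2  # repeats before eliding execution (assumed)

class MemoTable:
    """Maps (function, inputs) signatures to stored outputs with confidence."""
    def __init__(self):
        self.entries = {}  # (func_id, inputs) -> {"output": ..., "confidence": n}

    def lookup_or_execute(self, func_id, func, inputs):
        """Return (output, elided): elided is True when execution was skipped."""
        key = (func_id, inputs)
        entry = self.entries.get(key)
        if entry and entry["confidence"] >= CONFIDENCE_THRESHOLD:
            # High confidence: exclude the function's uops from execution and
            # forward the stored output to dependent instructions.
            return entry["output"], True
        output = func(*inputs)  # execute normally; observe the result at retire
        if entry and entry["output"] == output:
            entry["confidence"] += 1  # same signature repeated: build confidence
        else:
            self.entries[key] = {"output": output, "confidence": 1}
        return output, False
```

    The confidence counter matters because a function whose output varies across calls (side effects, hidden inputs) must not be elided; the model above resets the entry whenever the observed output disagrees with the stored one.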

    Data relocation for inline metadata

    Publication number: US11972126B2

    Publication date: 2024-04-30

    Application number: US17472272

    Filing date: 2021-09-10

    Abstract: Technologies disclosed herein provide one example of a system that includes processor circuitry to be communicatively coupled to a memory circuitry. The processor circuitry is to receive a memory access request corresponding to an application for access to an address range in a memory allocation of the memory circuitry and to locate a metadata region within the memory allocation. The processor circuitry is also to, in response to a determination that the address range includes at least a portion of the metadata region, obtain first metadata stored in the metadata region, use the first metadata to determine an alternate memory address in a relocation region, and read, at the alternate memory address, displaced data from the portion of the metadata region included in the address range of the memory allocation. The address range includes one or more bytes of an expected allocation region of the memory allocation.
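    The redirection in the abstract can be shown with a small memory model. Everything concrete here (the allocation layout, the dictionary-as-memory, byte-granularity reads) is an assumption for illustration; the point is only the address translation for accesses that overlap the inline metadata region.

```python
class Allocation:
    """A memory allocation with an inline metadata region whose displaced
    data lives at an alternate address in a relocation region."""
    def __init__(self, size, meta_offset, meta_size, relocation_addr, memory):
        self.size = size
        self.meta_offset = meta_offset          # inline metadata start
        self.meta_size = meta_size
        self.relocation_addr = relocation_addr  # where displaced bytes live
        self.memory = memory                    # address -> byte value

def read_byte(alloc, offset):
    """Read one byte of the allocation. Reads that fall inside the metadata
    region are redirected to the displaced copy in the relocation region."""
    meta_end = alloc.meta_offset + alloc.meta_size
    if alloc.meta_offset <= offset < meta_end:
        # The requested range includes part of the metadata region: use the
        # (meta-derived) alternate address to fetch the displaced data.
        return alloc.memory[alloc.relocation_addr + (offset - alloc.meta_offset)]
    return alloc.memory[offset]
```

    In the claimed system the alternate address comes from the first metadata read, so the application sees a contiguous allocation even though some of its bytes physically live elsewhere.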

    Apparatuses, methods, and systems for a duplication resistant on-die irregular data prefetcher

    Publication number: US11847053B2

    Publication date: 2023-12-19

    Application number: US16833419

    Filing date: 2020-03-27

    Abstract: Systems, methods, and apparatuses relating to circuitry to implement a duplication resistant on-die irregular data prefetcher are described. In one embodiment, a hardware processor includes a cache to store a plurality of cache lines of data, a processing element to execute instructions to generate memory requests, and a prefetch circuit to track a first set of cache lines, requested to be accessed by the memory requests, that repeat in a first number of executed instructions, track a second set of cache lines, requested to be accessed by the memory requests, that repeat in a second, larger number of executed instructions, detect a memory request from an instruction for a cache line from the cache, determine if the cache line is within the first set of cache lines or the second set of cache lines, update first correlation data for the cache line when the cache line is within the first set of cache lines, and update second correlation data for the cache line when the cache line is within the second set of cache lines.
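    The two-window repeat tracking can be modeled as below. The window lengths, the set representation, and the predecessor-based correlation update are all assumptions for the sketch; the abstract specifies only that lines repeating within a short instruction window and lines repeating within a longer window are tracked separately, each with its own correlation data.

```python
SHORT_WINDOW = 100   # instructions (assumed)
LONG_WINDOW = 1000   # instructions (assumed)

class IrregularPrefetcher:
    """Tracks cache-line repeats over two instruction windows and keeps
    separate correlation tables for each set, avoiding duplicated entries."""
    def __init__(self):
        self.last_seen = {}   # line -> instruction count at last access
        self.short_set = set()  # lines repeating within SHORT_WINDOW
        self.long_set = set()   # lines repeating only within LONG_WINDOW
        self.short_corr = {}    # line -> set of correlated successor lines
        self.long_corr = {}
        self.prev_line = None

    def access(self, line, instr_count):
        if line in self.last_seen:
            gap = instr_count - self.last_seen[line]
            if gap <= SHORT_WINDOW:
                self.short_set.add(line)
                self.long_set.discard(line)  # keep the sets disjoint
            elif gap <= LONG_WINDOW and line not in self.short_set:
                self.long_set.add(line)
        self.last_seen[line] = instr_count
        # Update correlation data in the table matching the line's set.
        if self.prev_line is not None:
            if line in self.short_set:
                self.short_corr.setdefault(self.prev_line, set()).add(line)
            elif line in self.long_set:
                self.long_corr.setdefault(self.prev_line, set()).add(line)
        self.prev_line = line
```

    Keeping the two sets disjoint is one way to realize the "duplication resistant" property: a line promoted to the short-repeat set stops consuming space in the long-window correlation table.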
