DEVICE, SYSTEM AND METHOD FOR SELECTIVELY DROPPING SOFTWARE PREFETCH INSTRUCTIONS

    公开(公告)号:US20220197821A1

    公开(公告)日:2022-06-23

    申请号:US17133414

    申请日:2020-12-23

    Abstract: Techniques and mechanisms for providing information to determine whether a software prefetch instruction is to be executed. In an embodiment, one or more entries of a translation lookaside buffer (TLB) each include a respective value which indicates whether, according to one or more criteria, corresponding data has been sufficiently utilized. Insufficiently utilized data is indicated in a TLB entry with an identifier of an executed instruction to prefetch the corresponding data. An eviction of the TLB entry results in the creation of an entry in a registry of prefetch instructions. The entry in the registry includes the identifier of the executed prefetch instruction, and a value indicating a number of times that one or more future prefetch instructions are to be dropped. In another embodiment, execution of a subsequent prefetch instruction—which also corresponds to the identifier—is prevented based on the registry entry.

    APPARATUS, METHOD, AND SYSTEM FOR ENHANCED DATA PREFETCHING BASED ON NON-UNIFORM MEMORY ACCESS (NUMA) CHARACTERISTICS

    公开(公告)号:US20200233806A1

    公开(公告)日:2020-07-23

    申请号:US16837833

    申请日:2020-04-01

    Abstract: Apparatus, method, and system for enhancing data prefetching based on non-uniform memory access (NUMA) characteristics are described herein. An apparatus embodiment includes a system memory, a cache, and a prefetcher. The system memory includes multiple memory regions, at least some of which are associated with different NUMA characteristic (access latency, bandwidth, etc.) than others. Each region is associated with its own set of prefetch parameters that are set in accordance to their respective NUMA characteristics. The prefetcher monitors data accesses to the cache and generates one or more prefetch requests to fetch data from the system memory to the cache based on the monitored data accesses and the set of prefetch parameters associated with the memory region from which data is to be fetched. The set of prefetcher parameters may include prefetch distance, training-to-stable threshold, and throttle threshold.

    APPARATUS, METHOD, AND SYSTEM FOR ENHANCED DATA PREFETCHING BASED ON NON-UNIFORM MEMORY ACCESS (NUMA) CHARACTERISTICS

    公开(公告)号:US20200004684A1

    公开(公告)日:2020-01-02

    申请号:US16024527

    申请日:2018-06-29

    Abstract: Apparatus, method, and system for enhancing data prefetching based on non-uniform memory access (NUMA) characteristics are described herein. An apparatus embodiment includes a system memory, a cache, and a prefetcher. The system memory includes multiple memory regions, at least some of which are associated with different NUMA characteristic (access latency, bandwidth, etc.) than others. Each region is associated with its own set of prefetch parameters that are set in accordance to their respective NUMA characteristics. The prefetcher monitors data accesses to the cache and generates one or more prefetch requests to fetch data from the system memory to the cache based on the monitored data accesses and the set of prefetch parameters associated with the memory region from which data is to be fetched. The set of prefetcher parameters may include prefetch distance, training-to-stable threshold, and throttle threshold.

    Device, system and method for selectively dropping software prefetch instructions

    公开(公告)号:US12111772B2

    公开(公告)日:2024-10-08

    申请号:US17133414

    申请日:2020-12-23

    Abstract: Techniques and mechanisms for providing information to determine whether a software prefetch instruction is to be executed. In an embodiment, one or more entries of a translation lookaside buffer (TLB) each include a respective value which indicates whether, according to one or more criteria, corresponding data has been sufficiently utilized. Insufficiently utilized data is indicated in a TLB entry with an identifier of an executed instruction to prefetch the corresponding data. An eviction of the TLB entry results in the creation of an entry in a registry of prefetch instructions. The entry in the registry includes the identifier of the executed prefetch instruction, and a value indicating a number of times that one or more future prefetch instructions are to be dropped. In another embodiment, execution of a subsequent prefetch instruction—which also corresponds to the identifier—is prevented based on the registry entry.

    VISUALIZING MEMORY BANDWIDTH UTILIZATION USING MEMORY BANDWIDTH STACK

    公开(公告)号:US20220283719A1

    公开(公告)日:2022-09-08

    申请号:US17824413

    申请日:2022-05-25

    Abstract: An apparatus to facilitate generating a memory bandwidth stack for visualizing memory bandwidth utilization is disclosed. The apparatus includes processors to receive data corresponding to a memory cycle occurring during a total execution time of an application executed by the one or more processors; for the memory cycle, assign the memory cycle to a component of a bandwidth stack based on analysis of the data and in accordance with a prioritization scheme; for the component, determine a portion of the bandwidth stack to account to the component based at least in part on the assignment of the memory cycle to the component; and generate the bandwidth stack by at least representing the portion accounted to the component in the bandwidth stack.

Patent Agency Ranking