-
Publication No.: US11561906B2
Publication date: 2023-01-24
Application No.: US15839089
Filing date: 2017-12-12
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: William L. Walker , William E. Jones
IPC: G06F12/12 , G06F12/128 , G06F12/0882
Abstract: A processing system rinses, from a cache, those cache lines that share the same memory page as a cache line identified for eviction. A cache controller of the processing system identifies a cache line as scheduled for eviction. In response, the cache controller identifies additional “dirty victim” cache lines (cache lines that have been modified at the cache and not yet written back to memory) that are associated with the same memory page, and writes each of the identified cache lines to the same memory page. By writing each of the dirty victim cache lines associated with the memory page to memory, the processing system reduces memory overhead and improves processing efficiency.
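The rinse step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the page size, line size, dictionary-based cache model, and the names `page_of` and `rinse_on_eviction` are all assumptions made for illustration, as is the choice to keep rinsed lines in the cache as clean lines.

```python
PAGE_SIZE = 4096   # assumed page size in bytes
LINE_SIZE = 64     # assumed cache-line size in bytes (informational only)


def page_of(addr):
    """Return the number of the memory page that contains a byte address."""
    return addr // PAGE_SIZE


def rinse_on_eviction(cache, victim_addr, memory):
    """Write back every dirty line that shares the victim's memory page.

    `cache` maps line address -> (data, dirty flag) and `memory` maps
    line address -> data. Returns the sorted addresses written back.
    """
    page = page_of(victim_addr)
    written = []
    for addr in sorted(cache):
        data, dirty = cache[addr]
        if dirty and page_of(addr) == page:
            memory[addr] = data
            cache[addr] = (data, False)  # the line is now clean
            written.append(addr)
    # Assumption: only the victim leaves the cache; rinsed lines stay, clean.
    cache.pop(victim_addr, None)
    return written
```

Grouping the write-backs by page means the memory page is opened once for several lines rather than once per eviction, which is the efficiency gain the abstract refers to.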
-
Publication No.: US11294810B2
Publication date: 2022-04-05
Application No.: US15838809
Filing date: 2017-12-12
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: William L. Walker , William E. Jones
IPC: G06F12/08 , G06F12/0862 , G06F11/30 , G06F13/16 , G06F12/0811
Abstract: A processing system includes an interconnect fabric coupleable to a local memory and at least one compute cluster coupled to the interconnect fabric. The compute cluster includes a processor core and a cache hierarchy. The cache hierarchy has a plurality of caches and a throttle controller configured to throttle a rate of memory requests issuable by the processor core based on at least one of an access latency metric and a prefetch accuracy metric. The access latency metric represents an average access latency for memory requests for the processor core and the prefetch accuracy metric represents an accuracy of a prefetcher of a cache of the cache hierarchy.
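As a rough illustration of throttling on the two metrics named in this abstract, the sketch below shrinks a request budget when average latency is high or prefetch accuracy is low. The function name `throttle_limit`, the thresholds, the halving policy, and the use of an outstanding-request budget to model "rate" are all assumptions; the abstract does not specify them.

```python
def throttle_limit(avg_latency, prefetch_accuracy, max_requests=16,
                   latency_threshold=200.0, accuracy_threshold=0.5):
    """Return how many memory requests the core may have in flight.

    All numeric defaults are hypothetical values chosen for illustration.
    """
    limit = max_requests
    if avg_latency > latency_threshold:
        limit //= 2  # memory looks congested: halve the budget
    if prefetch_accuracy < accuracy_threshold:
        limit //= 2  # prefetches are mostly wasted: halve it again
    return max(limit, 1)  # never throttle the core to zero requests
```

For example, `throttle_limit(300, 0.2)` applies both reductions, while `throttle_limit(100, 0.9)` leaves the budget untouched.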
-
Publication No.: US20170083435A1
Publication date: 2017-03-23
Application No.: US15366952
Filing date: 2016-12-01
Applicant: Advanced Micro Devices, Inc.
Inventor: William L. Walker
IPC: G06F12/02
CPC classification number: G06F12/023 , G06F9/5016 , G06F12/084 , G06F2212/1008 , G06F2212/6042
Abstract: Apparatus and method embodiments for dynamically allocating cache space in a multi-threaded execution environment are disclosed. In some embodiments, a processor includes a cache shared by each of a plurality of processor cores and/or each of a plurality of threads executing on the processor. The processor further includes a cache allocation circuit configured to dynamically allocate space in the cache provided to each of the plurality of processor cores based on their respective usage patterns. The cache allocation unit may track cache usage by each of the processor cores/threads using subsets of usage bits and counters configured to update states of the usage bits. The cache allocation circuit may track the usage of cache space by the processor cores/threads and may allocate more space to those that exhibit more usage of the cache.
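The idea of giving more cache space to the cores that use it more can be sketched as a proportional split of cache ways. This is only an illustration: the abstract describes usage bits and counters in hardware, whereas the sketch assumes a simple per-core usage count, a way-based cache, and a hypothetical helper named `allocate_ways`.

```python
def allocate_ways(usage_counts, total_ways, min_ways=1):
    """Split `total_ways` cache ways among cores in proportion to usage.

    `usage_counts` maps core name -> observed usage count. Every core is
    guaranteed `min_ways` (an assumed floor); the rest is shared out.
    """
    cores = list(usage_counts)
    alloc = {core: min_ways for core in cores}
    spare = total_ways - min_ways * len(cores)
    total_usage = sum(usage_counts.values())
    if spare <= 0 or total_usage == 0:
        return alloc
    shares = {c: spare * usage_counts[c] / total_usage for c in cores}
    for c in cores:
        alloc[c] += int(shares[c])
    # Hand out ways lost to rounding, largest fractional share first.
    leftover = total_ways - sum(alloc.values())
    by_fraction = sorted(cores, key=lambda c: shares[c] - int(shares[c]),
                         reverse=True)
    for c in by_fraction[:leftover]:
        alloc[c] += 1
    return alloc
```

Re-running the split whenever the usage counts change gives the dynamic behavior the abstract describes.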
-
Publication No.: US12153926B2
Publication date: 2024-11-26
Application No.: US18393657
Filing date: 2023-12-21
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: John Kalamatianos , Michael T. Clark , Marius Evers , William L. Walker , Paul Moyer , Jay Fleischman , Jagadish B. Kotra
Abstract: Processor-guided execution of offloaded instructions using fixed function operations is disclosed. Instructions designated for remote execution by a target device are received by a processor. Each instruction includes, as an operand, a target register in the target device. The target register may be an architected virtual register. For each of the plurality of instructions, the processor transmits an offload request in the order that the instructions are received. The offload request includes the instruction designated for remote execution. The target device may be, for example, a processing-in-memory device or an accelerator coupled to a memory.
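The in-order offload flow in this abstract can be sketched with a queue that forwards each instruction to a target device in arrival order. Everything named here (`OffloadInstruction`, `TargetDevice`, the `load` and `add` opcodes, `offload_in_order`) is hypothetical and stands in for hardware the abstract does not detail.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class OffloadInstruction:
    opcode: str           # hypothetical fixed-function operation name
    target_register: str  # register in the target device, used as operand
    operand: int


class TargetDevice:
    """Hypothetical stand-in for a processing-in-memory device."""

    def __init__(self):
        self.registers = {}
        self.log = []  # opcodes in the order they were executed

    def execute(self, inst):
        self.log.append(inst.opcode)
        if inst.opcode == "load":
            self.registers[inst.target_register] = inst.operand
        elif inst.opcode == "add":
            current = self.registers.get(inst.target_register, 0)
            self.registers[inst.target_register] = current + inst.operand


def offload_in_order(instructions, device):
    """Send one offload request per instruction, preserving arrival order."""
    queue = deque(instructions)
    while queue:
        device.execute(queue.popleft())
```

Preserving order matters because later instructions may depend on a target register written by an earlier one, as with the `load` followed by `add` pattern.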
-
Publication No.: US20240220409A1
Publication date: 2024-07-04
Application No.: US18090249
Filing date: 2022-12-28
Applicant: Advanced Micro Devices, Inc.
Inventor: Vydhyanathan Kalyanasundharam , Alan D. Smith , Chintan S. Patel , William L. Walker
IPC: G06F12/0802
CPC classification number: G06F12/0802 , G06F2212/1024
Abstract: The disclosed computer-implemented method includes partitioning a cache structure into a plurality of cache partitions designated by a plurality of cache types, forwarding a memory request to a cache partition corresponding to a target cache type of the memory request, and performing, using the cache partition, the memory request. Various other methods, systems, and computer-readable media are also disclosed.
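The three steps in this abstract (partition, forward, perform) can be sketched as follows. The class `PartitionedCache`, the cache type names, and the read/write request model are assumptions for illustration only.

```python
class PartitionedCache:
    """One cache structure split into partitions keyed by cache type."""

    def __init__(self, cache_types):
        # Step 1: partition the structure, one partition per cache type.
        self.partitions = {cache_type: {} for cache_type in cache_types}

    def request(self, cache_type, op, addr, value=None):
        # Step 2: forward the request to the partition for its target type.
        partition = self.partitions[cache_type]  # KeyError if type unknown
        # Step 3: perform the request using that partition only.
        if op == "write":
            partition[addr] = value
            return None
        if op == "read":
            return partition.get(addr)  # None models a miss
        raise ValueError(f"unsupported operation: {op}")
```

Because each request touches only its own partition, a write to one cache type never disturbs data held for another type, even at the same address.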
-
Publication No.: US10282295B1
Publication date: 2019-05-07
Application No.: US15825880
Filing date: 2017-11-29
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: William L. Walker , Michael W. Boyer , Yasuko Eckert , Gabriel H. Loh
IPC: G06F12/08 , G06F12/0817 , G06F12/0831 , G06F12/0811 , G06F12/128
Abstract: A method includes monitoring, at a cache coherence directory, states of cachelines stored in a cache hierarchy of a data processing system using a plurality of entries of the cache coherence directory. Each entry of the cache coherence directory is associated with a corresponding cache page of a plurality of cache pages, and each cache page represents a corresponding set of contiguous cachelines. The method further includes selectively evicting cachelines from a first cache of the cache hierarchy based on cacheline utilization densities of cache pages represented by the corresponding entries of the plurality of entries of the cache coherence directory.
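A density-based eviction choice might look like the sketch below. The abstract says only that eviction is "based on" utilization density; evicting from the sparsest page first is an assumption, as are the directory model (page id mapped to a set of cached line indices) and the names `pick_eviction_page` and `evict_sparse_page`.

```python
def pick_eviction_page(directory, lines_per_page):
    """Return the non-empty page with the lowest utilization density.

    `directory` maps page id -> set of line indices currently cached.
    Density is cached lines divided by `lines_per_page`; ties go to the
    lower page id. Returns None when no page holds any cached line.
    """
    candidates = {p: lines for p, lines in directory.items() if lines}
    if not candidates:
        return None
    return min(candidates,
               key=lambda p: (len(candidates[p]) / lines_per_page, p))


def evict_sparse_page(directory, lines_per_page):
    """Evict all cached lines of the sparsest page; return what was evicted."""
    page = pick_eviction_page(directory, lines_per_page)
    if page is None:
        return []
    evicted = sorted(directory[page])
    directory[page] = set()  # the directory entry now tracks no lines
    return [(page, line) for line in evicted]
```

Under this assumed policy, clearing a sparsely used page frees its directory entry at the cost of very few cached lines.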
-
Publication No.: US20180067856A1
Publication date: 2018-03-08
Application No.: US15256950
Filing date: 2016-09-06
Applicant: Advanced Micro Devices, Inc.
Inventor: William L. Walker
IPC: G06F12/0811 , G06F12/128 , G06F12/0831
CPC classification number: G06F12/128 , G06F2212/283 , G06F2212/69 , G06F2212/70
Abstract: A system for managing cache utilization includes a processor core, a lower-level cache, and a higher-level cache. In response to activating the higher-level cache, the system counts lower-level cache victims evicted from the lower-level cache. While a count of the lower-level cache victims is not greater than a threshold number, the system transfers each lower-level cache victim to a system memory without storing the lower-level cache victim to the higher-level cache. When the count of the lower-level cache victims is greater than the threshold number, the system writes each lower-level cache victim to the higher-level cache. In this manner, if the higher-level cache is deactivated before the threshold number of lower-level cache victims is reached, the higher-level cache is empty and thus may be deactivated without flushing.
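The counting scheme in this abstract can be sketched as a small state machine. The class name `VictimFilter` and the dictionary models of the higher-level cache and system memory are assumptions made for illustration.

```python
class VictimFilter:
    """Route lower-level cache victims based on a count since activation."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.higher_level = {}  # models the higher-level cache contents
        self.memory = {}        # models system memory

    def activate(self):
        """Activating the higher-level cache restarts the victim count."""
        self.count = 0

    def on_victim(self, addr, data):
        """Handle one victim and report where it was sent."""
        self.count += 1
        if self.count > self.threshold:
            self.higher_level[addr] = data
            return "higher_level"
        # Count not yet above the threshold: bypass the higher-level cache.
        self.memory[addr] = data
        return "memory"

    def can_deactivate_without_flush(self):
        """True while the higher-level cache still holds nothing."""
        return not self.higher_level
```

With a threshold of 2, the first two victims go straight to memory and only the third is stored in the higher-level cache, so a short activation never leaves anything to flush.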
-
Publication No.: US11847062B2
Publication date: 2023-12-19
Application No.: US17552703
Filing date: 2021-12-16
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Tarun Nakra , Jay Fleischman , Gautam Tarasingh Hazari , Akhil Arunkumar , William L. Walker , Gabriel H. Loh , John Kalamatianos , Marko Scrbak
IPC: G06F12/0897 , G06F12/0891
CPC classification number: G06F12/0897 , G06F12/0891 , G06F2212/1028
Abstract: In response to eviction of a first clean data block from an intermediate level of cache in a multi-cache hierarchy of a processing system, a cache controller accesses an address of the first clean data block. The controller initiates a fetch of the first clean data block from a system memory into a last-level cache using the accessed address.
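A minimal sketch of this behavior follows. The function name `on_intermediate_eviction` and the dictionary models of system memory and the last-level cache are assumptions; handling of dirty blocks is outside the abstract and is not modeled.

```python
def on_intermediate_eviction(addr, dirty, system_memory, last_level_cache):
    """Handle a block evicted from an intermediate cache level.

    For a clean block, only its address is used: the data is fetched
    again from system memory into the last-level cache. Returns True
    when such a fetch was initiated.
    """
    if dirty:
        return False  # dirty blocks need a write-back path, not shown here
    last_level_cache[addr] = system_memory[addr]
    return True
```

Since a clean block matches memory, fetching by address delivers the same data without moving the block itself between cache levels.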
-
Publication No.: US11829196B2
Publication date: 2023-11-28
Application No.: US16659978
Filing date: 2019-10-22
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: William L. Walker
IPC: G06F1/08 , G06F1/06 , G01R31/28 , G01R31/3185
CPC classification number: G06F1/08 , G01R31/2896 , G01R31/318552 , G06F1/06
Abstract: An integrated circuit (IC) device includes a ring transport having a plurality of nodes and a wire interconnect coupling the plurality of nodes in a ring. The wire interconnect includes a wire to transmit clock wake signals around the ring transport in advance of data signaling representing a data packet. Each node is to switch from a clock gated state to a clocked state responsive to receiving a clock wake signal. The ring transport further includes a sleep controller coupled to a select node of the plurality of nodes. The sleep controller is to configure the select node into a clock suppression state for a specified duration responsive to identifying an idle condition on the ring transport via monitoring of the wire. While in the clock suppression state, the select node suppresses further transmission of any clock wake signals received at the select node.
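The effect of the clock suppression state on a travelling wake signal can be sketched as below. The function `propagate_wake`, the integer node numbering, and the choice that the suppressed select node itself stays gated are assumptions; the abstract states only that the select node does not forward wake signals it receives.

```python
def propagate_wake(num_nodes, source, suppressed=None):
    """Return, in order, the nodes that switch to the clocked state.

    A clock wake signal starts at `source` and moves around the ring one
    node at a time. If it reaches the node in the clock suppression
    state, it is not forwarded any further.
    """
    woken = []
    node = source
    for _ in range(num_nodes):
        if node == suppressed:
            break  # the select node swallows the wake signal
        woken.append(node)
        node = (node + 1) % num_nodes
    return woken
```

Without suppression a stray wake signal would circle the whole ring and clock every node; with it, nodes past the select node stay gated.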
-
Publication No.: US11704248B2
Publication date: 2023-07-18
Application No.: US17091993
Filing date: 2020-11-06
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: William L. Walker , Michael L. Golden , Marius Evers
IPC: G06F12/0862 , G06F12/128 , G06F12/1027 , G06F1/3234 , G06F12/0815 , G06F12/1009 , G06F12/0811
CPC classification number: G06F12/0862 , G06F1/3275 , G06F12/0811 , G06F12/0815 , G06F12/1009 , G06F12/1027 , G06F12/128 , G06F2212/602 , G06F2212/65 , G06F2212/68
Abstract: A processor core associated with a first cache initiates entry into a powered-down state. In response, information representing a set of entries of the first cache is stored in a retention region that receives a retention voltage while the processor core is in a powered-down state. Information indicating one or more invalidated entries of the set of entries is also stored in the retention region. In response to the processor core initiating exit from the powered-down state, entries of the first cache are restored using the stored information representing the entries and the stored information indicating the at least one invalidated entry.
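The save, invalidate, and restore sequence in this abstract can be sketched with a dictionary standing in for the retention region. The function names, the layout of the `retention` dictionary, and the choice to skip invalidated entries on restore are assumptions for illustration.

```python
def power_down(cache, retention):
    """Save information about the cache entries, then lose the cache."""
    retention["entries"] = dict(cache)
    retention["invalidated"] = set()
    cache.clear()  # contents are lost while the core is powered down


def invalidate_while_down(addr, retention):
    """Record that a saved entry became invalid during the powered-down state."""
    retention["invalidated"].add(addr)


def power_up(cache, retention):
    """Restore saved entries, leaving out those marked invalidated."""
    for addr, data in retention["entries"].items():
        if addr not in retention["invalidated"]:
            cache[addr] = data
```

Tracking invalidations alongside the saved entries lets the cache come back warm without reviving stale data.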