-
Publication No.: US20210390057A1
Publication date: 2021-12-16
Application No.: US17459100
Filing date: 2021-08-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt
IPC: G06F12/0891, G06F9/30, G06F9/38, G06F12/0811
Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.
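The counting rule in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `WriteBandwidthMonitor`, its method names, and the 64-byte line size are all hypothetical and do not come from the abstract.

```python
from collections import defaultdict

CACHE_LINE_BYTES = 64  # assumed line size; the abstract does not give one


class WriteBandwidthMonitor:
    """Counts first-time modifications of cache lines, per thread class."""

    def __init__(self):
        self.modified = set()             # lines already written since their fill
        self.counters = defaultdict(int)  # thread class -> first-write count

    def on_fill(self, line_addr):
        # A line entering the hierarchy starts out unmodified.
        self.modified.discard(line_addr)

    def on_write(self, line_addr, thread_class=0):
        # Only the first write since the fill increments the counter.
        if line_addr not in self.modified:
            self.modified.add(line_addr)
            self.counters[thread_class] += 1

    def write_bandwidth(self, thread_class, interval_seconds):
        # Each counted line is expected to cause one line-sized write to memory.
        return self.counters[thread_class] * CACHE_LINE_BYTES / interval_seconds
```

Repeated writes to the same line leave the counter unchanged until the line is filled again, which is what makes the first write usable as a proxy for eventual memory traffic.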
-
Publication No.: US11163688B2
Publication date: 2021-11-02
Application No.: US16580139
Filing date: 2019-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Jay Fleischman
IPC: G06F12/0888, G06F12/0811, G06F12/0846
Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.
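The two-threshold decision described here might be sketched as follows. The function name, parameter names, and the returned strings are hypothetical, and the abstract does not say what happens when the recall probe rate stays at or below the first threshold, so that case returns `None` as a placeholder.

```python
def choose_insertion_policy(recall_probe_rate, first_portion_hit_rate,
                            first_threshold, second_threshold):
    """Return the insertion policy for the second cache portion.

    Returns None when the partitioning and monitoring phase is not started.
    """
    # The monitoring phase only begins once recall probes are frequent enough.
    if recall_probe_rate <= first_threshold:
        return None  # behavior here is not specified in the abstract

    # A high hit rate in the first portion means the cache is useful,
    # so the second portion keeps inserting lines.
    if first_portion_hit_rate > second_threshold:
        return "non-bypass"

    # Otherwise the cache is less useful and insertions are bypassed.
    return "bypass"
```

For example, `choose_insertion_policy(0.3, 0.1, 0.2, 0.5)` selects the bypass policy because the sampled hit rate does not exceed the second threshold.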
-
Publication No.: US20210089462A1
Publication date: 2021-03-25
Application No.: US16580139
Filing date: 2019-09-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Jay Fleischman
IPC: G06F12/0888, G06F12/0846, G06F12/0811
Abstract: Systems, apparatuses, and methods for employing system probe filter aware last level cache insertion bypassing policies are disclosed. A system includes a plurality of processing nodes, a probe filter, and a shared cache. The probe filter monitors a rate of recall probes that are generated, and if the rate is greater than a first threshold, then the system initiates a cache partitioning and monitoring phase for the shared cache. Accordingly, the cache is partitioned into two portions. If the hit rate of a first portion is greater than a second threshold, then a second portion will have a non-bypass insertion policy since the cache is relatively useful in this scenario. However, if the hit rate of the first portion is less than or equal to the second threshold, then the second portion will have a bypass insertion policy since the cache is less useful in this case.
-
Publication No.: US20210073126A1
Publication date: 2021-03-11
Application No.: US16562101
Filing date: 2019-09-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer
IPC: G06F12/0802
Abstract: Systems, apparatuses, and methods for dynamically adjusting cache policies to reduce execution core wait time are disclosed. A processor includes a cache subsystem. The cache subsystem includes one or more cache levels and one or more cache controllers. A cache controller partitions a cache level into two test portions and a remainder portion. The cache controller applies a first policy to the first test portion and applies a second policy to the second test portion. The cache controller determines the amount of time the execution core spends waiting on accesses to the first and second test portions. If the measured wait time is less for the first test portion than for the second test portion, then the cache controller applies the first policy to the remainder portion. Otherwise, the cache controller applies the second policy to the remainder portion.
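A minimal sketch of the comparison this abstract describes, assuming wait time is accumulated in cycles per test portion. The class `PolicyDuel` and its methods are hypothetical names; how the hardware actually measures wait time is not stated in the abstract.

```python
class PolicyDuel:
    """Tracks core wait time for two test portions and picks a policy."""

    def __init__(self, first_policy, second_policy):
        self.policies = (first_policy, second_policy)
        self.wait_cycles = [0, 0]  # index 0: first test portion, 1: second

    def record_wait(self, portion, cycles):
        # Accumulate time the execution core spent waiting on this portion.
        self.wait_cycles[portion] += cycles

    def remainder_policy(self):
        # The first policy wins only on a strictly smaller wait time;
        # every other case (including a tie) falls to the second policy.
        first_wait, second_wait = self.wait_cycles
        if first_wait < second_wait:
            return self.policies[0]
        return self.policies[1]
```

The policy names passed in are arbitrary labels; the sketch only models the selection step, not the replacement or insertion policies themselves.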
-
Publication No.: US20170357596A1
Publication date: 2017-12-14
Application No.: US15180982
Filing date: 2016-06-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer
IPC: G06F12/121, G06F12/0888, G06F12/0804, G06F12/0891
CPC classification number: G06F12/121, G06F12/0888, G06F12/0891, G06F12/0897, G06F2212/283
Abstract: A first cache includes a plurality of cache lines and is inclusive of a second cache. The plurality of cache lines are associated with a plurality of N-bit values. The first cache modifies each N-bit value in response to a hit at the corresponding one of the plurality of cache lines. The first cache bypasses eviction of a first cache line in response to the N-bit value associated with the first cache line having a first value and the first cache line being included in the second cache. The first cache evicts a second cache line in response to the N-bit value associated with the second cache line having a second value and the second cache line not being included in the second cache.
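The victim selection in this abstract can be sketched as below. This is an illustration under a simplifying assumption that the abstract does not confirm: the "first value" and "second value" are treated as one shared eviction-candidate value. The function name, the dictionary representation, and the `in_second_cache` callback are all hypothetical.

```python
def select_victim(line_values, evict_value, in_second_cache):
    """Pick a line to evict from the first (inclusive) cache.

    line_values: dict mapping line address -> current N-bit value.
    evict_value: N-bit value that marks a line as an eviction candidate
                 (assumed to be the same for both cases in the abstract).
    in_second_cache: callable returning True if the line is also held
                     in the second cache.
    """
    for addr, value in line_values.items():
        if value != evict_value:
            continue  # not an eviction candidate
        if in_second_cache(addr):
            continue  # bypass eviction: the second cache still holds the line
        return addr   # candidate that is absent from the second cache
    return None       # no evictable line found in this pass
```

Skipping lines that the second cache still holds avoids the back-invalidation an inclusive cache would otherwise have to perform on eviction.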
-
Publication No.: US20170357588A1
Publication date: 2017-12-14
Application No.: US15180995
Filing date: 2016-06-13
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer
IPC: G06F12/0866
CPC classification number: G06F12/128, G06F12/0811, G06F12/0848, G06F12/0862, G06F12/0864, G06F12/0866, G06F12/121, G06F12/123, G06F12/127, G06F2212/1021, G06F2212/1024, G06F2212/281, G06F2212/502, G06F2212/601
Abstract: A processing system includes a cache that includes cache lines that are partitioned into a first subset of the cache lines and second subsets of the cache lines. The processing system also includes one or more counters that are associated with the second subsets of the cache lines. The processing system further includes a processor configured to modify the one or more counters in response to a cache hit or a cache miss associated with the second subsets. The one or more counters are modified by an amount determined by one or more characteristics of a memory access request that generated the cache hit or the cache miss.
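One way to picture a counter whose step depends on the request is the saturating counter below. Everything specific here is an assumption for illustration: the request types and weights in `WEIGHTS`, the 10-bit width, and the convention that misses in one sampled subset count up while misses in the other count down.

```python
# Hypothetical weights: the abstract only says the amount depends on
# characteristics of the memory access request.
WEIGHTS = {"demand_load": 4, "store": 2, "prefetch": 1}


class WeightedCounter:
    """Saturating counter updated by a request-dependent amount."""

    def __init__(self, bits=10):
        self.max_value = (1 << bits) - 1
        self.value = self.max_value // 2  # start at the midpoint

    def on_miss(self, subset, request_type):
        step = WEIGHTS[request_type]
        if subset == 0:
            # Misses in the first sampled subset push the counter up.
            self.value = min(self.max_value, self.value + step)
        else:
            # Misses in the other sampled subset push it down.
            self.value = max(0, self.value - step)
```

Weighting the update lets a miss that stalls the core, such as a demand load, influence the outcome more than a speculative prefetch miss.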
-
Publication No.: US20240202116A1
Publication date: 2024-06-20
Application No.: US18068930
Filing date: 2022-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Jagadish B. Kotra, John Kalamatianos, Paul James Moyer, Nicholas Dean Lance, Sriram Srinivasan, Patrick James Shyvers, William Louie Walker
IPC: G06F12/0802
CPC classification number: G06F12/0802, G06F2212/1016, G06F2212/1028, G06F2212/1044
Abstract: An entry of a last level cache shadow tag array is used to track pending last level cache misses to private data in a previous level cache (e.g., an L2 cache) that are also misses to an exclusive last level cache (e.g., an L3 cache) and to the last level cache shadow tag array. Accordingly, last level cache miss status holding registers need not be expended to track cache misses to private data that are already being tracked by a previous level cache miss status holding register. Additionally or alternatively, up to a threshold number of last level cache pending misses to the same shared data from different processor cores are tracked in the last level cache shadow tag array, and any additional last level cache pending misses are tracked in a last level cache miss status holding register.
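The tracking decision described here could be modeled roughly as follows. This is a loose software sketch, not the hardware design: the function name, the dictionary-based `shadow_tags` and `llc_mshrs` structures, and the returned strings are hypothetical.

```python
def track_llc_miss(line, core, is_private, shadow_tags, llc_mshrs, threshold):
    """Record a pending last level cache miss and report where it is tracked.

    shadow_tags: dict line -> set of cores with a pending miss on that line.
    llc_mshrs:   dict line -> list of cores tracked in LLC miss status
                 holding registers.
    threshold:   maximum number of pending misses per shared line that the
                 shadow tag array tracks.
    """
    waiters = shadow_tags.setdefault(line, set())

    if is_private:
        # Private data is already tracked by the previous-level MSHR,
        # so a shadow tag entry is enough.
        waiters.add(core)
        return "shadow_tag"

    if core in waiters or len(waiters) < threshold:
        # Shared data: the shadow tag array absorbs misses up to the threshold.
        waiters.add(core)
        return "shadow_tag"

    # Beyond the threshold, fall back to an LLC MSHR.
    llc_mshrs.setdefault(line, []).append(core)
    return "llc_mshr"
```

With `threshold=2`, the first two cores missing on the same shared line are recorded in the shadow tag array and only the third consumes an LLC MSHR.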
-
Publication No.: US11561895B2
Publication date: 2023-01-24
Application No.: US16562101
Filing date: 2019-09-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer
IPC: G06F12/0802
Abstract: Systems, apparatuses, and methods for dynamically adjusting cache policies to reduce execution core wait time are disclosed. A processor includes a cache subsystem. The cache subsystem includes one or more cache levels and one or more cache controllers. A cache controller partitions a cache level into two test portions and a remainder portion. The cache controller applies a first policy to the first test portion and applies a second policy to the second test portion. The cache controller determines the amount of time the execution core spends waiting on accesses to the first and second test portions. If the measured wait time is less for the first test portion than for the second test portion, then the cache controller applies the first policy to the remainder portion. Otherwise, the cache controller applies the second policy to the remainder portion.
-
Publication No.: US11169812B2
Publication date: 2021-11-09
Application No.: US16584701
Filing date: 2019-09-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt, Kai Troester
IPC: G06F9/38
Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If the number of cache misses exceeds this threshold, then the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for the thread. After a time period elapses, if the cache continues to exceed the threshold, then the throttling unit notifies the upstream computation unit to more restrictively throttle the thread by performing one or more of reducing the selection rate and increasing the time period. Otherwise, the unit notifies the upstream computation unit to less restrictively throttle the thread.
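The feedback loop in this abstract can be sketched as a single adjustment step run at the end of each time period. The halving and doubling factors, the bounds, and the names `Throttle` and `adjust_throttle` are assumptions made for illustration; the abstract only says the selection rate is reduced and/or the time period is increased.

```python
from dataclasses import dataclass


@dataclass
class Throttle:
    selection_rate: float = 1.0  # fraction of cycles the thread may be selected
    period_cycles: int = 1000    # length of the throttling time period


def adjust_throttle(current, cache_misses, miss_threshold,
                    min_rate=0.125, base_period=1000, max_period=8000):
    """Return the throttle setting for the next time period."""
    if cache_misses > miss_threshold:
        # Still over the threshold: throttle more restrictively.
        return Throttle(
            selection_rate=max(min_rate, current.selection_rate / 2),
            period_cycles=min(max_period, current.period_cycles * 2),
        )
    # Contention has eased: throttle less restrictively.
    return Throttle(
        selection_rate=min(1.0, current.selection_rate * 2),
        period_cycles=max(base_period, current.period_cycles // 2),
    )
```

Calling `adjust_throttle` once per period gives a simple back-off: the thread is squeezed while the shared cache keeps exceeding the miss threshold and is released step by step once it stops.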
-
Publication No.: US11106594B2
Publication date: 2021-08-31
Application No.: US16562128
Filing date: 2019-09-05
Applicant: Advanced Micro Devices, Inc.
Inventor: Paul James Moyer, Douglas Benson Hunt
IPC: G06F12/0891, G06F9/30, G06F9/38, G06F12/0811
Abstract: Systems, apparatuses, and methods for generating a measurement of write memory bandwidth are disclosed. A control unit monitors writes to a cache hierarchy. If a write to a cache line is a first time that the cache line is being modified since entering the cache hierarchy, then the control unit increments a write memory bandwidth counter. Otherwise, if the write is to a cache line that has already been modified since entering the cache hierarchy, then the write memory bandwidth counter is not incremented. The first write to a cache line is a proxy for write memory bandwidth since this will eventually cause a write to memory. The control unit uses the value of the write memory bandwidth counter to generate a measurement of the write memory bandwidth. Also, the control unit can maintain multiple counters for different thread classes to calculate the write memory bandwidth per thread class.