-
Publication No.: US10860319B2
Publication date: 2020-12-08
Application No.: US15941976
Filing date: 2018-03-30
Applicant: Intel Corporation
Inventor: Mark Dechene, Manjunath Shevgoor, Faruk Guvenilir, Zhongying Zhang, Jonathan Perry
IPC: G06F9/30, G06F9/32, G06F12/1027, G06F9/38
Abstract: An apparatus and method for early page address prediction. For example, one embodiment of a processor comprises: an instruction fetch circuit to fetch a load instruction; a decoder to decode the load instruction; execution circuitry to execute the load instruction to perform a load operation, the execution circuitry including an address generation unit (AGU) to generate an effective address to be used for the load operation; and early page prediction (EPP) circuitry to use one or more attributes associated with the load instruction to predict a physical page address for the load instruction simultaneously with the AGU generating the effective address and/or prior to generation of the effective address.
-
Publication No.: US10853078B2
Publication date: 2020-12-01
Application No.: US16231313
Filing date: 2018-12-21
Applicant: Intel Corporation
Inventor: Vineeth Mekkat, Mark Dechene, Zhongying Zhang, John Faistl, Janghaeng Lee, Hou-Jen Ko, Sebastian Winkel, Oleg Margulis
Abstract: A processor includes a store buffer to store store instructions to be processed to store data in main memory, a load buffer to store load instructions to be processed to load data from main memory, and a loop invariant code motion (LICM) protection structure coupled to the store buffer and the load buffer. The LICM protection structure tracks information to compare an address of a store or snoop microoperation with its entries and re-loads a load microoperation of a matching entry.
-
Publication No.: US20200310801A1
Publication date: 2020-10-01
Application No.: US16367171
Filing date: 2019-03-27
Applicant: Intel Corporation
Inventor: Mark Dechene, Srikanth Srinivasan, Matthew Merten, Ammon Christiansen
Abstract: A processor and method are described for a multi-level reservation station. For example, one embodiment of an apparatus comprises: execution circuitry comprising a plurality of functional units to execute a plurality of operations; a reservation station comprising a plurality of entries to store a corresponding plurality of operations to be executed on one or more of the functional units, the reservation station comprising: a first RS level to hold a first subset of the plurality of operations which are ready for execution by one or more functional units or which are expected to be ready for execution by the functional units; a second RS level to hold a second subset of the plurality of operations which are not expected to be ready for execution by the functional units; operation evaluation circuitry to evaluate operations in the first RS level and, responsive to identifying one or more operations which are not expected to be ready for execution, to cause the one or more operations to be moved from the first RS level to the second RS level.
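The demotion step described in this abstract can be illustrated with a minimal software sketch. The names `Op`, `TwoLevelRS`, `expected_ready`, and `evaluate` are hypothetical and do not come from the patent; the sketch assumes readiness is available as a simple boolean hint and models only the move from the first RS level to the second.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    expected_ready: bool  # hypothetical hint: is the op expected to be ready soon?


@dataclass
class TwoLevelRS:
    level1: list = field(default_factory=list)  # ready or expected-ready ops
    level2: list = field(default_factory=list)  # ops not expected to be ready

    def evaluate(self):
        """Move ops that are not expected to be ready from level 1 to level 2."""
        demoted = [op for op in self.level1 if not op.expected_ready]
        self.level1 = [op for op in self.level1 if op.expected_ready]
        self.level2.extend(demoted)
        return demoted


rs = TwoLevelRS(level1=[Op("add", True), Op("load_dep", False), Op("mul", True)])
demoted = rs.evaluate()
```

After `evaluate()`, the first level holds only `add` and `mul`, and `load_dep` sits in the second level. How operations return to the first level is not covered by the abstract and is left out here.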
-
Publication No.: US20240126702A1
Publication date: 2024-04-18
Application No.: US17949803
Filing date: 2022-09-21
Applicant: Intel Corporation
Inventor: Mark Dechene, Ryan Carlson, Sudeepto Majumdar, Rafael Trapani Possignolo, Paula Petrica, Richard Klass, Meenakshi Marathe
IPC: G06F12/1027, G06F12/0882
CPC classification number: G06F12/1027, G06F12/0882, G06F2212/1021
Abstract: Techniques for slicing memory of a hardware processor core by linear address are described. In certain examples, a hardware processor core includes memory circuitry having: a cache comprising a plurality of slices of memory, wherein each of a plurality of cache lines of memory are only stored in a single slice, and each slice stores a different range of address values compared to any other slice, wherein each of the plurality of slices of memory comprises: an incomplete load buffer to store a load address from the address generation circuit for a load request operation, broadcast to the plurality of slices of memory by the memory circuit from the execution circuit, in response to the load address being within a range of address values of that memory slice, a store address buffer to store a store address from the address generation circuit for a store request operation, broadcast to the plurality of slices of memory by the memory circuit from the execution circuit, in response to the store address being within a range of address values of that memory slice, a store data buffer to store data, including the data for the store request operation that is to be stored at the store address, for each store request operation broadcast to the plurality of slices of memory by the memory circuit from the execution circuit, and a store completion buffer to store the data for the store request operation in response to the store address being stored in the store address buffer of that memory slice, and, in response, clear the store address for the store request operation from the store address buffer and clear the data for the store request operation from the store data buffer.
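As a rough illustration of the broadcast-and-filter behavior in this abstract, the sketch below has every slice see each load address and keep only the addresses it owns. The line size, slice count, and the interleaved mapping in `slice_for` are assumptions made for the example; the abstract states only that each cache line lives in exactly one slice and that slices cover different address ranges.

```python
LINE_BYTES = 64  # assumed cache-line size
NUM_SLICES = 4   # assumed number of slices


def slice_for(linear_address: int) -> int:
    """Assumed mapping: interleave cache lines across slices by line index."""
    return (linear_address // LINE_BYTES) % NUM_SLICES


class Slice:
    def __init__(self, index: int):
        self.index = index
        self.load_buffer = []  # stands in for the incomplete load buffer

    def on_broadcast_load(self, addr: int) -> None:
        # Every slice sees the broadcast; only the owning slice records it.
        if slice_for(addr) == self.index:
            self.load_buffer.append(addr)


slices = [Slice(i) for i in range(NUM_SLICES)]
for addr in (0x1000, 0x1040, 0x1048, 0x10C0):
    for s in slices:
        s.on_broadcast_load(addr)
```

Addresses `0x1040` and `0x1048` fall in the same cache line, so they land in the same slice, which matches the requirement that a line is stored in a single slice only.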
-
Publication No.: US20240111679A1
Publication date: 2024-04-04
Application No.: US17958334
Filing date: 2022-10-01
Applicant: Intel Corporation
Inventor: Seth Pugsley, Mark Dechene, Ryan Carlson, Manjunath Shevgoor
IPC: G06F12/0862, G06F9/345, G06F12/0882
CPC classification number: G06F12/0862, G06F9/3455, G06F12/0882
Abstract: Techniques for prefetching by a hardware processor are described. In certain examples, a hardware processor includes execution circuitry, cache memories, and prefetcher circuitry. The execution circuitry is to execute instructions to access data at a memory address. The cache memories include a first cache memory at a first cache level and a second cache memory at a second cache level. The prefetcher circuitry is to prefetch the data from a system memory to at least one of the plurality of cache memories, and it includes a first-level prefetcher to prefetch the data to the first cache memory, a second-level prefetcher to prefetch the data to the second cache memory, and a plurality of prefetch filters. One of the prefetch filters is to filter exclusively for the first-level prefetcher. Another of the prefetch filters is to maintain a history of demand and prefetch accesses to pages in the system memory and to use the history to provide training information to the second-level prefetcher.
-
Publication No.: US20240037036A1
Publication date: 2024-02-01
Application No.: US17876081
Filing date: 2022-07-28
Applicant: Intel Corporation
Inventor: Mark Dechene, Ryan Carlson, Ricardo Daniel Queiros Alves, Yan Zeng, Richard Klass, Brendan West
IPC: G06F12/0815, G06F12/0864
CPC classification number: G06F12/0815, G06F12/0864, G06F2212/1021
Abstract: Techniques for scheduling merged store operations are described. In an embodiment, an apparatus includes a data cache; a fill buffer; a store buffer to store first information associated with a first retired store operation and second information associated with a second retired store operation; a store coalescing buffer (SCB) to receive the first information from the store buffer, to store the first information in an SCB entry, to merge the second information from the store buffer into the entry, and to provide data associated with the entry for a write to the data cache or the fill buffer; and a global store scheduler (GSS) to schedule the write relative to an other write from an other SCB in compliance with one or more store ordering rules.
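The merge step in this abstract can be pictured with a small sketch in which retired stores to the same cache line are folded into one entry. `SCBEntry`, `coalesce`, and the 64-byte line size are hypothetical choices for illustration. The sketch assumes no store crosses a line boundary, and it does not model the global store scheduler or any store ordering rules.

```python
LINE_BYTES = 64  # assumed cache-line size


class SCBEntry:
    def __init__(self, line_addr: int):
        self.line_addr = line_addr
        self.data = {}  # byte offset within the line -> byte value

    def merge(self, addr: int, payload: bytes) -> None:
        # Later stores overwrite earlier bytes at the same offset.
        for i, b in enumerate(payload):
            self.data[(addr - self.line_addr) + i] = b


def coalesce(stores):
    """Merge retired stores, given in program order as (address, bytes)."""
    entries = {}
    for addr, payload in stores:
        line = addr - addr % LINE_BYTES
        entry = entries.setdefault(line, SCBEntry(line))
        entry.merge(addr, payload)
    return entries


stores = [(0x100, b"\x01\x02"), (0x101, b"\xff"), (0x140, b"\x07")]
entries = coalesce(stores)
```

The first two stores share line `0x100` and collapse into one entry, so a single write to the data cache or fill buffer would cover both; the third store opens a separate entry for line `0x140`.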
-
Publication No.: US10956160B2
Publication date: 2021-03-23
Application No.: US16367171
Filing date: 2019-03-27
Applicant: Intel Corporation
Inventor: Mark Dechene, Srikanth Srinivasan, Matthew Merten, Ammon Christiansen
Abstract: A processor and method are described for a multi-level reservation station. For example, one embodiment of an apparatus comprises: execution circuitry comprising a plurality of functional units to execute a plurality of operations; a reservation station comprising a plurality of entries to store a corresponding plurality of operations to be executed on one or more of the functional units, the reservation station comprising: a first RS level to hold a first subset of the plurality of operations which are ready for execution by one or more functional units or which are expected to be ready for execution by the functional units; a second RS level to hold a second subset of the plurality of operations which are not expected to be ready for execution by the functional units; operation evaluation circuitry to evaluate operations in the first RS level and, responsive to identifying one or more operations which are not expected to be ready for execution, to cause the one or more operations to be moved from the first RS level to the second RS level.