-
Publication No.: US20240354109A1
Publication date: 2024-10-24
Application No.: US18305151
Filing date: 2023-04-21
Applicant: Apple Inc.
Inventor: Yuan C. Chou, Deepankar Duggal, Debasish Chandra, Niket K. Choudhary, Richard F. Russo
CPC classification number: G06F9/3802, G06F9/30043, G06F9/3016, G06F9/3861
Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a control transfer instruction is mispredicted, a load instruction may have been executed on the wrong path. In disclosed embodiments, result storage circuitry records information that indicates destination registers of speculatively-executed load instructions including a first load instruction. Control flow tracker circuitry may store information indicating a reconvergence point for the control transfer instruction. Re-use control circuitry may track registers written by instructions prior to the reconvergence point, determine that the first load instruction does not depend on data from any instruction between the control transfer instruction and the reconvergence point, and use, as a result of the first load instruction, a value from a recorded destination register that was written based on speculative execution of the first load, notwithstanding the misprediction of the control transfer instruction.
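The re-use check described in this abstract can be pictured with a small software model. The sketch below is illustrative only and is not taken from the patent: the `Instr` record, the `reusable_load_value` function, and the register names are hypothetical. It assumes register dependencies are the only dependencies that matter and ignores memory dependencies through stores.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    pc: int                      # hypothetical instruction address
    dest: Optional[str]          # destination register name, or None
    srcs: Tuple[str, ...] = ()   # source register names


def reusable_load_value(
    path_to_reconvergence: Iterable[Instr],
    load: Instr,
    recorded_results: Dict[int, int],
) -> Optional[int]:
    """Return the wrong-path load result if it may be re-used, else None.

    path_to_reconvergence: instructions on the corrected path between the
        mispredicted control transfer and its reconvergence point.
    recorded_results: load pc -> value produced by wrong-path execution
        (a stand-in for the result storage circuitry).
    """
    # Track every register written before the reconvergence point.
    written = {ins.dest for ins in path_to_reconvergence if ins.dest is not None}
    # The load depends on the branch region if any source was written there.
    if any(src in written for src in load.srcs):
        return None
    # Otherwise the recorded speculative value (if any) can be used.
    return recorded_results.get(load.pc)
```

In this model a load whose address register is untouched between the branch and the reconvergence point gets its recorded value back, while a load that reads a register written in that region is re-executed.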
-
Publication No.: US12265823B2
Publication date: 2025-04-01
Application No.: US18352323
Filing date: 2023-07-14
Applicant: Apple Inc.
Inventor: Ilhyun Kim, Niket K. Choudhary, Muawya M. Al-Otoom, Pruthivi Vuyyuru, Ronald P. Hall
Abstract: Disclosed techniques relate to trace caches. Trace cache circuitry may identify traces that satisfy one or more criteria. Generally, internal branches of a trace should satisfy a threshold bias level in a particular direction. To achieve this goal, the processor may initially assume that branches meet the threshold, track their usefulness in the trace context over time, and prevent inclusion of branches that fall below a usefulness threshold (which indicates that those branches are not sufficiently biased). Branches that do not meet the threshold may be added to a Bloom filter, for example. Usefulness may be tracked during trace training, when valid in a trace cache, or both.
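As a rough illustration of the usefulness tracking and Bloom filter mentioned above, the following sketch models a per-branch counter that starts optimistic and a filter that excludes branches once the counter drops below a threshold. All names, the counter range, the threshold, and the hash construction are assumptions made for this example, not details from the patent.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter over integer keys (no false negatives)."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = 0  # Python int used as a bit vector

    def _positions(self, key):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.bits

    def add(self, key):
        for pos in self._positions(key):
            self.array |= 1 << pos

    def __contains__(self, key):
        return all((self.array >> pos) & 1 for pos in self._positions(key))


class BranchUsefulness:
    """Assume branches are biased at first; block them if they prove otherwise."""

    def __init__(self, start=2, threshold=1, ceiling=3):
        self.start, self.threshold, self.ceiling = start, threshold, ceiling
        self.counters = {}            # branch pc -> usefulness counter
        self.blocked = BloomFilter()  # branches excluded from traces

    def allowed_in_trace(self, pc):
        return pc not in self.blocked

    def update(self, pc, followed_trace_direction):
        if pc in self.blocked:
            return
        count = self.counters.get(pc, self.start)
        count = min(count + 1, self.ceiling) if followed_trace_direction else count - 1
        if count < self.threshold:
            self.blocked.add(pc)      # not sufficiently biased
            self.counters.pop(pc, None)
        else:
            self.counters[pc] = count
```

A Bloom filter can report false positives, so in this model a well-behaved branch may occasionally be excluded as well; that costs trace coverage, not correctness.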
-
Publication No.: US20240354111A1
Publication date: 2024-10-24
Application No.: US18305173
Filing date: 2023-04-21
Applicant: Apple Inc.
Inventor: Yuan C. Chou, Deepankar Duggal, Debasish Chandra, Niket K. Choudhary, Richard F. Russo
CPC classification number: G06F9/3842, G06F9/3005, G06F9/3016
Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a first control transfer instruction is mispredicted, a second control transfer instruction may have been executed on the wrong path because of the misprediction. Result storage circuitry may record information indicating a determined direction for the second control transfer instruction. Control flow tracker circuitry may store, for the first control transfer instruction, information indicating a reconvergence point. Re-use control circuitry may track registers written by instructions prior to the reconvergence point, determine, based on the tracked registers, that the second control transfer instruction does not depend on data from any instruction between the first control transfer instruction and the reconvergence point, and use the recorded determined direction for the second control transfer instruction, notwithstanding the misprediction of the first control transfer instruction.
-
Publication No.: US20250021337A1
Publication date: 2025-01-16
Application No.: US18352351
Filing date: 2023-07-14
Applicant: Apple Inc.
Inventor: Muawya M. Al-Otoom, Niket K. Choudhary, Pruthivi Vuyyuru
IPC: G06F9/38
Abstract: Disclosed techniques relate to branch prediction and trace caching. A processor may include both an instruction cache and a trace cache configured to store instructions. A branch predictor may include one or more prediction tables (e.g., tagged geometric length (TAGE) tables) configured to predict directions of conditional control transfer instructions. Rather than including a separate branch predictor for branches in the trace cache, the processor may share the prediction table(s) for instruction cache and trace cache predictions. In particular, the processor may include an additional trace prediction lane configured to access the prediction table to predict a direction of a final control transfer instruction in a trace cached by the trace cache circuitry. This may advantageously provide accurate predictions with limited impacts to circuit area and power consumption, e.g., relative to a separate predictor for the trace cache.
-
Publication No.: US20250021332A1
Publication date: 2025-01-16
Application No.: US18352309
Filing date: 2023-07-14
Applicant: Apple Inc.
Inventor: Niket K. Choudhary, Muawya M. Al-Otoom, Pruthivi Vuyyuru, Andrew H. Lin, Ilhyun Kim, Douglas C. Holman, Samir Dutt, Ronald P. Hall
Abstract: Disclosed techniques relate to trace cache circuitry configured to identify and cache traces that satisfy certain criteria. Prediction circuitry may track directions of executed control transfer instructions, including a first category of control transfer instructions that meet a first threshold bias level toward a given direction (which may be referred to as “stable”) and a second category of control transfer instructions that do not meet the first threshold bias level (which may be referred to as “unstable”). Trace cache circuitry may identify traces of instructions that satisfy a set of criteria, including: only control transfer instructions of the first category are allowed as internal control transfer instructions and a control transfer instruction in the second category is allowed only at an end of a given trace. Disclosed techniques may advantageously provide performance and power advantages of trace caching with reduced complexity, relative to certain traditional trace caches.
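The two trace criteria in this abstract (only "stable" branches inside a trace, an "unstable" branch only at the end) can be sketched in a few lines. This is a simplified model under assumed parameters: the `BiasTracker` class, the 0.95 bias threshold, the minimum sample count, and the maximum trace length are hypothetical values chosen for the example.

```python
class BiasTracker:
    """Classify branches as stable or unstable from observed directions."""

    def __init__(self, threshold=0.95, min_samples=8):
        self.threshold = threshold
        self.min_samples = min_samples
        self.counts = {}  # branch pc -> [not_taken_count, taken_count]

    def record(self, pc, taken):
        self.counts.setdefault(pc, [0, 0])[1 if taken else 0] += 1

    def is_stable(self, pc):
        not_taken, taken = self.counts.get(pc, (0, 0))
        total = not_taken + taken
        if total < self.min_samples:
            return False  # too little evidence: treat as unstable
        return max(not_taken, taken) / total >= self.threshold


def build_trace(instrs, tracker, max_len=16):
    """Collect instruction addresses for one trace.

    instrs: iterable of (pc, is_branch) in predicted execution order.
    An unstable branch is kept, but it terminates the trace.
    """
    trace = []
    for pc, is_branch in instrs:
        trace.append(pc)
        if len(trace) == max_len:
            break
        if is_branch and not tracker.is_stable(pc):
            break
    return trace
```

Ending the trace at the first unstable branch means every internal branch has a strongly preferred direction, which is what lets the trace be treated as a mostly straight-line unit.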
-
Publication No.: US20250147767A1
Publication date: 2025-05-08
Application No.: US19009790
Filing date: 2025-01-03
Applicant: Apple Inc.
Inventor: Madhu Sudan Hari, Mridul Agarwal, Kulin N. Kothari, John D. Pape, Niket K. Choudhary
IPC: G06F9/38, G06F9/30, G06F9/52, G06F12/1027
Abstract: A system may include multiple processors. One of the processors may receive an indication of a data synchronization barrier (DSB) instruction in another processor that follows a translation lookaside buffer invalidate (TLBI) instruction to invalidate an entry of a translation lookaside buffer (TLB). The processor may determine whether instructions are pending in the processor for which the virtual addresses used for memory accesses have been translated to physical addresses before receiving the DSB indication. If there are such pending instructions, the processor may provide, after these instructions retire, an indication to the other processor as a response to the DSB indication.
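The handshake on the receiving processor can be modeled as a snapshot-and-drain step. The sketch below is a toy model, not the patented circuit: `RemoteCore` and its methods are hypothetical, it handles one outstanding DSB indication at a time, and it assumes the response is sent immediately when no already-translated instruction is pending (the abstract only describes the case where some are).

```python
class RemoteCore:
    """Toy model of a core that receives a DSB indication after a remote TLBI."""

    def __init__(self):
        self.in_flight = {}   # instruction id -> True once its address is translated
        self.waiting = None   # (dsb_tag, ids that must retire before responding)
        self.responses = []   # DSB tags acknowledged to the other processor

    def issue(self, iid):
        self.in_flight[iid] = False

    def translate(self, iid):
        self.in_flight[iid] = True  # virtual address now mapped to a physical one

    def receive_dsb(self, tag):
        # Snapshot only instructions translated before the indication arrived;
        # later translations are assumed to see the post-TLBI state.
        pending = {iid for iid, done in self.in_flight.items() if done}
        self.waiting = (tag, pending)
        self._respond_if_drained()

    def retire(self, iid):
        self.in_flight.pop(iid, None)
        if self.waiting is not None:
            self.waiting[1].discard(iid)
            self._respond_if_drained()

    def _respond_if_drained(self):
        tag, pending = self.waiting
        if not pending:
            self.responses.append(tag)
            self.waiting = None
```

Waiting only on the snapshot, instead of draining the whole pipeline, is the point of the model: instructions that had not yet translated when the indication arrived do not delay the response.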
-
Publication No.: US12236244B1
Publication date: 2025-02-25
Application No.: US17810253
Filing date: 2022-06-30
Applicant: Apple Inc.
Inventor: Wei-Han Lien, Muawya M. Al-Otoom, Ian D. Kountanis, Niket K. Choudhary, Pruthivi Vuyyuru
IPC: G06F9/38
Abstract: A multi-degree branch predictor is disclosed. A processing circuit includes an instruction fetch circuit configured to fetch branch instructions, and a branch prediction circuit having a plurality of prediction subcircuits. The prediction subcircuits are configured to store different amounts of branch history data relative to one another, and to receive an indication of a given branch instruction in a particular clock cycle. The prediction subcircuits implement a common branch prediction scheme to output, in different clock cycles, corresponding predictions for the given branch instruction using the different amounts of branch history data, and to cause instruction fetches to be performed by the instruction fetch circuit. The prediction subcircuits are also configured to override, in subsequent clock cycles, instruction fetches caused by prediction subcircuits with comparatively less branch history data, based on contrary predictions made in those subsequent clock cycles by prediction subcircuits with more branch history data.
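The override behavior can be illustrated with a short timeline model. This sketch covers only the arbitration between successive predictions, not the prediction scheme itself; the function name, the event labels, and the idea of representing each subcircuit by a `(cycle, taken)` pair are assumptions made for the example.

```python
def resolve_fetch_direction(predictions):
    """Turn staggered predictions into fetch and override events.

    predictions: list of (cycle, taken) pairs, ordered from the subcircuit
        with the least branch history (earliest) to the one with the most
        (latest).
    Returns a list of (cycle, action, taken) events.
    """
    events = []
    current = None
    for cycle, taken in predictions:
        if current is None:
            # The first (fastest) prediction starts the instruction fetch.
            events.append((cycle, "fetch", taken))
            current = taken
        elif taken != current:
            # A later prediction with more history disagrees: redirect fetch.
            events.append((cycle, "override", taken))
            current = taken
    return events
```

For example, predictions of taken, taken, not taken in cycles 1, 2, and 3 produce a fetch in cycle 1 and a single override in cycle 3; the agreeing prediction in cycle 2 causes no event.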
-
Publication No.: US20250021338A1
Publication date: 2025-01-16
Application No.: US18352326
Filing date: 2023-07-14
Applicant: Apple Inc.
Inventor: Muawya M. Al-Otoom, Niket K. Choudhary, Pruthivi Vuyyuru
IPC: G06F9/38
Abstract: Disclosed techniques relate to next fetch predictor circuitry configured to operate in conjunction with a trace cache. The trace cache circuitry may identify and store traces of instructions based on predicted directions of one or more control transfer instructions. Trace next fetch predictor circuitry may predict a next fetch address based on a current fetch address for a current cycle, which may include predicting a next fetch address following execution of a first trace stored in the trace cache circuitry. The first trace may include multiple fetch groups and multiple control transfer instructions. Arbitration circuitry may select from among multiple predictors and the trace next fetch predictor may have priority in response to a trace cache hit. Disclosed techniques may advantageously improve overall fetch bandwidth in the context of trace cache hits.
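The priority rule in this abstract amounts to a small arbitration function. The sketch below is a hedged illustration: the function name, the fall-back order after the trace predictor, and the 64-byte sequential fetch size are assumptions, and the abstract does not say how the other predictors are ranked.

```python
def arbitrate_next_fetch(current_pc, trace_cache_hit, trace_nfp, nfp, fetch_bytes=64):
    """Pick the next fetch address from several predictors.

    trace_nfp: address predicted by the trace next fetch predictor, or None.
    nfp: address predicted by a conventional next fetch predictor, or None.
    Returns (source, address).
    """
    # On a trace cache hit, the trace next fetch predictor has priority.
    if trace_cache_hit and trace_nfp is not None:
        return ("trace_nfp", trace_nfp)
    # Otherwise fall back to the conventional predictor (assumed ordering).
    if nfp is not None:
        return ("nfp", nfp)
    # With no prediction, continue sequentially (assumed fetch size).
    return ("sequential", current_pc + fetch_bytes)
```

Because a trace may span several fetch groups and branches, the address chosen on a hit is the one that follows the whole trace, so a single prediction covers more instructions than a per-fetch-group prediction would.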
-
Publication No.: US20250021333A1
Publication date: 2025-01-16
Application No.: US18352323
Filing date: 2023-07-14
Applicant: Apple Inc.
Inventor: Ilhyun Kim, Niket K. Choudhary, Muawya M. Al-Otoom, Pruthivi Vuyyuru, Ronald P. Hall
Abstract: Disclosed techniques relate to trace caches. Trace cache circuitry may identify traces that satisfy one or more criteria. Generally, internal branches of a trace should satisfy a threshold bias level in a particular direction. To achieve this goal, the processor may initially assume that branches meet the threshold, track their usefulness in the trace context over time, and prevent inclusion of branches that fall below a usefulness threshold (which indicates that those branches are not sufficiently biased). Branches that do not meet the threshold may be added to a Bloom filter, for example. Usefulness may be tracked during trace training, when valid in a trace cache, or both.
-
Publication No.: US20240028339A1
Publication date: 2024-01-25
Application No.: US17814729
Filing date: 2022-07-25
Applicant: Apple Inc.
Inventor: Niket K. Choudhary, Mary D. Brown, Ethan R. Schuchman, Ronald P. Hall, Ian D. Kountanis, Douglas C. Holman, Ilhyun Kim, Abhishek Kumar, Siavash Zangeneh Kamali
IPC: G06F9/38, G06F12/0875
CPC classification number: G06F9/3802, G06F12/0875, G06F2212/452
Abstract: An apparatus includes an instruction cache circuit and an instruction fetch circuit. The instruction fetch circuit is configured to retrieve, from the instruction cache circuit, a fetch group that includes a plurality of instructions for execution by a processing circuit, and to make a determination that the fetch group includes a control transfer instruction that is predicted to be taken. A target address associated with the control transfer instruction is directed to an instruction within the fetch group. The instruction fetch circuit is further configured to, based on the determination, alter instructions within the fetch group in a manner that is based on a type of the control transfer instruction.