Selective use of branch prediction hints

    Publication No.: US12282778B2

    Publication date: 2025-04-22

    Application No.: US18479974

    Filing date: 2023-10-03

    Abstract: Embodiments of apparatuses, methods, and systems for selective use of branch prediction hints are described. In an embodiment, an apparatus includes an instruction decoder and a branch predictor. The instruction decoder is to decode a branch instruction having a hint. The branch predictor is to provide a prediction and a hint-override indicator. The hint-override indicator is to indicate whether the prediction is based on stored information about the branch instruction. The prediction is to override the hint if the hint-override indicator indicates that the prediction is based on stored information about the branch instruction.
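
The override rule in this abstract can be sketched in a few lines. This is a toy software model, not the claimed hardware: `Prediction`, `ToyPredictor`, and `resolve_direction` are hypothetical names, and the "stored information" is assumed to be a plain dictionary of last-seen outcomes keyed by branch address.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    taken: bool          # direction guessed by the predictor
    hint_override: bool  # True when the guess rests on stored information about this branch


class ToyPredictor:
    """Hypothetical predictor: remembers the last outcome per branch address."""

    def __init__(self):
        self.history = {}

    def predict(self, address: int) -> Prediction:
        if address in self.history:
            # Stored information exists, so the prediction may override the hint.
            return Prediction(taken=self.history[address], hint_override=True)
        # Nothing stored: assumed static not-taken guess, no override.
        return Prediction(taken=False, hint_override=False)

    def update(self, address: int, taken: bool) -> None:
        self.history[address] = taken


def resolve_direction(hint_taken: bool, prediction: Prediction) -> bool:
    """Follow the prediction only when the hint-override indicator is set."""
    if prediction.hint_override:
        return prediction.taken
    return hint_taken
```

With an empty predictor the hint decides the direction; once `update` has recorded an outcome for the same address, the stored outcome wins over the hint.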

    Apparatuses, methods, and systems to precisely monitor memory store accesses

    Publication No.: US11915000B2

    Publication date: 2024-02-27

    Application No.: US18160600

    Filing date: 2023-01-27

    Abstract: Systems, methods, and apparatuses relating to circuitry to precisely monitor memory store accesses are described. In one embodiment, a system includes a memory, a hardware processor core comprising a decoder to decode an instruction into a decoded instruction, an execution circuit to execute the decoded instruction to produce a resultant, a store buffer, and a retirement circuit to retire the instruction when a store request for the resultant from the execution circuit is queued into the store buffer for storage into the memory, and a performance monitoring circuit to mark the retired instruction for monitoring of post-retirement performance information between being queued in the store buffer and being stored in the memory, enable a store fence after the retired instruction to be inserted that causes previous store requests to complete within the memory, and on detection of completion of the store request for the instruction in the memory, store the post-retirement performance information in storage of the performance monitoring circuit.
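
One way to picture the flow described here (retire on enqueue, mark the store, fence, record on completion) is a small cycle-counting model. Everything in it is an assumption made for illustration: the class `StoreBufferModel`, the fixed per-store drain cost, and the choice of "cycles between enqueue and memory write" as the post-retirement information.

```python
from collections import deque


class StoreBufferModel:
    """Toy model of a store buffer with post-retirement monitoring (hypothetical)."""

    def __init__(self):
        self.cycle = 0
        self.pending = deque()  # entries: (store_id, address, value, queued_cycle, marked)
        self.memory = {}
        self.records = []       # (store_id, cycles between enqueue and memory write)

    def retire_store(self, store_id, address, value, marked=False):
        # The instruction counts as retired once its store is queued.
        self.pending.append((store_id, address, value, self.cycle, marked))

    def tick(self, cycles=1):
        self.cycle += cycles

    def fence(self, drain_cost=1):
        # A store fence drains every earlier store, in order, into memory.
        while self.pending:
            store_id, address, value, queued, marked = self.pending.popleft()
            self.cycle += drain_cost  # assumed fixed cost per store
            self.memory[address] = value
            if marked:
                # Completion detected for a monitored store: save its record.
                self.records.append((store_id, self.cycle - queued))
```

Only marked stores produce a record, and the record is written when the store reaches memory rather than when the instruction retires.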

    INSTRUCTION AND LOGIC FOR TRACKING FETCH PERFORMANCE BOTTLENECKS

    Publication No.: US20220261246A1

    Publication date: 2022-08-18

    Application No.: US17675962

    Filing date: 2022-02-18

    Inventor: Ahmad Yasin

    Abstract: A processor includes a front end, an execution unit, a retirement stage, a counter, and a performance monitoring unit. The front end includes logic to receive an event instruction to enable supervision of a front end event that will delay execution of instructions. The execution unit includes logic to set a register with parameters for supervision of the front end event. The front end further includes logic to receive a candidate instruction and match the candidate instruction to the front end event. The counter includes logic to generate the front end event upon retirement of the candidate instruction.
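
The tag-at-fetch, count-at-retirement idea can be illustrated with a short model. It is a sketch under stated assumptions: `Instruction`, `FrontEndMonitor`, and the single `min_delay` parameter are hypothetical, and a fetch delay measured in cycles stands in for whatever front end event the hardware supervises.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    name: str
    fetch_delay: int      # cycles the front end stalled before delivering it (assumed metric)
    tagged: bool = False  # set when the instruction matches the supervised event


class FrontEndMonitor:
    """Toy monitor: tags candidates at fetch, counts them only if they retire."""

    def __init__(self):
        self.min_delay = None  # None means supervision is disabled
        self.count = 0

    def enable(self, min_delay: int) -> None:
        # Stands in for the register write that sets the supervision parameters.
        self.min_delay = min_delay

    def on_fetch(self, insn: Instruction) -> None:
        if self.min_delay is not None and insn.fetch_delay >= self.min_delay:
            insn.tagged = True

    def on_retire(self, insn: Instruction) -> None:
        # The event is generated at retirement, so squashed instructions never count.
        if insn.tagged:
            self.count += 1
```

Counting at retirement rather than at fetch means an instruction that was delayed but later discarded (for example on a mispredicted path) does not inflate the event count.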

    APPARATUS AND METHOD FOR ADAPTIVELY SCHEDULING WORK ON HETEROGENEOUS PROCESSING RESOURCES

    Publication No.: US20210200656A1

    Publication date: 2021-07-01

    Application No.: US16728617

    Filing date: 2019-12-27

    Abstract: An apparatus and method for intelligently scheduling threads across a plurality of logical processors. For example, one embodiment of a processor comprises: a plurality of logical processors comprising one or more of a first logical processor type and a second logical processor type, the first logical processor type associated with a first core type and the second logical processor type associated with a second core type; a scheduler to schedule a plurality of threads for execution on the plurality of logical processors in accordance with performance data associated with the plurality of threads; wherein if the performance data indicates that a new thread should be executed on a logical processor of the first logical processor type, but all logical processors of the first logical processor type are busy, the scheduler to determine whether to migrate a second thread from the logical processors of the first logical processor type to a logical processor of the second logical processor type based on an evaluation of first and second performance values associated with execution of the first thread on the first or second logical processor types, respectively, and further based on an evaluation of third and fourth performance values associated with execution of the second thread on the first or second logical processor types, respectively.
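
The migration decision compares four performance values. The abstract does not say how they are combined, so the rule below (migrate when the new thread gains more from the first processor type than the running thread would lose by leaving it) is an assumption, and `should_migrate` is a hypothetical name.

```python
def should_migrate(new_on_first: float, new_on_second: float,
                   running_on_first: float, running_on_second: float) -> bool:
    """Decide whether to move the running thread to the second processor type.

    Arguments correspond to the first to fourth performance values in the
    abstract; higher means better performance. The comparison rule itself
    is an illustrative assumption.
    """
    new_gain = new_on_first - new_on_second              # benefit to the new thread
    running_loss = running_on_first - running_on_second  # cost to the displaced thread
    return new_gain > running_loss
```

For example, `should_migrate(10, 4, 6, 5)` returns `True` because the new thread gains 6 while the running thread loses only 1.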

    PERFORMANCE MONITORING IN HETEROGENEOUS SYSTEMS

    Publication No.: US20210200580A1

    Publication date: 2021-07-01

    Application No.: US16729370

    Filing date: 2019-12-28

    Abstract: Embodiments of apparatuses, methods, and systems for performance monitoring in heterogeneous systems are described. In an embodiment, an apparatus includes a plurality of performance counters to generate a plurality of unweighted event counts; a weights storage to store a plurality of weight values, each weight value corresponding to an unweighted event count; a plurality of weighting units, each weighting unit to weight a corresponding unweighted event count based on a corresponding weight value to generate one of a plurality of weighted event counts; and a work counter to receive the weighted event counts and generate a measured work amount.
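
The counter, weight, and work-counter pipeline amounts to a weighted sum. A minimal sketch, assuming multiplication as the weighting step and addition as the way the work counter combines its inputs (the abstract fixes neither); `measured_work` is a hypothetical name.

```python
def measured_work(event_counts, weights):
    """Combine unweighted event counts into one measured work amount.

    Each count is scaled by its corresponding weight (assumed: multiplication),
    and the work counter sums the weighted counts (assumed: addition).
    """
    if len(event_counts) != len(weights):
        raise ValueError("each event count needs exactly one weight")
    weighted_counts = [count * weight for count, weight in zip(event_counts, weights)]
    return sum(weighted_counts)
```

With counts `[100, 20, 5]` and weights `[1, 4, 10]`, the weighted counts are 100, 80, and 50, giving a measured work amount of 230.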

    Apparatus and method for multithreading-aware performance monitoring events

    Publication No.: US10997048B2

    Publication date: 2021-05-04

    Application No.: US15395903

    Filing date: 2016-12-30

    Inventor: Ahmad Yasin

    Abstract: An apparatus and method are described for a multithreaded-aware performance monitor of a processor. For example, one embodiment of a processor comprises: one or more simultaneous multithreading cores to simultaneously execute multiple instruction threads; a plurality of performance monitor counters, each performance monitor counter to count baseline events during processing of the multiple instruction threads; and a performance monitor circuit to determine whether multiple threads are concurrently generating the same baseline event and, if so, then the performance monitor circuit to distribute the count of the baseline event for only one of the multiple threads in each processor cycle for which the multiple threads are active and the baseline event applies to.
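
The distribution rule (charge a shared event to only one thread per cycle) can be modeled as follows. The abstract does not specify how the one thread is chosen, so the round-robin pick here is an assumption, and `distribute_counts` is a hypothetical name.

```python
def distribute_counts(cycles, num_threads=2):
    """Attribute a baseline event to exactly one thread per cycle.

    `cycles` is a sequence of sets; each set holds the ids of the threads
    generating the event in that cycle. When several threads generate it
    concurrently, one is picked in rotation (assumed policy).
    """
    counts = [0] * num_threads
    next_pick = 0  # thread id favored at the next conflict
    for active in cycles:
        if not active:
            continue
        if len(active) == 1:
            (tid,) = active
        else:
            candidates = sorted(active)
            # First candidate at or after next_pick, wrapping to the lowest id.
            tid = next((t for t in candidates if t >= next_pick), candidates[0])
            next_pick = (tid + 1) % num_threads
        counts[tid] += 1
    return counts
```

The per-thread counts then sum to the number of cycles in which the event occurred, so a cycle where both threads stall is not counted twice.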

    INSTRUCTION AND LOGIC FOR TRACKING FETCH PERFORMANCE BOTTLENECKS

    Publication No.: US20200319883A1

    Publication date: 2020-10-08

    Application No.: US16831007

    Filing date: 2020-03-26

    Inventor: Ahmad Yasin

    Abstract: A processor includes a front end, an execution unit, a retirement stage, a counter, and a performance monitoring unit. The front end includes logic to receive an event instruction to enable supervision of a front end event that will delay execution of instructions. The execution unit includes logic to set a register with parameters for supervision of the front end event. The front end further includes logic to receive a candidate instruction and match the candidate instruction to the front end event. The counter includes logic to generate the front end event upon retirement of the candidate instruction.
