Coprocessor with Distributed Register
    11.
    发明申请

    公开(公告)号:US20200272467A1

    公开(公告)日:2020-08-27

    申请号:US16286213

    申请日:2019-02-26

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor includes multiple processing elements arranged in a grid of one or more rows and one or more columns. A given processing element includes an arithmetic/logic unit (ALU) circuit configured to perform an ALU operation specified by an instruction executable by the coprocessor, wherein the ALU circuit is configured to produce a result. The given processing element further comprises a first memory coupled to the execute circuit. The first memory is configured to store results generated by the given processing element. The first memory includes a portion of a result memory implemented by the coprocessor, wherein locations in the result memory are specifiable as destination operands of instructions executable by the coprocessor. The portion of the result memory implemented by the first memory is the portion of the result memory that the given processing element is capable of updating.

    Re-use of Speculative Load Instruction Results from Wrong Path

    公开(公告)号:US20240354109A1

    公开(公告)日:2024-10-24

    申请号:US18305151

    申请日:2023-04-21

    Applicant: Apple Inc.

    CPC classification number: G06F9/3802 G06F9/30043 G06F9/3016 G06F9/3861

    Abstract: Disclosed techniques relate to re-use of speculative results from an incorrect execution path. In some embodiments, when a control transfer instruction is mispredicted, a load instruction may have been executed on the wrong path. In disclosed embodiments, result storage circuitry records information that indicates destination registers of speculatively-executed load instructions including a first load instruction. Control flow tracker circuitry may store information indicating a reconvergence point for the control transfer instruction. Re-use control circuitry may track registers written by instructions prior to the reconvergence point, determine that the first load instruction does not depend on data from any instruction between the control transfer instruction and the reconvergence point, and use, as a result of the first load instruction, a value from a recorded destination register that was written based on speculative execution of the first load, notwithstanding the misprediction of the control transfer instruction.

    Zero cycle load bypass in a decode group

    公开(公告)号:US11416254B2

    公开(公告)日:2022-08-16

    申请号:US16705023

    申请日:2019-12-05

    Applicant: Apple Inc.

    Abstract: Systems, apparatuses, and methods for implementing zero cycle load bypass operations are described. A system includes a processor with at least a decode unit, control logic, mapper, and free list. When a load operation is detected, the control logic determines if the load operation qualifies to be converted to a zero cycle load bypass operation. Conditions for qualifying include the load operation being in the same decode group as an older store operation to the same address. Qualifying load operations are converted to zero cycle load bypass operations. A lookup of the free list is prevented for a zero cycle load bypass operation and a destination operand of the load is renamed with a same physical register identifier used for a source operand of the store. Also, the data of the store is bypassed to the load.

    Program counter capturing
    15.
    发明授权

    公开(公告)号:US09952863B1

    公开(公告)日:2018-04-24

    申请号:US14842421

    申请日:2015-09-01

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to capturing information related to instructions executing on in a processor. In one embodiment, an integrated circuit is disclosed that includes an execution pipeline configured to execute a sequence of instructions. The integrated circuit includes monitoring circuitry configured to monitor the execution pipeline for occurrences of an event associated with the sequence of instructions, and in response to detecting a particular number of occurrences of the event, capture a value of a program counter corresponding to an instruction of the sequence of instructions that is associated with an occurrence of the event. The monitoring circuitry stores the captured value of the program counter in a distinct capture register and signals an interrupt indicating that the captured value of the program counter is retrievable from the capture register. In some embodiments, a debugging application may retrieve the value and present it to a developer attempting perform code profiling.

Patent Agency Ranking