Abstract:
PROBLEM TO BE SOLVED: To provide a method for issuing instructions from an issue queue. SOLUTION: A processor includes the issue queue that can advance instructions toward issue even though some instructions in the queue are not ready-to-issue. The issue queue includes a matrix of storage cells configured in rows and columns which are coupled to execution units. Instructions advance toward issuance from row to row as unoccupied storage cells appear. Unoccupied cells appear when instructions advance toward a first row and upon issuance. When a particular row includes an instruction that is not ready-to-issue, a stall condition occurs for that instruction. However, to prevent the entire issue queue and the processor from stalling, a ready-to-issue instruction in another row may bypass the row including the stalled or not-ready-to-issue instruction. Out-of-order issuance of instructions to the execution units thus continues. COPYRIGHT: (C)2007,JPO&INPIT
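The bypass behavior described in this abstract can be sketched roughly as follows. This is a simplified model, not the patented circuit: the names `Instr`, `issue_cycle`, and `advance` are hypothetical, the number of execution units is not modeled, and advancing within the same column is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    name: str
    ready: bool  # True when the instruction is ready-to-issue


# A row is a list of storage cells; None marks an unoccupied cell.
Row = List[Optional[Instr]]


def issue_cycle(rows: List[Row]) -> List[str]:
    """Issue ready instructions from the first row (rows[0]).

    If the first row holds a not-ready (stalled) instruction, ready
    instructions in later rows are allowed to bypass it.
    """
    issued = []
    first_row_stalled = any(i is not None and not i.ready for i in rows[0])
    candidate_rows = rows if first_row_stalled else rows[:1]
    for row in candidate_rows:
        for col, instr in enumerate(row):
            if instr is not None and instr.ready:
                issued.append(instr.name)
                row[col] = None  # issuance leaves an unoccupied cell
    return issued


def advance(rows: List[Row]) -> None:
    """Move instructions one row toward the first row where a cell is free.

    Assumption: an instruction only moves within its own column.
    """
    for r in range(1, len(rows)):
        for col, instr in enumerate(rows[r]):
            if instr is not None and rows[r - 1][col] is None:
                rows[r - 1][col] = instr
                rows[r][col] = None
```

In this sketch a stalled instruction stays in place while ready instructions behind it still issue, so the queue as a whole keeps making progress.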
Abstract:
PROBLEM TO BE SOLVED: To provide a system and a method for sequencing multicycle non-pipelined instructions. SOLUTION: In the system and the method, when a non-pipelined instruction is detected at the issue point, the issue logic initiates a stall for the minimum number of cycles needed to complete the fastest non-pipelined instruction. The execution unit then takes over the stall and holds it until the non-pipelined instruction has actually completed. Shortly before the instruction completes, the execution unit releases the stall to the issue logic. COPYRIGHT: (C)2007,JPO&INPIT
Abstract:
PROBLEM TO BE SOLVED: To obtain a method that prevents errors caused by a collision between a preload and a store instruction. SOLUTION: A processor 100 includes a preload queue 160 that stores a plurality of preload entries. Each preload entry is associated with a preload instruction and includes a defined address, a byte count, and an associated identifier. A comparison unit 170 associated with the preload queue 160 identifies each preload entry whose preload instruction collides with an older store instruction. The oldest preload instruction among these colliding entries is designated the target preload. To correct the collision between the target preload and the store instruction, the target preload and all instructions executed after it are flushed.
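The selection of the target preload can be illustrated with a short sketch. It assumes that the identifier grows with program order (a smaller identifier means an older instruction) and that a collision means the two byte ranges overlap; the abstract states neither detail, and the names `PreloadEntry`, `ranges_overlap`, and `find_target_preload` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PreloadEntry:
    ident: int       # assumed: smaller identifier = older instruction
    address: int     # start address of the preload
    byte_count: int  # number of bytes the preload covers


def ranges_overlap(a_addr: int, a_len: int, b_addr: int, b_len: int) -> bool:
    """True when the two half-open byte ranges share at least one byte."""
    return a_addr < b_addr + b_len and b_addr < a_addr + a_len


def find_target_preload(queue: List[PreloadEntry], store_ident: int,
                        store_addr: int, store_len: int) -> Optional[PreloadEntry]:
    """Return the oldest preload that collides with an older store, if any."""
    colliding = [
        e for e in queue
        if e.ident > store_ident  # the store is older than the preload
        and ranges_overlap(e.address, e.byte_count, store_addr, store_len)
    ]
    return min(colliding, key=lambda e: e.ident, default=None)
```

Once a target is found, the recovery step in this model would flush every instruction whose identifier is greater than or equal to the target's.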
Abstract:
PROBLEM TO BE SOLVED: To forward store data to a load instruction that needs it, without stalling the load instruction until the store completes, by forwarding the store data when the store instruction has already been translated, the load address range is contained within the store address range, and the store data is available. SOLUTION: In a method for forwarding data from a store instruction that has not yet updated the data to a load instruction, a CPU 120 determines whether the address of the load instruction and the address of the store instruction share a common byte. It is further determined whether the load instruction logically follows the store instruction. When the two addresses share a common byte and the load instruction logically follows the store instruction, the data is forwarded to the load instruction.
Abstract:
An issue unit for placing a processor into a gradual slow down mode of operation is provided. The gradual slow down mode of operation comprises a plurality of stages of slow down operation of an issue unit in a processor in which the issuance of instructions is slowed in accordance with a staging scheme. The gradual slow down of the processor allows the processor to break out of livelock conditions. Moreover, since the slow down is gradual, the processor may flexibly avoid various degrees of livelock conditions. The mechanisms of the illustrative embodiments impact the overall processor performance based on the severity of the livelock condition by taking a small performance impact on less severe livelock conditions and only increasing the processor performance impact when the livelock condition is more severe.
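A staged slowdown of this kind might be modeled as below. The abstract does not give the staging scheme, so the interval table, the rule of escalating by one stage per detected livelock, and the reset to full speed once the livelock clears are all assumptions; `STAGE_ISSUE_INTERVAL`, `next_stage`, and `may_issue` are hypothetical names.

```python
# Hypothetical staging scheme: in stage s the issue unit issues on one
# cycle out of every STAGE_ISSUE_INTERVAL[s] cycles. Stage 0 is full speed.
STAGE_ISSUE_INTERVAL = [1, 2, 4, 8]


def next_stage(stage: int, livelock_detected: bool) -> int:
    """Escalate one stage while livelock persists; otherwise return to stage 0."""
    if not livelock_detected:
        return 0  # assumption: full speed resumes once the livelock clears
    return min(stage + 1, len(STAGE_ISSUE_INTERVAL) - 1)


def may_issue(stage: int, cycle: int) -> bool:
    """True on the cycles in which the issue unit may issue at this stage."""
    return cycle % STAGE_ISSUE_INTERVAL[stage] == 0
```

Under these assumptions a mild livelock costs only the first stage's slowdown, and the larger intervals apply only if the condition persists.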
Abstract:
In a superscalar processor (210) implementing out-of-order dispatching and execution of load and store instructions, when a store instruction has already been translated, the load address range of a load instruction is contained within the address range of the store instruction, and the data associated with the store instruction is available, then the data associated with the store instruction is forwarded to the load instruction so that the load instruction may continue execution without having to be stalled or flushed.
Abstract:
A system and method are provided for improving the throughput of an in-order multithreading processor. A dependent instruction that follows at least one long-latency instruction with register dependencies from a first thread is identified. The dependent instruction is recycled by providing it to an earlier pipeline stage. The dependent instruction is delayed at dispatch. The completion of the long-latency instruction from the first thread is detected. An alternate thread is allowed to issue one or more instructions while the long-latency instruction is being executed.
Abstract:
In a superscalar processor implementing out-of-order dispatching and execution of load and store instructions, when a store instruction has already been translated, the load address range of a load instruction is contained within the address range of the store instruction, and the data associated with the store instruction is available, then the data associated with the store instruction is forwarded to the load instruction so that the load instruction may continue execution without having to be stalled or flushed.
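The three forwarding conditions named in this abstract can be written as a small check. This is a minimal sketch rather than the patented logic: `StoreEntry` and `try_forward` are hypothetical names, and the byte-level data layout is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoreEntry:
    address: int            # start address of the store
    length: int             # number of bytes written by the store
    translated: bool        # True once the store address has been translated
    data: Optional[bytes]   # None until the store data is available


def try_forward(store: StoreEntry, load_addr: int, load_len: int) -> Optional[bytes]:
    """Return the bytes to forward to the load, or None if forwarding is not allowed."""
    # Conditions 1 and 3: the store is translated and its data is available.
    if not store.translated or store.data is None:
        return None
    # Condition 2: the load range lies entirely within the store range.
    start = load_addr - store.address
    if start < 0 or load_addr + load_len > store.address + store.length:
        return None
    return store.data[start:start + load_len]
```

When `try_forward` returns `None` in this model, the load would fall back to whatever the processor otherwise does, such as stalling or being flushed.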