Abstract:
PROBLEM TO BE SOLVED: To provide a unified register renaming mechanism targeting various instruction types in a microprocessor. SOLUTION: This universal renaming mechanism renames the destinations of the various instruction types using a single rename structure. An instruction that updates a floating point register (FPR) can thereby be renamed together with an instruction that updates a general purpose register (GPR) or a vector multimedia extension (VMX) instruction register (VR), using the same rename structure, because the number of architected states for the GPR is the same as the number of architected states for the FPR and the VR. Each destination tag (DTAG) is allocated to one destination, and a fixed point instruction is allocated to the next DTAG. Because the universal renaming mechanism provides a single rename structure for all instruction types, considerable silicon area and power are saved. COPYRIGHT: (C)2009,JPO&INPIT
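The idea of one tag pool serving every register type can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the patented design: the class name `UnifiedRenamer`, the in-order free list, and the pool size are all hypothetical.

```python
class UnifiedRenamer:
    """Toy model of one rename structure shared by all register types."""

    REG_TYPES = ("GPR", "FPR", "VR")

    def __init__(self, num_dtags=8):
        # Assumption: DTAGs are handed out in order from a single free list.
        self.free_dtags = list(range(num_dtags))
        # (register type, register number) -> DTAG currently holding it
        self.mapping = {}

    def rename(self, reg_type, reg_num):
        """Allocate the next free DTAG to one destination, whatever its type."""
        if reg_type not in self.REG_TYPES:
            raise ValueError(f"unknown register type: {reg_type}")
        if not self.free_dtags:
            return None  # no tag available; a real pipeline would stall
        dtag = self.free_dtags.pop(0)
        self.mapping[(reg_type, reg_num)] = dtag
        return dtag

    def release(self, dtag):
        """Return a DTAG to the pool and drop any mapping that used it."""
        self.mapping = {k: v for k, v in self.mapping.items() if v != dtag}
        self.free_dtags.append(dtag)
```

In this model a floating point destination and a fixed point destination simply receive consecutive tags from the same pool, so no per-type rename table is needed.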
Abstract:
PROBLEM TO BE SOLVED: To provide a configurable microprocessor which combines a plurality of corelets into a single microprocessor core to handle highly compute-intensive workloads. SOLUTION: The process for forming the single microprocessor core first selects two or more corelets from the plurality of corelets. The process combines the resources of the two or more corelets to form combined resources, wherein each combined resource comprises a larger amount of a resource than is available to each individual corelet. The process then forms a single microprocessor core from the two or more corelets by assigning the combined resources to the single microprocessor core, wherein the combined resources are dedicated to the single microprocessor core, and wherein the single microprocessor core processes instructions with the dedicated combined resources. COPYRIGHT: (C)2008,JPO&INPIT
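The select, combine, and assign steps can be sketched in Python as follows. The resource names (`issue_queue_entries`, `rename_registers`, `cache_kb`) and the choice to combine by simple addition are assumptions made for illustration; the abstract does not list specific resources or say how they are merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Corelet:
    """Hypothetical description of the resources owned by one corelet."""
    name: str
    issue_queue_entries: int
    rename_registers: int
    cache_kb: int


def combine_corelets(corelets):
    """Form one core whose resources are the pooled resources of the corelets."""
    if len(corelets) < 2:
        raise ValueError("at least two corelets are needed to form a combined core")
    return Corelet(
        name="+".join(c.name for c in corelets),
        issue_queue_entries=sum(c.issue_queue_entries for c in corelets),
        rename_registers=sum(c.rename_registers for c in corelets),
        cache_kb=sum(c.cache_kb for c in corelets),
    )
```

Each combined resource in the result is larger than the corresponding resource of any single input corelet, which is the property the abstract describes.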
Abstract:
PROBLEM TO BE SOLVED: To execute a plurality of instructions simultaneously, thereby using hardware resources efficiently and increasing overall processor throughput. SOLUTION: A resource vector representing the required resources is encoded into a resource field, and the resource field is decoded at a later stage to derive the resource vector. The resource field is stored in an instruction cache in association with the respective program instruction. The processor operates in a simultaneous multithreading mode. When resource availability equals or exceeds the resource requirement of an instruction group, the instructions of that group are dispatched simultaneously to the hardware resources. A start bit is inserted into one of the program instructions to define the instruction group. The hardware resources are, in particular, execution units such as a fixed-point unit 56, a load/store unit 58, a floating-point unit 60, or a branch processing unit 61. COPYRIGHT: (C)2006,JPO&NCIPI
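The encode, decode, and dispatch check can be sketched in Python. The field layout (two bits per execution unit, packed into one integer) and the unit abbreviations are assumptions chosen for the example; the abstract does not specify the encoding.

```python
UNITS = ("FXU", "LSU", "FPU", "BPU")  # fixed-point, load/store, floating-point, branch
BITS_PER_UNIT = 2                     # assumed field width per unit


def encode(resource_vector):
    """Pack a {unit: count} resource vector into an integer resource field."""
    field = 0
    for i, unit in enumerate(UNITS):
        count = resource_vector.get(unit, 0)
        if not 0 <= count < (1 << BITS_PER_UNIT):
            raise ValueError(f"count for {unit} does not fit in the field")
        field |= count << (i * BITS_PER_UNIT)
    return field


def decode(field):
    """Recover the {unit: count} resource vector from a resource field."""
    mask = (1 << BITS_PER_UNIT) - 1
    return {unit: (field >> (i * BITS_PER_UNIT)) & mask
            for i, unit in enumerate(UNITS)}


def group_requirement(fields):
    """Sum the decoded resource vectors of every instruction in a group."""
    total = dict.fromkeys(UNITS, 0)
    for field in fields:
        for unit, count in decode(field).items():
            total[unit] += count
    return total


def can_dispatch(available, fields):
    """True when availability equals or exceeds the group's requirement."""
    need = group_requirement(fields)
    return all(available.get(unit, 0) >= need[unit] for unit in UNITS)
```

A group is dispatched together only if every unit type has at least as many free instances as the group needs; otherwise the group would have to wait or be split.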
Abstract:
PROBLEM TO BE SOLVED: To provide a method and a device that meet the timing requirements of high-frequency designs and allow operands to be accessed in a single cycle. SOLUTION: An operand buffer has a plurality of entries, each of which is allocated to an instruction in an issue queue. The operand buffer has the same number of entries as the issue queue. An architected register file and a register file for temporary data provide its inputs. When an entry is written, its data is written into the operand buffer from the register file. Once the instruction has executed, the corresponding entry in the operand buffer is no longer needed and is deallocated. The operand buffer has fewer entries than the register file. The operand access stage therefore reads the operand buffer rather than the register file, and the operand buffer can be read in one cycle.
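The life cycle of an operand buffer entry (allocate, read, deallocate) can be sketched in Python. This is a behavioural toy under assumptions: the register file is modelled as a plain dictionary, and the method names are hypothetical. It says nothing about cycle timing, which depends on the hardware.

```python
class OperandBuffer:
    """Toy operand buffer with one entry per issue queue slot."""

    def __init__(self, issue_queue_size):
        # Same number of entries as the issue queue.
        self.entries = [None] * issue_queue_size

    def allocate(self, slot, source_regs, register_file):
        """Copy operand values from the register file when the entry is written."""
        self.entries[slot] = tuple(register_file[r] for r in source_regs)

    def read(self, slot):
        """Operand access reads this small buffer, not the register file."""
        operands = self.entries[slot]
        if operands is None:
            raise KeyError(f"slot {slot} is not allocated")
        return operands

    def deallocate(self, slot):
        """Free the entry once its instruction has executed."""
        self.entries[slot] = None
```

Because the buffer only needs as many entries as the issue queue, it stays much smaller than the register file it is filled from.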
Abstract:
PROBLEM TO BE SOLVED: To improve processor performance by completing many more instructions per cycle. SOLUTION: In a superscalar processor, each execution unit has an associated completion table that holds a copy of the status of all instructions that have been dispatched but not yet completed. A central completion table 132 holds the status of all dispatched instructions as reported by the dispatch unit and the individual execution units. The central completion table 132 retires instructions that may cause an interrupt and instructions whose results may target the same register. The completion tables associated with the execution units retire the remaining instructions, and the execution units send instruction status to the central completion table 132 and to each execution unit. As a result, the number of instructions retired by the central completion table 132 is reduced, and the number of instructions completed per clock cycle is increased.
Abstract:
A system and process for managing thread transitions may include the ability to determine that a transition is to be made regarding the relative use of two data register sets and determine, based on the transition determination, whether to move thread data in at least one of the data register sets to second-level registers. The system and process may also include the ability to move the thread data from at least one data register set to second-level registers based on the move determination.
Abstract:
A system and process for managing thread execution includes providing two data register sets coupled to a processor and using, by the processor, the two register sets as first-level registers for thread execution. A portion of main memory or cache memory is assigned as second-level registers where the second-level registers serve as registers of at least one of the two data register sets for executing the threads. Data for the threads may be moved between the first-level registers and second-level registers for different modes of thread processing.
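Moving thread data between the two levels can be sketched in Python. The sketch assumes two small first-level register sets and models the second-level registers as a dictionary standing in for the assigned region of main or cache memory; the class and method names (`RegisterLevels`, `spill`, `fill`) are hypothetical.

```python
class RegisterLevels:
    """Toy model of two first-level register sets backed by second-level registers."""

    def __init__(self, regs_per_set=4):
        # Two data register sets used directly by the processor.
        self.first_level = [[0] * regs_per_set, [0] * regs_per_set]
        # Stand-in for the memory or cache region assigned as second-level registers.
        self.second_level = {}

    def spill(self, set_index, thread_id):
        """Copy a thread's data from a first-level set to the second level."""
        self.second_level[thread_id] = list(self.first_level[set_index])

    def fill(self, set_index, thread_id):
        """Bring a thread's data back from the second level into a first-level set."""
        self.first_level[set_index] = list(self.second_level.pop(thread_id))
```

A mode change could, for example, spill one thread from a register set so another thread can use that set, then fill it back when the mode changes again.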
Abstract:
A system and process for managing thread transitions may include the ability to determine that a transition is to be made regarding the relative use of two data register sets and to determine, based on the transition determination, whether to move thread data in at least one of the data register sets to second-level registers. The system and process may also include the ability to move the thread data from at least one data register set to second-level registers based on the move determination.
Abstract:
Each execution unit within a superscalar processor has an associated completion table that contains a copy of the status of all instructions dispatched but not completed. A central completion table maintains the status of every dispatched instruction as reported by the dispatch unit and the individual execution units. Execution units send finish signals to the completion table responsible for retiring a particular type of instruction. The central completion table retires instructions that may cause an interrupt and instructions whose results may target the same register. The execution units' associated completion tables retire the balance of the instructions and the execution units send instruction status to the central completion table and to each execution unit. This reduces the number of instructions that are retired by the central completion table, increasing the number of instructions retired per clock cycle.
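The split of retirement work between the central table and the per-unit tables can be sketched in Python. The sketch assumes each in-flight instruction carries a flag for "may cause an interrupt" and a single target register name, and treats two in-flight instructions with the same target as a conflict; these fields and the function name `assign_retirement` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical record for one dispatched, not yet retired instruction."""
    tag: int
    unit: str            # execution unit that runs the instruction
    target: str          # destination register name
    may_interrupt: bool = False


def assign_retirement(instrs):
    """Map each instruction tag to 'central' or to its own unit's completion table."""
    target_counts = Counter(i.target for i in instrs)
    plan = {}
    for i in instrs:
        shares_target = target_counts[i.target] > 1
        if i.may_interrupt or shares_target:
            plan[i.tag] = "central"   # needs the central table's global view
        else:
            plan[i.tag] = i.unit      # safe for the unit's own table to retire
    return plan
```

Only the instructions that need a global view go through the central table; the rest retire locally, which is how the described scheme raises the number of instructions retired per clock cycle.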