Abstract:
Under the present invention, a branch target address corresponding to a target instruction to be pre-fetched is predicted based on two values. The first value is a "predictor value" that is known for the branch target address. The second value is the address of the branch instruction within the program code that branches to the target instruction. Once these two values are provided, they can be processed (e.g., hashed) to yield an index value, which is used to obtain a predicted branch target address from a cache. This technique is generally implemented for branch instructions such as switch statements or polymorphic calls. In the case of the former, the predictor value is a selector operand, while in the case of the latter the predictor value is a class object address (in Java) or a virtual function table address (in C++).
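The lookup described above can be illustrated with a minimal sketch. The abstract does not specify the hash or the cache organization, so the XOR-and-fold hash, the direct-mapped table, and the names `TargetCache`, `predict`, and `update` are all assumptions made for illustration.

```python
class TargetCache:
    """Toy direct-mapped cache of predicted branch targets (illustrative only)."""

    def __init__(self, index_bits=10):
        self.mask = (1 << index_bits) - 1
        self.entries = [None] * (1 << index_bits)

    def _index(self, predictor_value, branch_address):
        # Assumed hash: XOR the two inputs, fold the high bits down,
        # then mask to the table size. The abstract only says "e.g., hashed".
        mixed = predictor_value ^ branch_address
        mixed ^= mixed >> 16
        return mixed & self.mask

    def predict(self, predictor_value, branch_address):
        # Returns the cached target, or None if this slot was never written.
        return self.entries[self._index(predictor_value, branch_address)]

    def update(self, predictor_value, branch_address, actual_target):
        # Record the resolved target so the next lookup can predict it.
        self.entries[self._index(predictor_value, branch_address)] = actual_target


cache = TargetCache()
# A switch at address 0x4000 with selector operand 3 resolved to 0x4ABC.
cache.update(3, 0x4000, 0x4ABC)
hit = cache.predict(3, 0x4000)    # same selector, same branch
miss = cache.predict(5, 0x4000)   # different selector maps to another slot here
```

Because the index mixes the predictor value with the branch address, a single switch statement or call site can hold a separate prediction for each selector or receiver class, subject to collisions in the table.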
Abstract:
A code sequence made up of multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence, which includes a no-operation (NOP) instruction or has fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
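As a rough sketch of the replacement step, the function below rewrites a two-instruction address computation into a NOP plus a single load when the offset is close enough to the base. The instruction encoding as tuples, the `addis`/`ld` pattern, the 16-bit signed range, and the name `relax_sequence` are assumptions for illustration; the abstract does not name a specific instruction set or distance.

```python
from typing import List, Tuple

Instr = Tuple[str, ...]


def relax_sequence(code: List[Instr], offset: int,
                   max_distance: int = 0x7FFF) -> List[Instr]:
    """Return a shorter-latency replacement for `code`, or `code` unchanged.

    Expected pattern (hypothetical toy encoding):
        ("addis", tmp, base, "hi")   # tmp = base + high part of offset
        ("ld",    dest, "lo", tmp)   # dest = memory[tmp + low part of offset]
    """
    if len(code) != 2 or code[0][0] != "addis" or code[1][0] != "ld":
        return list(code)
    _, tmp, base, _ = code[0]
    _, dest, _, addr_reg = code[1]
    # Semantics check: the load must consume the addis result, and the
    # intermediate register must be overwritten by the load, so dropping
    # the addis cannot change any value a later instruction might read.
    if addr_reg != tmp or tmp != dest:
        return list(code)
    # Distance check: the offset must fit the single-instruction form.
    if not (-max_distance - 1 <= offset <= max_distance):
        return list(code)
    return [("nop",), ("ld", dest, str(offset), base)]


original = [("addis", "r3", "r2", "hi"), ("ld", "r3", "lo", "r3")]
near = relax_sequence(original, 40)       # rewritten to nop + ld
far = relax_sequence(original, 100000)    # left unchanged
```

Keeping a NOP in place of the removed instruction preserves the size and layout of the code, so other addresses in the object file do not need to be adjusted.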
Abstract:
An interprocedural compilation method for aggregating global data variables in external storage to maximize data locality. A weighted interference graph is used in which node weights represent the size of the data stored in each global variable and edges between variables represent access relationships between the globals. Using this information, the global variables can be mapped into aggregates based on frequency of access, while preventing the cumulative data size in any aggregate from exceeding a memory size restriction.
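One way to picture the aggregation is a greedy merge over the graph, sketched below. The abstract does not state the algorithm, so the greedy heaviest-edge-first order, the union-find bookkeeping, and the name `aggregate_globals` are assumptions; edge weights are taken here to be access frequencies.

```python
from typing import Dict, List, Tuple


def aggregate_globals(sizes: Dict[str, int],
                      edges: Dict[Tuple[str, str], int],
                      max_size: int) -> List[List[str]]:
    """Group globals so that frequently co-accessed ones share an aggregate.

    sizes:    node weights (bytes per global variable)
    edges:    (a, b) -> assumed access-frequency weight between a and b
    max_size: size limit for any single aggregate
    """
    parent = {v: v for v in sizes}
    total = dict(sizes)  # cumulative size, valid at each group root

    def find(v: str) -> str:
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    # Visit the heaviest edges first so the hottest pairs are merged first.
    for (a, b), _weight in sorted(edges.items(), key=lambda kv: -kv[1]):
        ra, rb = find(a), find(b)
        if ra != rb and total[ra] + total[rb] <= max_size:
            parent[rb] = ra
            total[ra] += total[rb]

    groups: Dict[str, List[str]] = {}
    for v in sizes:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(g) for g in groups.values())


sizes = {"a": 4, "b": 4, "c": 8, "d": 2}
edges = {("a", "b"): 10, ("b", "c"): 7, ("c", "d"): 3}
aggregates = aggregate_globals(sizes, edges, max_size=12)
```

In this example `a` and `b` merge first; adding `c` would push that aggregate to 16 bytes, over the limit of 12, so `c` is grouped with `d` instead.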
Abstract:
A code sequence made up of multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence, which includes a no-operation (NOP) instruction or has fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
Abstract:
Compiling code for an enhanced application binary interface (ABI) including identifying (602) a code sequence configured to perform a variable address reference table function, such as a table of contents (TOC) function, including an access to a variable at an offset outside of a location in a variable address reference table. The code sequence includes an internal representation (IR) of an instruction that will be expanded to multiple instructions that are adjacent to each other in the object file and that corresponds to a reduced-latency IOP sequence when executed on a decode time instruction optimization (DTIO) enabled microprocessor. A modified scheduler cost function, configured to recognize that the internal representation corresponds to the reduced latency, is used. An object file is generated (606) responsive to the modified scheduler cost function, which includes expanding the IR into multiple adjacent instructions. The object file is emitted (608) for linking by a linker.
Abstract:
A code sequence made up of multiple instructions and specifying an offset from a base address is identified in an object file. The offset from the base address corresponds to an offset location in a memory configured for storing an address of a variable or data. The identified code sequence is configured to perform a memory reference function or a memory address computation function. It is determined that the offset location is within a specified distance of the base address and that a replacement of the identified code sequence with a replacement code sequence will not alter program semantics. The identified code sequence in the object file is replaced with the replacement code sequence, which includes a no-operation (NOP) instruction or has fewer instructions than the identified code sequence. Linked executable code is generated based on the object file and the linked executable code is emitted.
Abstract:
Generating decode time instruction optimization (DTIO) object code that enables a DTIO enabled processor to optimize execution of DTIO instructions. A code sequence configured to facilitate DTIO in a DTIO enabled processor is identified by a computer. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A schedule associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified schedule that is configured to place the first instruction next to the second instruction. An object file is generated based on the modified schedule. The object file includes the first instruction placed next to the second instruction. The object file is emitted.
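The schedule modification can be sketched as a single reordering pass. The representation of a schedule as a list of instruction names, the `must_precede` dependency map, and the name `place_adjacent` are assumptions for illustration; the abstract does not describe how the modified schedule is computed.

```python
from typing import Dict, List, Set


def place_adjacent(schedule: List[str], first: str, second: str,
                   must_precede: Dict[str, Set[str]]) -> List[str]:
    """Hoist `second` to sit directly after `first` when that is legal.

    must_precede[x] is the set of instructions that have to execute
    before x (assumed to cover every ordering constraint).
    """
    i, j = schedule.index(first), schedule.index(second)
    if i >= j:
        raise ValueError("first must already come before second")
    between = schedule[i + 1:j]
    # Hoisting is legal only if `second` waits on nothing it would jump over.
    blockers = must_precede.get(second, set())
    if any(instr in blockers for instr in between):
        return list(schedule)
    return schedule[:i + 1] + [second] + between + schedule[j + 1:]


schedule = ["addis", "mul", "add", "ld"]
must_precede = {"ld": {"addis"}, "add": {"mul"}}
modified = place_adjacent(schedule, "addis", "ld", must_precede)
```

Here `ld` depends only on `addis`, so it can move up past `mul` and `add`, leaving the dependent pair next to each other in the emitted order.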
Abstract:
Compiling code for an enhanced application binary interface (ABI) including identifying, by a computer, a code sequence configured to perform a variable address reference table function including an access to a variable at an offset outside of a location in a variable address reference table. The code sequence includes an internal representation (IR) of a first instruction and an IR of a second instruction. The second instruction is dependent on the first instruction. A scheduler cost function associated with at least one of the IR of the first instruction and the IR of the second instruction is modified. The modifying includes generating a modified scheduler cost function that is configured to place the first instruction next to the second instruction. An object file is generated responsive to the modified scheduler cost function. The object file includes the first instruction placed next to the second instruction. The object file is emitted.
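To illustrate how a modified cost function can steer a scheduler, the sketch below uses a toy list scheduler whose cost strongly favors issuing the dependent partner of the most recently scheduled instruction. The scheduler itself, the bonus value, and the names `list_schedule` and `fusion_pairs` are assumptions; the abstract only states that the cost function is modified so the two instructions end up adjacent.

```python
from typing import Dict, List, Set, Tuple


def list_schedule(instrs: List[str],
                  must_precede: Dict[str, Set[str]],
                  fusion_pairs: Set[Tuple[str, str]]) -> List[str]:
    """Greedy list scheduling with an adjacency-aware cost function.

    Assumes the dependency graph in `must_precede` is acyclic.
    """
    done: Set[str] = set()
    order: List[str] = []
    while len(order) < len(instrs):
        # An instruction is ready once everything it waits on has issued.
        ready = [x for x in instrs
                 if x not in done and must_precede.get(x, set()) <= done]
        last = order[-1] if order else None

        def cost(x: str) -> int:
            # Base cost: original program order.
            # Modification: a large bonus when x pairs with the last-issued
            # instruction, so the pair is emitted back to back.
            bonus = -100 if (last, x) in fusion_pairs else 0
            return bonus + instrs.index(x)

        nxt = min(ready, key=cost)
        order.append(nxt)
        done.add(nxt)
    return order


instrs = ["addis", "mul", "ld"]
must_precede = {"ld": {"addis"}}
plain = list_schedule(instrs, must_precede, set())
fused = list_schedule(instrs, must_precede, {("addis", "ld")})
```

With an empty `fusion_pairs` set the scheduler keeps program order; once the pair is registered, the bonus outweighs the base cost and `ld` is issued immediately after `addis`.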
Abstract:
A technique used during interprocedural compilation in which program objects are grouped together based on the weights of the connections between the objects and their costs. System-imposed constraints on memory size can be taken into account to avoid creating groupings that overload the system's capacity. The groupings can be distributed over memories located on different processors.