Abstract:
PROBLEM TO BE SOLVED: To optimize execution of an application in a compiler. SOLUTION: In a computer-executed method, a plurality of code regions of an application are instrumented with annotations for generating profile data (S410); executing the application having the instrumented code regions generates profile data for each of the plurality of code regions (S420); a delinquent code region is identified on the basis of the profile data (S430); a plurality of code sub-regions of the delinquent code region are instrumented with annotations for generating profile data (S440); executing the application having the instrumented code sub-regions generates profile data (S450); a delinquent code sub-region is identified on the basis of the generated profile data (S460); and execution of the application is optimized by using the delinquent code sub-region (S470). COPYRIGHT: (C)2011,JPO&INPIT
Abstract:
Systems, methods and articles of manufacture are disclosed for optimizing execution of an application. A plurality of code regions of the application may be instrumented with annotations for generating profile data for each of the plurality of code regions. Profile data for each of the plurality of code regions may be generated via executing the application having instrumented code regions. A delinquent code region may be identified based on the generated profile data for each of the plurality of code regions. A plurality of code sub-regions of the identified delinquent code region may be instrumented with annotations for generating profile data for each of the plurality of code sub-regions. Profile data for each of the plurality of code sub-regions may be generated via executing the application having instrumented code sub-regions. A delinquent code sub-region may be identified based on the generated profile data for each of the plurality of code sub-regions. Execution of the application may be optimized using the identified delinquent code sub-region.
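The two-pass narrowing described above can be sketched as follows. This is a minimal illustration only, not the disclosed implementation: the event trace, the use of cache-miss counts as the profile metric, and the names TRACE, run_instrumented, find_delinquent and two_pass are all assumptions introduced here.

```python
# Hypothetical event trace: (region, sub_region, cache_misses).
# A real system would gather this from instrumented code at run time.
TRACE = [
    ("init", "init.alloc", 3),
    ("loop", "loop.load_a", 40),
    ("loop", "loop.load_b", 5),
    ("loop", "loop.compute", 1),
    ("report", "report.print", 2),
    ("loop", "loop.load_a", 35),
]


def run_instrumented(trace, key, watched):
    """Simulate one profiled run: sum misses for the watched labels only."""
    profile = {label: 0 for label in watched}
    for event in trace:
        label = key(event)
        if label in profile:
            profile[label] += event[2]
    return profile


def find_delinquent(profile):
    """Pick the label with the highest cost (assumed delinquency criterion)."""
    return max(profile, key=profile.get)


def two_pass(trace):
    # Pass 1: instrument every coarse region and profile the run.
    regions = sorted({region for region, _, _ in trace})
    hot_region = find_delinquent(run_instrumented(trace, lambda e: e[0], regions))
    # Pass 2: instrument only the sub-regions of the delinquent region.
    subs = sorted({sub for region, sub, _ in trace if region == hot_region})
    hot_sub = find_delinquent(run_instrumented(trace, lambda e: e[1], subs))
    return hot_region, hot_sub
```

Instrumenting sub-regions only inside the region that the first pass flags keeps the second run's profiling overhead confined to the code that matters.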
Abstract:
An illustrative embodiment provides a computer-implemented process for managing speculative assist threads for data pre-fetching that analyzes collected source code and cache profiling information to identify a code region containing a delinquent load instruction and generates an assist thread, including a value for a local version number, at a program entry point within the identified code region. Upon activation of the assist thread, the local version number of the assist thread is compared to the global unique version number of the main thread for the identified code region, and an iteration distance of the assist thread relative to the main thread is compared to a predefined value. The assist thread is executed when its local version number matches the global unique version number of the main thread and its iteration distance relative to the main thread is within a predefined range of values.
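The activation check described above can be sketched as a single predicate. This is a hedged illustration: the function name should_run_assist, the definition of distance as assist iteration minus main iteration, and the default bounds min_ahead and max_ahead are assumptions, since the abstract does not specify them.

```python
def should_run_assist(local_version, global_version,
                      assist_iter, main_iter,
                      min_ahead=1, max_ahead=8):
    """Return True when the assist thread may execute.

    Two conditions must hold:
    1. The assist thread's local version number matches the main thread's
       global version number for the code region.
    2. The iteration distance lies within [min_ahead, max_ahead].
       (The bounds and the sign convention are assumptions.)
    """
    if local_version != global_version:
        return False  # stale assist thread: the main thread has moved on
    distance = assist_iter - main_iter
    return min_ahead <= distance <= max_ahead
```

For example, should_run_assist(3, 3, 10, 6) accepts an assist thread four iterations ahead, whereas a version mismatch or a distance outside the range rejects it.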
Abstract:
An illustrative embodiment of a computer-implemented process for code versioning for enabling transactional memory region promotion receives a portion of candidate source code and outlines the received portion for parallel execution. The computer-implemented process further wraps a critical region with entry and exit routines to enter into a speculation sub-process, wherein the entry and exit routines also gather conflict statistics at run time. The outlined code portion is executed to determine which one of multiple loop versions to use, according to the conflict statistics gathered at run time.
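A minimal sketch of how entry and exit routines might gather conflict statistics that then drive the choice of loop version. The class ConflictStats, the version labels "speculative" and "serialized", and the thresholds max_rate and min_samples are assumptions for illustration; the abstract names none of them.

```python
class ConflictStats:
    """Counters updated by the (hypothetical) entry and exit routines."""

    def __init__(self):
        self.attempts = 0
        self.conflicts = 0

    def enter(self):
        # Entry routine: one more speculative attempt begins.
        self.attempts += 1

    def leave(self, conflicted):
        # Exit routine: record whether this attempt hit a conflict.
        if conflicted:
            self.conflicts += 1

    def conflict_rate(self):
        return self.conflicts / self.attempts if self.attempts else 0.0


def run_critical_region(stats, body, conflicted):
    """Wrap a critical region with the entry and exit routines.

    `conflicted` stands in for what real hardware or a runtime would report.
    """
    stats.enter()
    try:
        return body()
    finally:
        stats.leave(conflicted)


def choose_loop_version(stats, max_rate=0.25, min_samples=4):
    """Select a loop version from the gathered statistics (assumed policy)."""
    if stats.attempts < min_samples:
        return "speculative"  # too little data: keep speculating
    return "speculative" if stats.conflict_rate() <= max_rate else "serialized"
```

Requiring a minimum number of samples before switching versions avoids abandoning speculation on the strength of one early conflict.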
Abstract:
An illustrative embodiment provides a computer-implemented process for managing multiple speculative assist threads for data pre-fetching. An assist thread of a first processor sends a command to a second processor and a memory, wherein parameters of the command specify a processor identifier of the second processor. Responsive to receiving the command, the second processor replies indicating an ability to receive a cache line that is a target of a pre-fetch, and the memory replies indicating a capability to provide the cache line. Responsive to receiving the replies from the second processor and the memory, the first processor sends a combined response to the second processor and the memory, wherein the combined response indicates an action. Responsive to the action indicating that the transaction can continue, the memory sends the requested cache line to the second processor, into a target cache level on the second processor.
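The command, reply and combined-response exchange described above can be modelled as a small simulation. This is a sketch under stated assumptions: the classes Processor and Memory, the free_slots capacity model, and the action strings "continue" and "abort" are hypothetical and not taken from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    pid: int
    free_slots: int = 1                         # assumed capacity model
    caches: dict = field(default_factory=dict)  # cache level -> set of line addresses

    def can_accept(self):
        # Reply to the command: able to receive the cache line?
        return self.free_slots > 0

    def install(self, level, addr):
        self.caches.setdefault(level, set()).add(addr)
        self.free_slots -= 1


@dataclass
class Memory:
    lines: set

    def can_provide(self, addr):
        # Reply to the command: able to supply the cache line?
        return addr in self.lines


def remote_prefetch(processors, memory, target_pid, addr, level):
    """Issued on behalf of an assist thread running on a first processor."""
    target = processors[target_pid]        # the command names the second processor
    target_ok = target.can_accept()        # reply from the second processor
    memory_ok = memory.can_provide(addr)   # reply from the memory
    # Combined response formed by the first processor from both replies.
    action = "continue" if (target_ok and memory_ok) else "abort"
    if action == "continue":
        target.install(level, addr)        # memory delivers into the target cache level
    return action
```

In this model the line is installed only when both replies are positive, so a full target cache or an unavailable line ends the transaction without moving any data.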
Abstract:
A compiling program with cache utilization optimizations employs an interprocedural global analysis of the data access patterns of the compile units to be processed. The global analysis determines sufficient information to allow optimization techniques to be applied intelligently, enhancing the operation and utilization of the available cache systems on the target hardware.
Abstract:
An illustrative embodiment provides a computer-implemented process for may-constant propagation that obtains source code, and generates a set of associated data structures from the source code and a set of may-constant data structures. The computer-implemented process identifies candidate code for may-constant propagation to form an identified candidate code, updates the set of may-constant data structures, and selects an identified candidate code using information in the may-constant data structures, including probability, to form a selected candidate code. The computer-implemented process further identifies a code region associated with the selected candidate code to form an identified code region and modifies the identified code region including the selected candidate code.
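One way to picture may-constant propagation is as probability-driven specialization behind a run-time guard. The sketch below is an illustration under assumptions, not the disclosed process: the mapping format of may_constants, the threshold min_probability, and the helper names select_candidates, versioned, take_every and take_every_stride1 are all hypothetical.

```python
def select_candidates(may_constants, min_probability=0.9):
    """Keep variables that probably hold one value.

    may_constants maps a variable name to (likely_value, probability).
    """
    return {name: value
            for name, (value, prob) in may_constants.items()
            if prob >= min_probability}


def take_every(xs, stride):
    """General version: works for any positive stride."""
    return [xs[i] for i in range(0, len(xs), stride)]


def take_every_stride1(xs):
    """Version specialized for the likely value stride == 1."""
    return list(xs)


def versioned(general, fast, position, value):
    """Modify the code region: guard the fast path, fall back otherwise."""
    def wrapper(*args):
        if args[position] == value:
            rest = args[:position] + args[position + 1:]
            return fast(*rest)
        return general(*args)
    return wrapper
```

Because the value is only probably constant, the guard keeps the general version reachable, so the transformation stays correct when the prediction is wrong.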