Abstract:
A method and apparatus for providing a memory model for hardware attributes to support transactional execution is herein described. Upon encountering a load of a hardware attribute, such as a test monitor operation to load a read monitor, write monitor, or buffering attribute, a fault is issued in response to a loss field indicating the hardware attribute has been lost. Furthermore, dependency actions, such as blocking and forwarding, are provided for the attribute access operations based on address dependency and access type dependency. As a result, different scenarios for attribute loss and testing thereof are allowed and restricted in a memory model.
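The abstract describes attribute-loss detection behaviorally; a minimal C++ sketch of that idea follows, assuming an invented MonitoredLine structure and a software fault in place of the hardware mechanism. It only illustrates a test-monitor load faulting when a loss field is set; it is not the patented apparatus.

```cpp
// Minimal sketch (not the patented hardware): a software model of a monitored
// cache line whose read/write/buffering attributes carry a loss flag. A
// test-monitor load faults when the loss field indicates the attribute was lost.
#include <stdexcept>
#include <iostream>

struct MonitoredLine {                 // hypothetical per-line attribute state
    bool read_monitor  = false;
    bool write_monitor = false;
    bool buffered      = false;
    bool loss          = false;        // set when a conflicting access clears monitors
};

// Models a test-monitor load: returns the attribute, or faults if it was lost.
bool test_read_monitor(const MonitoredLine& line) {
    if (line.loss)
        throw std::runtime_error("attribute-loss fault: transaction must abort");
    return line.read_monitor;
}

int main() {
    MonitoredLine line;
    line.read_monitor = true;
    std::cout << "monitor set: " << test_read_monitor(line) << "\n";
    line.loss = true;                  // e.g. a remote write invalidated the monitor
    try { test_read_monitor(line); }
    catch (const std::exception& e) { std::cout << e.what() << "\n"; }
}
```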
Abstract:
A dynamic performance profiler is operable to receive, in substantially real-time, raw performance data from a testing platform. A software-based image executes on a target hardware platform (e.g., either simulated or actual) on the testing platform, and the testing platform monitors such execution to generate corresponding raw performance data, which is communicated to a dynamic profiler, in substantially real-time, as it is generated during execution of the software-based image. The dynamic profiler may be configured to archive select portions of the received raw performance data to data storage. As the raw performance data is received, the dynamic profiler analyzes the data to determine whether the performance of the software-based image on the target hardware platform violates a predefined performance constraint. When the performance constraint is violated, the dynamic profiler archives a portion of the received raw performance data.
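A rough C++ sketch of the constraint-check-then-archive flow follows, assuming an invented latency-based constraint and a simple rolling window of samples; the sample format and window size are placeholders, not the patented format.

```cpp
// Illustrative sketch only, assuming a simple latency constraint: a profiler
// consumes raw samples as they arrive, checks them against a predefined limit,
// and archives a window of recent samples when the limit is violated.
#include <deque>
#include <vector>
#include <cstdint>
#include <iostream>

struct Sample { uint64_t cycle; uint32_t latency_ns; };   // hypothetical raw datum

class DynamicProfiler {
    std::deque<Sample>  recent_;                // rolling window of raw data
    std::vector<Sample> archive_;               // portion persisted on violation
    uint32_t            max_latency_ns_;
public:
    explicit DynamicProfiler(uint32_t limit) : max_latency_ns_(limit) {}

    void on_sample(const Sample& s) {           // called as data streams in
        recent_.push_back(s);
        if (recent_.size() > 1024) recent_.pop_front();
        if (s.latency_ns > max_latency_ns_)     // constraint violated:
            archive_.insert(archive_.end(), recent_.begin(), recent_.end());
    }
    size_t archived() const { return archive_.size(); }
};

int main() {
    DynamicProfiler prof(500);                  // 500 ns predefined constraint
    prof.on_sample({1, 120});
    prof.on_sample({2, 750});                   // violation triggers archiving
    std::cout << "archived samples: " << prof.archived() << "\n";
}
```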
Abstract:
An efficient, cycle-accurate processor execution simulator models a target processor by executing a program execution image comprising instructions having run-time dependencies resolved by execution on an existing processor compatible with the target processor. The instructions may have been executed upon a processor in an I/O environment too complex to model. In one embodiment, the simulator executes instructions that were directly executed on a processor. In another embodiment, a markup engine alters a compiled program image, with reference to instructions executed on a processor, to remove run-time dependencies. The marked up program image is then executed by the simulator. The processor execution simulator includes an update engine operative to cycle-accurately simulate instruction execution, and a communication engine operative to model each communication bus of the target processor.
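A structural C++ sketch of replaying a dependency-resolved image follows; the instruction record, latencies, and single modeled bus are invented for illustration and stand in for the update engine and communication engine described above.

```cpp
// Rough structural sketch, not the claimed simulator: a trace of instructions
// whose run-time dependencies (e.g. load values, branch outcomes) were resolved
// by a prior native run is replayed cycle by cycle. An "update engine" advances
// state after each instruction's latency; a "communication engine" models a bus.
#include <vector>
#include <cstdint>
#include <iostream>

struct ResolvedInsn {            // hypothetical marked-up instruction record
    uint32_t opcode;
    uint64_t resolved_operand;   // value captured from the native execution
    uint32_t latency_cycles;
};

struct Bus { uint32_t busy_until = 0; };   // one entry per modeled bus

void simulate(const std::vector<ResolvedInsn>& image) {
    uint32_t cycle = 0;
    Bus data_bus;
    for (const auto& insn : image) {
        // communication engine: stall while the bus is occupied
        if (cycle < data_bus.busy_until) cycle = data_bus.busy_until;
        // update engine: retire the instruction after its modeled latency
        cycle += insn.latency_cycles;
        data_bus.busy_until = cycle + 1;
        std::cout << "opcode " << insn.opcode << " retired at cycle " << cycle << "\n";
    }
}

int main() {
    simulate({{0x13, 42, 1}, {0x03, 7, 3}});   // two pre-resolved instructions
}
```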
Abstract:
The present invention relates to computer architecture, and more specifically to evaluating the performance of processors. A performance monitor may be placed in an L2 cache nest of a processor. The performance monitor may monitor L2 cache accesses and receive performance data from one or more processor cores over a bus coupling the processor cores with the L2 cache nest. In one embodiment, the bus may include additional lines for transferring performance data from the processor cores to the performance monitor.
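A simplified C++ model of the monitor's two inputs follows, assuming an invented four-core arrangement and per-core event words; the encodings are placeholders used only to show how L2-observed events and core-delivered data could be accumulated in one place.

```cpp
// Simplified sketch under assumed event encodings: a performance monitor that
// sits beside the L2 cache counts L2 accesses it observes directly and folds in
// per-core event words delivered over extra bus lines.
#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>

class L2NestMonitor {
    uint64_t l2_accesses_ = 0;
    std::array<uint64_t, 4> core_events_{};    // one counter per core (4 assumed)
public:
    void on_l2_access()                          { ++l2_accesses_; }
    void on_core_event(int core, uint64_t word)  { core_events_[core] += word; }
    void report() const {
        std::cout << "L2 accesses: " << l2_accesses_ << "\n";
        for (std::size_t c = 0; c < core_events_.size(); ++c)
            std::cout << "core " << c << " events: " << core_events_[c] << "\n";
    }
};

int main() {
    L2NestMonitor mon;
    mon.on_l2_access();
    mon.on_core_event(0, 3);   // e.g. 3 retired-instruction events from core 0
    mon.report();
}
```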
Abstract:
The invention relates to a method for power management of a microprocessor, characterized by the use of a workload prediction method to adapt the power consumption of said microprocessor to the workload during program execution. The method is characterized by measuring at least the microprocessor utilization between two branch instructions. This is achieved by using a utilization history table, similar to a branch history table, that profiles the microprocessor utilization during execution and is used on subsequent executions of the same code segment.
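A hedged C++ sketch of the table-based prediction follows; the keying by branch address, the [0,1] utilization values, and the three power levels are assumptions made for illustration rather than details taken from the patent.

```cpp
// Hedged sketch of the idea only: a small table keyed by branch address records
// the utilization measured between branches; when the same code segment is seen
// again, the stored value predicts the workload and selects a power level.
#include <unordered_map>
#include <cstdint>
#include <iostream>

class UtilizationHistoryTable {
    std::unordered_map<uint64_t, double> table_;   // branch PC -> utilization [0,1]
public:
    void record(uint64_t branch_pc, double utilization) {
        table_[branch_pc] = utilization;            // overwrite; could also average
    }
    // Predicts a power level for the segment that follows this branch.
    int predict_power_level(uint64_t branch_pc) const {
        auto it = table_.find(branch_pc);
        double u = (it == table_.end()) ? 1.0 : it->second;  // default: full power
        if (u > 0.75) return 3;        // high utilization -> highest power state
        if (u > 0.40) return 2;
        return 1;                      // low utilization -> reduced power
    }
};

int main() {
    UtilizationHistoryTable uht;
    uht.record(0x400123, 0.30);        // segment after this branch was mostly idle
    std::cout << "power level: " << uht.predict_power_level(0x400123) << "\n";
}
```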
Abstract:
A memory module includes a memory hub coupled to several memory devices. The memory hub includes at least one performance counter that tracks one or more system metrics, for example: page hit rate, number or percentage of prefetch hits, cache hit rate or percentage, read rate, number of read requests, write rate, number of write requests, rate or percentage of memory bus utilization, local hub request rate or number, and/or remote hub request rate or number.
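A short C++ sketch follows showing how raw hub counters of the kind listed above can yield the rate and percentage metrics; the counter names and the sample values are invented for the example.

```cpp
// Illustrative only, with invented counter names: a memory-hub model that keeps
// raw event counts and derives rates such as page-hit rate, prefetch-hit
// percentage, and bus utilization from them.
#include <cstdint>
#include <iostream>

struct HubCounters {
    uint64_t page_hits = 0, page_accesses = 0;
    uint64_t prefetch_hits = 0, prefetches = 0;
    uint64_t reads = 0, writes = 0;
    uint64_t bus_busy_cycles = 0, total_cycles = 0;

    static double pct(uint64_t num, uint64_t den) {
        return den ? 100.0 * double(num) / double(den) : 0.0;
    }
    void report() const {
        std::cout << "page hit rate:     " << pct(page_hits, page_accesses) << "%\n"
                  << "prefetch hit rate: " << pct(prefetch_hits, prefetches) << "%\n"
                  << "bus utilization:   " << pct(bus_busy_cycles, total_cycles) << "%\n"
                  << "reads/writes:      " << reads << "/" << writes << "\n";
    }
};

int main() {
    HubCounters c;
    c.page_hits = 80;        c.page_accesses = 100;
    c.prefetch_hits = 9;     c.prefetches = 10;
    c.reads = 500;           c.writes = 200;
    c.bus_busy_cycles = 650; c.total_cycles = 1000;
    c.report();
}
```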
Abstract:
A load balancer detects a server failure and sends a failure notification message to the remaining servers. In response, one or more of the remaining servers may autonomically adjust their configuration parameters, thereby allowing the remaining servers to better handle the increased load caused by the server failure. One or more of the servers may also include a performance measurement mechanism that measures performance before and after an autonomic adjustment of the configuration parameters to determine whether, and by how much, the autonomic adjustments improved system performance. In this manner, server computer systems may autonomically compensate for the failure of another server computer system that was sharing the workload.
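A small C++ sketch of the notify-adjust-measure loop follows; the tuning knobs (connection limit, worker threads) and the placeholder throughput metric are assumptions, not the configuration parameters the patent claims.

```cpp
// Sketch under assumed tuning knobs: on a failure notification, a surviving
// server raises its connection limits, then compares throughput measured before
// and after the adjustment to judge whether the autonomic change helped.
#include <iostream>

struct ServerConfig { int max_connections; int worker_threads; };

class Server {
    ServerConfig cfg_{100, 8};
public:
    // Placeholder measurement mechanism: a real server would sample live traffic.
    double measure_throughput() const { return cfg_.max_connections * 9.5; }

    void on_failure_notification() {
        double before = measure_throughput();
        cfg_.max_connections += 50;           // autonomic adjustment for the extra load
        cfg_.worker_threads  += 2;
        double after = measure_throughput();
        std::cout << "throughput before/after: " << before << " / " << after
                  << (after > before ? " (improved)\n" : " (no improvement)\n");
    }
};

int main() {
    Server s;
    s.on_failure_notification();              // triggered by the load balancer's message
}
```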
Abstract:
A system and method for metering usage of a data processing system and scaling system performance is disclosed. In one embodiment, an authorization key is purchased that specifies both a baseline performance level and a ceiling performance level. After the key is installed on the data processing system, the system performance level is monitored and averaged over predetermined time periods. The customer is charged on a “pay-as-you-go” basis for any time periods during which the average performance level exceeds the baseline performance level. Performance of the data processing system is not allowed to exceed the ceiling level obtained with the authorization key. In one embodiment, the baseline level may be set to zero so that all performance consumption is purchased by the customer as it is utilized. A report may be generated that includes data upon which analysis of the measured processor utilization data may be performed.
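A worked C++ sketch of the billing rule follows, using assumed abstract "performance units" and per-period averages: usage above the baseline is billable, and demand is clipped at the ceiling set by the key.

```cpp
// Worked sketch of the metering rule as described, with assumed units: the
// average utilization of each period is compared against the baseline from the
// authorization key; only the excess is billed, and demand is clipped at the ceiling.
#include <algorithm>
#include <vector>
#include <iostream>

struct AuthorizationKey { double baseline; double ceiling; };   // performance units

double billable_usage(const AuthorizationKey& key,
                      const std::vector<double>& period_averages) {
    double billable = 0.0;
    for (double avg : period_averages) {
        double capped = std::min(avg, key.ceiling);    // cannot exceed the ceiling
        if (capped > key.baseline)                     // pay-as-you-go above baseline
            billable += capped - key.baseline;
    }
    return billable;
}

int main() {
    AuthorizationKey key{ 2.0, 6.0 };                  // baseline 2, ceiling 6 units
    std::vector<double> periods{ 1.5, 3.0, 7.2 };      // 7.2 is clipped to 6.0
    std::cout << "billable units: " << billable_usage(key, periods) << "\n";  // 1 + 4 = 5
}
```

Setting the baseline to zero in this sketch reproduces the all-usage-is-purchased embodiment mentioned above.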
Abstract:
A method and apparatus for monitoring the performance characteristics of a multithreaded processor (10) executing instructions from two or more threads simultaneously. Event detectors detect the occurrence of specific processor events (20) during the execution of instructions from threads of a multithreaded processor. Specialized event select control registers (30) are programmed to control the selection, masking, and qualifying of events to be monitored. Events are qualified according to their thread ID and the thread's current privilege level (CPL). Each event that is qualified is counted by one of several programmable event counters (70) that keep track of all processor events being monitored. The contents of the event counters can then be accessed and sampled via a program instruction.
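A simplified C++ model of the qualification path follows; the register layout, mask widths, and event codes are invented for illustration and only show events being filtered by thread ID and CPL before a programmable counter increments.

```cpp
// Simplified model, not the hardware registers themselves: an event-select entry
// qualifies incoming events by thread ID and current privilege level before one
// of the programmable counters is incremented.
#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>

struct Event       { uint16_t code; uint8_t thread_id; uint8_t cpl; };
struct EventSelect {                       // models one event select control register
    uint16_t code;                         // which processor event to count
    uint8_t  thread_mask;                  // bit per thread that qualifies
    uint8_t  cpl_mask;                     // bit per privilege level that qualifies
};

class PerfCounters {
    std::array<EventSelect, 4> select_{};
    std::array<uint64_t, 4>    count_{};
public:
    void program(int i, EventSelect sel) { select_[i] = sel; }
    void on_event(const Event& e) {
        for (std::size_t i = 0; i < select_.size(); ++i)
            if (select_[i].code == e.code &&
                ((select_[i].thread_mask >> e.thread_id) & 1) &&
                ((select_[i].cpl_mask    >> e.cpl)       & 1))
                ++count_[i];
    }
    uint64_t read(int i) const { return count_[i]; }   // sampled via a program instruction
};

int main() {
    PerfCounters pmu;
    pmu.program(0, {0x24, 0b01, 0b1000});  // count event 0x24 for thread 0 at CPL 3
    pmu.on_event({0x24, 0, 3});            // qualifies
    pmu.on_event({0x24, 1, 3});            // wrong thread, ignored
    std::cout << "counter 0 = " << pmu.read(0) << "\n";
}
```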
Abstract:
A performance monitor system includes a core processor (115), a core processor associated device, such as a cache (123), and first logic, such as performance logic (127). The core processor (115) is operable to execute information. The core processor associated device provides a first signal (CACHE_PERF), which defines performance of the core processor associated device (123) during operation of the core processor (115). The first logic (127) is coupled to the core processor associated device (123) and monitors the first signal (CACHE_PERF) in response to a second signal (WPT0,1), which defines a match of user-settable attributes associated with the operation of the core processor (115).
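An interpretive C++ sketch of the gating relationship follows; the per-cycle sampling interface is an assumption, used only to show the performance logic counting CACHE_PERF events while the WPT match signal indicates the user-settable attributes are met.

```cpp
// Interpretive sketch only: first logic that counts cache performance pulses
// (CACHE_PERF) but only while a watchpoint-style match signal (WPT) indicates
// the user-settable attributes of the core's current operation are met.
#include <cstdint>
#include <iostream>

class PerfLogic {
    uint64_t count_ = 0;
public:
    // Called once per core clock with the sampled signal levels.
    void clock(bool cache_perf, bool wpt_match) {
        if (wpt_match && cache_perf)   // monitor CACHE_PERF gated by the match
            ++count_;
    }
    uint64_t value() const { return count_; }
};

int main() {
    PerfLogic logic;
    logic.clock(true,  false);   // event seen, but attributes do not match: ignored
    logic.clock(true,  true);    // counted
    logic.clock(false, true);    // match but no event: not counted
    std::cout << "gated cache events: " << logic.value() << "\n";
}
```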