Abstract:
PROBLEM TO BE SOLVED: To reduce the consumption of internode bandwidth by communications maintaining coherence between accelerators and CPUs. SOLUTION: The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced. COPYRIGHT: (C)2008,JPO&INPIT
Abstract:
PROBLEM TO BE SOLVED: To reduce consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs. SOLUTION: CPUs 210 and accelerators 220 may be clustered on separate nodes in a multiprocessing environment. Each node 0, 1 that contains a shared memory device 212, 222 may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, inter-chip bandwidth consumed for maintaining coherence may be reduced.
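The directory filtering described above can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` class, its method names, and the choice to send an invalidation on a write are all assumptions made for illustration. The abstract states only that commands and addresses are transmitted to other nodes when a block has been cached outside the node.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node that owns shared memory and keeps a directory."""
    node_id: int
    # Directory: block address -> ids of remote nodes that may cache the block.
    directory: dict = field(default_factory=dict)
    # Number of coherence messages sent over the inter-node link.
    messages_sent: int = 0

    def record_remote_read(self, block: int, remote_id: int) -> None:
        """A remote node fetched the block, so note that it may hold a copy."""
        self.directory.setdefault(block, set()).add(remote_id)

    def local_write(self, block: int) -> list:
        """Write a locally owned block; message only nodes that may cache it.

        Assumption: remote copies are invalidated, so the directory entry
        is cleared after the messages are sent.
        """
        sharers = sorted(self.directory.pop(block, set()))
        self.messages_sent += len(sharers)
        return sharers


node0 = Node(node_id=0)
node0.local_write(0x100)            # never cached remotely: no message
node0.record_remote_read(0x200, 1)  # node 1 now may cache block 0x200
node0.local_write(0x200)            # one message, sent to node 1 only
```

In this sketch a write to a block that no other node has cached generates no inter-node traffic, which is the bandwidth saving the abstract refers to.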
Abstract:
PROBLEM TO BE SOLVED: To monitor performance for every thread in a multi-thread processor by individually recording the 1st and 2nd events which are generated in response to the 1st and 2nd threads respectively. SOLUTION: A processor 10 includes a performance monitor 50 which supports independent performance monitoring for each of plural parallel threads which are supported by the processor 10 itself. The monitor 50 receives as its input the event occurrences which are generated by the operations of an IU 25, an FP 26, an FX 30, an SC 23, a BIU 12 and an L2 cache interface 58. Then a selected one of these many event occurrences is recorded into a software readable/writable PMC (performance monitor counter) that is included in the monitor 50.
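As a rough software analogy for the counter selection described above, the sketch below models a PMC that counts one selected event for one thread and can be read or written by software. The class name, the `read`/`write` methods, and the event labels `FP_OP` and `L2_MISS` are hypothetical and do not come from the abstract.

```python
class PerformanceMonitorCounter:
    """Hypothetical PMC that counts one selected event for one thread."""

    def __init__(self, thread_id: int, selected_event: str) -> None:
        self.thread_id = thread_id
        self.selected_event = selected_event
        self.value = 0

    def observe(self, thread_id: int, event: str) -> None:
        """Increment only when both the thread and the event match."""
        if thread_id == self.thread_id and event == self.selected_event:
            self.value += 1

    def read(self) -> int:
        """Software-readable access to the count."""
        return self.value

    def write(self, value: int) -> None:
        """Software-writable access, for example to reset or preload."""
        self.value = value


def run_monitor(counters, event_stream) -> None:
    """Feed every (thread_id, event) occurrence to every counter."""
    for thread_id, event in event_stream:
        for counter in counters:
            counter.observe(thread_id, event)


# Hypothetical event occurrences tagged with the thread that caused them.
stream = [(0, "FP_OP"), (1, "FP_OP"), (0, "L2_MISS"), (0, "FP_OP")]
pmc_thread0 = PerformanceMonitorCounter(thread_id=0, selected_event="FP_OP")
pmc_thread1 = PerformanceMonitorCounter(thread_id=1, selected_event="FP_OP")
run_monitor([pmc_thread0, pmc_thread1], stream)
```

Each counter ignores events from the other thread, so the two threads are monitored independently even though they share the same event sources.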
Abstract:
A system and method for performing computer processing operations in a data processing system (10) includes a multithreaded processor (100) and thread switch logic (400). The multithreaded processor is capable of switching between two or more threads of instructions which can be independently executed. Each thread has a corresponding state in a thread state register (440) depending on its execution status. The thread switch logic contains a thread switch control register (410) to store the conditions upon which a thread switch will occur. The thread switch logic has a time-out register (430) which forces a thread switch when execution of the active thread in the multithreaded processor exceeds a programmable period of time. The thread switch logic also has a forward progress count register (420) to prevent repetitive thread switching between threads in the multithreaded processor. The thread switch logic is also responsive to a software manager (460) capable of changing the priority of the different threads and thus superseding thread switch events.
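The interplay of the time-out and forward progress registers can be sketched in software as follows. This is a simplified model under stated assumptions, not the hardware design: the abstract does not say how forward progress is measured, so the sketch assumes a per-thread count of switch-outs since the thread last completed an instruction, and holds the thread in place once that count reaches a threshold. All names are hypothetical.

```python
class ThreadSwitchLogic:
    """Hypothetical model of time-out and forward-progress switch control."""

    def __init__(self, timeout_cycles: int, progress_threshold: int,
                 num_threads: int = 2) -> None:
        self.timeout_cycles = timeout_cycles          # time-out register
        self.progress_threshold = progress_threshold  # forward progress limit
        self.num_threads = num_threads
        self.active = 0         # index of the active thread
        self.cycles_active = 0  # cycles since the last switch
        # Assumed meaning: switch-outs since each thread last made progress.
        self.no_progress_switches = [0] * num_threads

    def instruction_completed(self) -> None:
        """The active thread made progress, so clear its count."""
        self.no_progress_switches[self.active] = 0

    def tick(self, switch_event: bool = False) -> bool:
        """Advance one cycle; return True if a thread switch occurred."""
        self.cycles_active += 1
        if self.cycles_active >= self.timeout_cycles:
            self._switch()  # the time-out forces a switch
            return True
        if switch_event:
            if self.no_progress_switches[self.active] >= self.progress_threshold:
                return False  # suppress the switch until progress is made
            self.no_progress_switches[self.active] += 1
            self._switch()
            return True
        return False

    def _switch(self) -> None:
        self.active = (self.active + 1) % self.num_threads
        self.cycles_active = 0
```

With `timeout_cycles=3` and `progress_threshold=1`, two threads that each hit a switch event before completing an instruction switch once each, after which the next switch event is suppressed until `instruction_completed()` is called or the time-out expires.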
Abstract:
A method and system for performance monitoring within a multithreaded processor are provided. The system includes a processor responsive to instructions within first and second threads and a performance monitor that separately records a first event generated by the processor in response to the first thread and a second event generated by the processor in response to the second thread. In one embodiment, the processor has first and second modes of operation. In this embodiment, when the performance monitor is operating in the first mode, a first counter within the performance monitor increments in response to each occurrence of the first event and a second counter within the performance monitor increments in response to each occurrence of the second event. Alternatively, when the performance monitor is operating in the second mode, the first counter increments in response to each occurrence of the first event and in response to each occurrence of the second event.
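The two counting modes can be illustrated with a short sketch. The class and constant names are hypothetical; the sketch assumes that the first event is identified by thread index 0 and the second event by thread index 1.

```python
class PerformanceMonitor:
    """Hypothetical two-counter monitor with the two modes described above."""

    PER_THREAD = 1  # first mode: each thread's event has its own counter
    AGGREGATE = 2   # second mode: the first counter counts both events

    def __init__(self, mode: int) -> None:
        self.mode = mode
        self.counters = [0, 0]  # [first counter, second counter]

    def record(self, thread_id: int) -> None:
        """Record one event occurrence caused by thread 0 or thread 1."""
        if self.mode == self.PER_THREAD:
            self.counters[thread_id] += 1
        else:
            self.counters[0] += 1


events = [0, 1, 1, 0, 1]  # thread that generated each event occurrence

per_thread = PerformanceMonitor(PerformanceMonitor.PER_THREAD)
aggregate = PerformanceMonitor(PerformanceMonitor.AGGREGATE)
for thread_id in events:
    per_thread.record(thread_id)
    aggregate.record(thread_id)
```

For the same event stream, the first mode yields separate per-thread counts, while the second mode accumulates every occurrence in the first counter.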
Abstract:
A computer system in which each of certain critical instructions, all performing multiple main storage accesses to shared data, has the appearance of executing the required main storage accesses atomically with respect to a predefined set or class of instructions. The instructions in each set, referred to as relatively atomic instructions, are grouped together based on the data structure or object class they affect. The computer system comprises (a) shared memory means (203); (b) a plurality of processors (201, 202, ..., n) coupled to said shared memory means, wherein each processor has an instruction set divided into a plurality of instruction classes; (c) means for constraining an instruction in one of said classes running on one of said plurality of processors, to run atomically relative to any instruction in said class running on any other of said plurality of processors in said system; (d) means for signalling (280, 281, 282) between said processors to indicate when an instruction in one of said classes is running and for providing an indication of which particular class the instruction is a member of; and (e) means for selectively delaying the operation of all other instructions in said particular class on every other processor in said system.
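A loose software analogy for relative atomicity is one lock per instruction class: an operation delays only other operations of the same class and leaves other classes free to run. The sketch below uses threads to stand in for processors. The hardware signalling means in the abstract are not modelled, and the class names `queue_ops` and `list_ops` are hypothetical.

```python
import threading


class InstructionClassGate:
    """Hypothetical gate: operations are atomic relative to their own class."""

    def __init__(self, class_names) -> None:
        # One lock per instruction class; different classes never contend.
        self._locks = {name: threading.Lock() for name in class_names}

    def execute(self, class_name: str, operation):
        """Run the operation while delaying others of the same class."""
        with self._locks[class_name]:
            return operation()


def run_demo(num_processors: int = 4, iterations: int = 1000) -> int:
    """Threads stand in for processors updating one shared queue length."""
    gate = InstructionClassGate(["queue_ops", "list_ops"])
    shared = {"queue_length": 0}

    def enqueue() -> None:
        length = shared["queue_length"]      # first storage access
        shared["queue_length"] = length + 1  # second storage access

    def processor() -> None:
        for _ in range(iterations):
            gate.execute("queue_ops", enqueue)

    threads = [threading.Thread(target=processor) for _ in range(num_processors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["queue_length"]
```

Because every `enqueue` runs under the `queue_ops` lock, its two storage accesses cannot interleave with another `enqueue`, so `run_demo()` returns `num_processors * iterations`. An operation in `list_ops` would not be delayed by it.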