Abstract:
A data switch, a method and a computer program are provided for the transfer of data between multiple bus units in a memory system. Each bus unit is connected to a corresponding data ramp. Each data ramp is only directly connected to the adjacent data ramps. This forms at least one data ring that enables the transfer of data from each bus unit to any other bus unit in the memory system. A central arbiter manages the transfer of data between the data ramps and the transfer of data between the data ramp and its corresponding bus unit. A preferred embodiment contains four data rings, wherein two data rings transfer data clockwise and two data rings transfer data counter-clockwise.
Abstract:
A data switch, a method and a computer program are provided for the transfer of data between multiple bus units in a memory system. Each bus unit is connected to a corresponding data ramp. Each data ramp is only directly connected to the adjacent data ramps. This forms at least one data ring that enables the transfer of data from each bus unit to any other bus unit in the memory system. A central arbiter manages the transfer of data between the data ramps and the transfer of data between the data ramp and its corresponding bus unit. A preferred embodiment contains four data rings, wherein two data rings transfer data clockwise and two data rings transfer data counter-clockwise.
Abstract:
PROBLEM TO BE SOLVED: To reduce the consumption of internode bandwidth by communications maintaining coherence between accelerators and CPUs. SOLUTION: The CPUs and the accelerators may be clustered on separate nodes in a multiprocessing environment. Each node that contains a shared memory device may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, commands and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, the inter-chip bandwidth consumed for maintaining coherence may be reduced. COPYRIGHT: (C)2008,JPO&INPIT
Abstract:
PROBLEM TO BE SOLVED: To reduce consumption of inter-node bandwidth by communications maintaining coherence between accelerators and CPUs.SOLUTION: CPUs 210 and accelerators 220 may be clustered on separate nodes in a multiprocessing environment. Each node 0, 1 that contains a shared memory device 212, 222 may maintain a directory to track blocks of shared memory that may have been cached at other nodes. Therefore, command and addresses may be transmitted to processors and accelerators at other nodes only if a memory location has been cached outside of a node. Additionally, because accelerators generally do not access the same data as CPUs, only initial read, write, and synchronization operations may be transmitted to other nodes. Intermediate accesses to data may be performed non-coherently. As a result, inter-chip bandwidth consumed for maintaining coherence may be reduced.