Abstract:
In high performance computing, the potential compute power in a data center will scale to and beyond a billion-billion calculations per second ("Exascale" computing levels). Hierarchical memory architectures, in which data is temporarily stored in slower or less available memories, will increasingly prevent high performance computing systems from approaching their maximum potential processing capabilities. Furthermore, the time spent and the power consumed copying data into and out of a slower memory tier will increase the costs associated with high performance computing at an accelerating rate. New technologies will be required, such as the novel Zero Copy Architecture disclosed herein, in which each compute node writes locally for performance yet can quickly access data globally with low latency. The result is the ability to perform burst buffer operations and in situ analytics, visualization, and computational steering without the need for a data copy or movement.
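As a rough single-host analogue of the write-local, read-global idea, the sketch below lets each "node" write only to its own partition of one shared buffer while any node reads a peer's partition in place, with no copy. The node count, partition size, and use of POSIX shared memory are illustrative assumptions, not the disclosed architecture.

```python
# Single-host analogue of write-local / read-global access via POSIX shared
# memory. Names and the partitioning scheme are illustrative only.
from multiprocessing import shared_memory
import numpy as np

NODES, SLOT = 4, 1024  # hypothetical node count and per-node partition size

shm = shared_memory.SharedMemory(create=True, size=NODES * SLOT * 8)
pool = np.ndarray((NODES, SLOT), dtype=np.float64, buffer=shm.buf)

def node_write(rank, values):
    """Each 'node' writes only to its own partition (local write)."""
    pool[rank, :len(values)] = values

def node_read(rank, peer):
    """Any node reads a peer's partition in place -- a view, not a copy."""
    return pool[peer]

node_write(0, np.arange(8.0))
assert node_read(3, peer=0)[1] == 1.0  # node 3 sees node 0's data directly

shm.close()
shm.unlink()
```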
Abstract:
A scalable software stack is disclosed. In particular, the present disclosure provides a system and a method directed to allocating logical ownership of memory locations in a shared storage device among two or more associated compute devices that have access to the storage device. The logical ownership allocation can minimize potential conflicts between two simultaneous accesses to the same memory location of the storage device.
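One plausible ownership policy consistent with this description is to stripe fixed-size blocks of the shared device across the attached compute nodes, so that every memory location has exactly one logical owner and writes from non-owners must be forwarded rather than issued directly. The block size, node count, and modulo mapping below are assumptions for illustration only.

```python
# Sketch of striped logical ownership over a shared storage device.
BLOCK = 4096   # assumed block size in bytes
NODES = 8      # assumed number of attached compute devices

def owner_of(offset: int) -> int:
    """Return the node that logically owns the block containing `offset`."""
    return (offset // BLOCK) % NODES

def may_write(node: int, offset: int) -> bool:
    """A node writes directly only to blocks it owns; otherwise it forwards
    the request to the owner, so two writers never race on one block."""
    return owner_of(offset) == node

assert owner_of(0) == 0 and owner_of(BLOCK) == 1
assert may_write(1, BLOCK) and not may_write(2, BLOCK)
```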
Abstract:
A high performance computing system and method communicate data packets between computing nodes on a multi-lane communications link using a modified header bit encoding. Each data packet is provided with flow control information and error detection information, then divided into per-lane payloads. Sync header bits for each payload are added to the payloads in non-adjacent locations, thereby decreasing the probability that a single correlated burst error will invert both header bits. The encoded blocks that include the payload and the interspersed header bits are then simultaneously transmitted on the multiple lanes for reception, error detection, and reassembly by a receiving computing node.
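The sketch below illustrates the underlying idea at the bit level: placing the two complementary sync header bits at non-adjacent positions inside a 66-bit block, so that a short correlated burst cannot invert both. The chosen positions (0 and 33) are hypothetical; the abstract does not fix them.

```python
# Toy encoding with the two sync header bits spread apart in a 66-bit block.
H0_POS, H1_POS = 0, 33  # non-adjacent header-bit positions (assumption)

def encode(payload: int, data_block: bool) -> int:
    """Insert complementary sync bits (01 = data, 10 = control) at the
    spread positions; layout below assumes H0_POS == 0 and H1_POS == 33."""
    h0, h1 = (0, 1) if data_block else (1, 0)
    low = payload & ((1 << 32) - 1)   # first 32 payload bits, after h0
    high = payload >> 32              # last 32 payload bits, after h1
    return (h0 << H0_POS) | (low << 1) | (h1 << H1_POS) | (high << 34)

def check(block: int) -> bool:
    """Valid blocks carry complementary header bits; equal bits flag an error."""
    h0 = (block >> H0_POS) & 1
    h1 = (block >> H1_POS) & 1
    return h0 != h1

blk = encode(0xDEADBEEFCAFEF00D, data_block=True)
assert check(blk)
# A 2-bit adjacent burst at bit 0 corrupts h0 plus one payload bit, but it
# cannot reach h1, so the complementary check still detects the error.
assert not check(blk ^ 0b11)
```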
Abstract:
An apparatus and method of accessing data in a memory in a multi-node, high performance computing system have a requesting agent and a home agent. The requesting agent is a member of a first node of the high performance computing system, while the home agent is a member of a second node. The home agent forwards data in a specified memory of the second node toward the requesting agent across an unordered network and determines that a snoop request is to be sent to the requesting agent. Only after determining that the requesting agent has received the requested data does the home agent forward the snoop request to the requesting agent across the unordered network.
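A minimal sketch of the ordering rule follows: because the fabric is unordered, the home agent holds any snoop destined for the requesting agent until it learns that the earlier data message has arrived. The message tuples and agent classes are hypothetical; only the hold-until-data-received rule comes from the abstract.

```python
# Simulation of a home agent deferring snoops behind data on an unordered fabric.
class Fabric:
    def __init__(self):
        self.log = []
    def deliver(self, dst, msg):
        self.log.append((dst, msg))

class HomeAgent:
    def __init__(self):
        self.data_acked = set()    # requesters known to have received data
        self.pending_snoops = {}   # requester -> snoops held back

    def send_data(self, requester, fabric):
        fabric.deliver(requester, ("DATA",))

    def on_data_ack(self, requester, fabric):
        self.data_acked.add(requester)
        for snoop in self.pending_snoops.pop(requester, []):
            fabric.deliver(requester, snoop)   # now safe to forward

    def request_snoop(self, requester, fabric):
        if requester in self.data_acked:
            fabric.deliver(requester, ("SNOOP",))
        else:
            self.pending_snoops.setdefault(requester, []).append(("SNOOP",))

fabric, home = Fabric(), HomeAgent()
home.send_data("RA0", fabric)
home.request_snoop("RA0", fabric)            # held: data not yet acknowledged
assert fabric.log == [("RA0", ("DATA",))]
home.on_data_ack("RA0", fabric)              # ack releases the held snoop
assert fabric.log[-1] == ("RA0", ("SNOOP",))
```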
Abstract:
Disclosed herein is a shared memory system that uses a combination of successive band reduction (SBR) and multiple relatively robust representations (MRRR) techniques to calculate eigenpairs for dense matrices having very large numbers of rows and columns. The disclosed system allows for the use of a highly scalable tridiagonal eigensolver. The disclosed system likewise allows for allocating a different number of threads to each of the different computational stages of the eigensolver.
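A minimal SciPy sketch of the same two-stage shape: reduce the dense symmetric matrix to tridiagonal form, then solve the tridiagonal problem with LAPACK's MRRR driver. A single Householder reduction stands in here for the disclosure's multi-stage SBR band reduction, and the per-stage thread allocation is not shown.

```python
# Two-stage dense symmetric eigensolver: tridiagonalize, then MRRR.
import numpy as np
from scipy.linalg import hessenberg, eigh_tridiagonal

rng = np.random.default_rng(0)
A = rng.standard_normal((300, 300))
A = (A + A.T) / 2                      # dense symmetric input

T, Q = hessenberg(A, calc_q=True)      # symmetric Hessenberg form is tridiagonal
d = np.diag(T).copy()                  # diagonal of the tridiagonal matrix
e = np.diag(T, 1).copy()               # off-diagonal

w, V = eigh_tridiagonal(d, e, lapack_driver='stemr')  # MRRR tridiagonal solve
eigvecs = Q @ V                        # back-transform to eigenvectors of A

assert np.allclose(A @ eigvecs[:, 0], w[0] * eigvecs[:, 0], atol=1e-8)
```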
Abstract:
An adapter retrieves graph data from one or more graph databases and adapts the data to be shown through a visualization tool. The adapter may be used to convert multiple formats of graph data into a format that is readable and usable by the visualization tool. The adapter module may make a connection with a graph database and query the database for particular graph data. Once retrieved, the stream of graph data may be used to populate a Java-based template. From the template, the visualization tool may provide a visualization of the retrieved data.
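The sketch below mimics the adapter's flow under stated assumptions: raw edge tuples standing in for a database query result are normalized into a generic node/edge document, which then populates a trivial text template in place of the disclosure's Java-based template.

```python
# Adapter sketch: normalize graph query results and fill a display template.
import json

def adapt(raw_edges):
    """Convert (src, dst, label) tuples into a generic graph document."""
    nodes = sorted({v for s, d, _ in raw_edges for v in (s, d)})
    return {
        "nodes": [{"id": n} for n in nodes],
        "edges": [{"source": s, "target": d, "label": l} for s, d, l in raw_edges],
    }

def render(graph, template='{{"graph": {payload}}}'):
    """Populate a stand-in text template with the adapted graph data."""
    return template.format(payload=json.dumps(graph))

stream = [("a", "b", "cites"), ("b", "c", "cites")]  # pretend query results
print(render(adapt(stream)))
```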
Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation and a second computing processor for performing data movement and storage. Because data movement and storage are offloaded to this second processor, the processors for performing the computation are not interrupted to perform these tasks. A method for use in the HPC system receives instructions in the computing processors and first data in the memory. The method includes receiving second data into the memory while continuing to execute the instructions in the computing processors, without interruption. A computer program product implementing the method is also disclosed.
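A single-process analogue of this offloading, assuming a queue hand-off: a dedicated mover thread plays the role of the second region's processor and stages incoming data while the compute loop runs without interruption.

```python
# Offloading data movement to a helper thread so computation never stalls.
import threading
import queue

incoming = queue.Queue()
staged = []

def mover():
    """Stands in for the second processor: it alone handles data movement,
    so the compute loop takes no interrupts for these tasks."""
    while True:
        item = incoming.get()
        if item is None:
            break
        staged.append(item)           # e.g., staging into non-volatile memory

t = threading.Thread(target=mover, daemon=True)
t.start()

total = 0
for i in range(1_000_000):            # the "computation" proceeds continuously
    total += i
    if i % 250_000 == 0:
        incoming.put(f"block-{i}")    # hand off second data without stalling

incoming.put(None)                    # signal the mover to finish
t.join()
print(total, staged)
```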
Abstract:
A primary data storage system is connected with a separate and external active archive storage system to consolidate data and to allow active archive data to be managed based on primary storage system events. The primary data storage system may be managed and maintained by an external entity, and may include a manager module such as a resource manager. The active archive system may include several tiers of storage in a hierarchical storage system and logic for moving data between and among the tiers. As data processing milestones are completed or the state of data changes in projects stored in the primary data storage system, task milestone or state-change events are detected. Event detection can trigger data movement in the active archive system. One or more software modules implementing the present invention may detect the events and trigger active archive operations based on them.
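The sketch below shows one way such event-driven tiering could look: a milestone or state-change event moves a project's data one step along a hypothetical disk/object/tape hierarchy. The tier names and the event-to-step mapping are assumptions for illustration.

```python
# Event-driven tier movement across a hypothetical archive hierarchy.
TIERS = ["disk", "object", "tape"]        # assumed tiers, warm to cold
placement = {}                            # project -> current tier index

ON_EVENT = {"milestone_complete": +1,     # demote one tier colder
            "project_reopened": -1}       # promote one tier warmer

def handle_event(project: str, event: str) -> str:
    """Move a project's archive data one tier per detected event."""
    step = ON_EVENT.get(event, 0)
    idx = placement.get(project, 0) + step
    placement[project] = max(0, min(idx, len(TIERS) - 1))
    return TIERS[placement[project]]

assert handle_event("projA", "milestone_complete") == "object"
assert handle_event("projA", "milestone_complete") == "tape"
assert handle_event("projA", "project_reopened") == "object"
```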
Abstract:
A cooling system for a high performance computing system includes a closed-loop cooling cell having two compute racks and a cooling tower between the compute racks. Each compute rack includes one blade enclosure, and the cooling tower includes one water-cooled heat exchanger and one or more blowers configured to draw warm air from the back side of the compute racks, across the water-cooled heat exchanger, and to circulate cooled air to the front side of the compute racks. The cooling cell further includes a housing enclosing the compute racks and the cooling tower to provide a closed-loop air flow within the cooling cell. The cooling system further includes one or more cooling plates configured to be disposed between two computing boards within a computing blade, and a fluid connection coupled to the cooling plates and in fluid communication with the blade enclosure.
Abstract:
An apparatus and method thermally manage a high performance computing system having a plurality of nodes with microprocessors. To that end, the apparatus and method monitor the temperature of at least one of a) the environment of the high performance computing system and b) at least a portion of the high performance computing system. In response, the apparatus and method control the processing speed of at least one of the microprocessors on at least one of the plurality of nodes as a function of at least one of the monitored temperatures.
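One simple control rule consistent with this description is sketched below: derive a clock multiplier from whichever monitored temperature is hotter, the environment or the node itself. The thresholds and the linear ramp are illustrative assumptions; the abstract only requires that processing speed be some function of the monitored temperatures.

```python
# Toy monitor-and-throttle rule mapping temperatures to a clock multiplier.
def target_speed(ambient_c: float, node_c: float,
                 t_nominal: float = 70.0, t_max: float = 95.0) -> float:
    """Return a clock scale in [0.5, 1.0] from the worst-case temperature.
    Thresholds are hypothetical, not values from the disclosure."""
    hot = max(ambient_c, node_c)          # worst of environment and node
    if hot <= t_nominal:
        return 1.0                        # cool enough: full speed
    if hot >= t_max:
        return 0.5                        # at the limit: minimum speed
    frac = (hot - t_nominal) / (t_max - t_nominal)
    return 1.0 - 0.5 * frac               # linear ramp in between

assert target_speed(22.0, 60.0) == 1.0
assert target_speed(22.0, 95.0) == 0.5
assert 0.5 < target_speed(30.0, 82.5) < 1.0
```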