Abstract:
A server includes a tray that has a front portion and a back portion. A motherboard is disposed in the front portion of the tray and the motherboard is coupled to a heat sink. A fan is disposed in the back portion of the tray. A hard drive is disposed between the motherboard and the fan and the hard drive is operatively connected to the motherboard. The server also includes a heat pipe that has a body longitudinally bounded by an inlet and an outlet. The inlet is coupled to the heat sink, while the outlet is coupled to the fan. The body of the heat pipe extends past the hard drive. A power supply is also disposed in the tray and is operatively connected to the motherboard, the fan, and the hard drive.
Abstract:
A computer system with read/write access to storage devices creates a snapshot of a data volume at a point in time while continuing to accept access requests to the mirrored data volume by copying before making changes to the base data volume. Multiple snapshots may be made of the same data volume at different points in time. Only data that is not stored in a previous snapshot volume or in the base data volume are stored in the most recent snapshot volume.
Abstract:
An apparatus and method thermally manage a high performance computing system having a plurality of nodes with microprocessors. To that end, the apparatus and method monitor the temperature of at least one of a) the environment of the high performance computing system and b) at least a portion of the high performance computing system. In response, the apparatus and method control the processing speed of at least one of the microprocessors on at least one of the plurality of nodes as a function of at least one of the monitored temperatures.
Abstract:
Embodiments of the invention relate to a system and method for dynamically scheduling resources using policies to self-optimize resource workloads in a data center. The object of the invention is to allocate resources in the data center dynamically corresponding to a set of policies that are configured by an administrator. Operational parametrics that correlate to the cost of ownership of the data center are monitored and compared to the set of policies configured by the administrator. When the operational parametrics approach or exceed levels that correspond to the set of policies, workloads in the data center are adjusted with the goal of minimizing the cost of ownership of the data center. Such parametrics include yet are not limited to those that relate to resiliency, power balancing, power consumption, power management, error rate, maintenance, and performance.
Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes computing circuit boards having processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation. The regions are connected by a plurality of power connectors that convey power from the computing circuit boards to the memory, and a plurality of data connectors that convey data between the first and second regions. The power and data connectors are configured redundantly so that failure of a computing circuit board, a power connector, or a data connector does not interrupt the computation. A method of performing such a computation, and a computer program product implementing the method, are also disclosed.
Abstract:
Disclosed herein is a shared memory system that use a combination of SBR and MRRR techniques to calculate eigenpairs for dense matrices having very large numbers of rows and columns. The disclosed system allows for the use of a highly scalable tridiagonal eigensolver. The disclosed system likewise allows for allocating a different number of threads to each of the different computational stages of the eigensolver.
Abstract:
A multiple channel data transfer system (10) includes a source (12) that generates data packets with sequence numbers for transfer over multiple request channels (14). Data packets are transferred over the multiple request channels (14) through a network (16) to a destination (18). The destination (18) re-orders the data packets received over the multiple request channels (14) into a proper sequence in response to the sequence numbers to facilitate data processing. The destination (18) provides appropriate reply packets to the source (12) over multiple response channels (20) to control the flow of data packets from the source (12).
Abstract:
A server includes a tray that has a front portion and a back portion. A motherboard is disposed in the front portion of the tray and the motherboard is coupled to a heat sink. A fan is disposed in the back portion of the tray. A hard drive is disposed between the motherboard and the fan and the hard drive is operatively connected to the motherboard. The server also includes a heat pipe that has a body longitudinally bounded by an inlet and an outlet. The inlet is coupled to the heat sink, while the outlet is coupled to the fan. The body of the heat pipe extends past the hard drive. A power supply is also disposed in the tray and is operatively connected to the motherboard, the fan, and the hard drive.
Abstract:
A two part process is used for modifying records to be written and retrieved from tape devices. A record is appended with a cyclic redundancy check and a string of zeros. Submitting the entire record to tape drives which are logical block protection enabled will result in no change. For drives that are not LBP enabled, the string of zeros at the end of the record is removed. In addition to determining whether a drive is LBP compliant, a determination may be made as to whether a drive is a linear tape open drive from a particular manufacturer. Linear tape open drives may behave similarly as drives which may not be enabled with logical block protection.
Abstract:
Error data is read from error registers and written into a buffer. A computing node uses a BIOS to read the error data, rearm the error register and write the data into a memory mapped buffer. A hub chip supports creation of a shared memory system of computing nodes. A management controller in the computing node extracts error data from the buffer. The error data preferably consists essentially of the error register identifiers and the contents of the error registers. A system management node receives the error data from the management controllers in the computing nodes. The system management node may be coupled to but separate from the computing nodes.