Abstract:
The present disclosure relates to an apparatus and a method for collecting failure/error history lists to identify and categorize erring memory locations in random access memory of a computer system. Methods and apparatus consistent with the present disclosure may identify whether particular memory cells, rows of memory cells, or columns of memory cells within a memory device are associated with transient or persistent errors. These methods and apparatus may also avoid using portions of memory that have been associated with persistent errors or failures.
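As an illustration of the categorization described above, the following is a minimal sketch, not the disclosed implementation. It assumes the error history is a list of (row, column) locations, and that a cell counts as persistent once it has failed a hypothetical `persistent_threshold` number of times. The function names and the threshold are assumptions made for the example.

```python
from collections import Counter

def categorize_errors(history, persistent_threshold=3):
    """Classify logged error locations; history is an iterable of (row, col)."""
    cell_counts = Counter(history)
    # Count distinct failing cells per row and per column (assumed heuristic).
    row_counts = Counter(row for row, _ in cell_counts)
    col_counts = Counter(col for _, col in cell_counts)

    report = {"persistent_cells": set(), "transient_cells": set()}
    for cell, n in cell_counts.items():
        key = "persistent_cells" if n >= persistent_threshold else "transient_cells"
        report[key].add(cell)
    report["suspect_rows"] = {r for r, n in row_counts.items() if n >= persistent_threshold}
    report["suspect_cols"] = {c for c, n in col_counts.items() if n >= persistent_threshold}
    return report

def should_avoid(cell, report):
    """True if the cell lies in memory associated with persistent errors."""
    row, col = cell
    return (cell in report["persistent_cells"]
            or row in report["suspect_rows"]
            or col in report["suspect_cols"])

history = [(1, 2)] * 3 + [(4, 4)] + [(7, 0), (7, 1), (7, 2)]
report = categorize_errors(history)
```

Here cell (1, 2) fails repeatedly and is treated as persistent, cell (4, 4) fails once and is treated as transient, and row 7 is flagged because three distinct cells in it have failed.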
Abstract:
A data visualization system is capable of displaying large amounts of data in a parallel coordinates system. Large amounts of data are displayed in parallel coordinates by grouping data points into bins and representing the grouped data with fewer graphical elements. The fewer graphical elements simplify the graphical representation of the data while still providing information about the density or volume of data occupying a particular space. Bins are determined for each axis. The volume of connections between a pair of neighboring bins may be represented by modifying an aspect of the connection based on the volume.
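The binning idea can be sketched as follows. This is an illustrative example under stated assumptions: equal-width bins per axis, and line width chosen as the "aspect of the connection" that varies with volume. The names `bin_index`, `bin_connections`, and `line_width` are hypothetical.

```python
from collections import Counter

def bin_index(value, lo, hi, n_bins):
    """Map a value to an equal-width bin index in [0, n_bins - 1]."""
    if hi == lo:
        return 0
    i = int((value - lo) / (hi - lo) * n_bins)
    return min(max(i, 0), n_bins - 1)

def bin_connections(rows, n_bins=4):
    """Count connections between bins on each pair of neighboring axes."""
    n_axes = len(rows[0])
    ranges = [(min(r[a] for r in rows), max(r[a] for r in rows))
              for a in range(n_axes)]
    counts = [Counter() for _ in range(n_axes - 1)]
    for r in rows:
        bins = [bin_index(r[a], ranges[a][0], ranges[a][1], n_bins)
                for a in range(n_axes)]
        for a in range(n_axes - 1):
            counts[a][(bins[a], bins[a + 1])] += 1
    return counts

def line_width(count, max_count, min_w=0.5, max_w=6.0):
    """Scale the drawn width of one connection by its volume."""
    return min_w + (max_w - min_w) * count / max_count

rows = [(0.0, 10.0), (0.1, 10.0), (1.0, 0.0)]
counts = bin_connections(rows, n_bins=2)
```

With three data points and two bins per axis, the two similar points collapse into a single connection of volume 2, so only two graphical elements are needed instead of three.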
Abstract:
A center of rotation may be selected automatically for graphically displayed data based on what is determined to be of interest to the user, the current display of the data, and other parameters. For example, if a user has selected a portion of data, the center of rotation may be at the center of the selected data. If a user has positioned a cursor within a portion of displayed data, the center of rotation may be the center of the data portion including the cursor. If the data as a whole is approximately centered about the graphical coordinate origin, or within a threshold of the origin, the data may be rotated about the origin. If the data as a whole is approximately centered at least a certain distance away from the graphical coordinate origin, the data may be rotated about the center of the data as a whole.
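The selection rules above can be sketched as a short decision function. This is a minimal sketch that assumes 3D points, uses the centroid as the "center" of a set of points, and checks the selection before the cursor; the abstract specifies neither the centering method nor the priority order. The names `pick_center`, `cursor_group`, and `threshold` are hypothetical.

```python
def centroid(points):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def pick_center(points, selected=None, cursor_group=None, threshold=1.0):
    """Choose a center of rotation following the rules in the abstract."""
    if selected:                      # user selected a portion of the data
        return centroid(selected)
    if cursor_group:                  # data portion that contains the cursor
        return centroid(cursor_group)
    center = centroid(points)
    distance = sum(x * x for x in center) ** 0.5
    if distance <= threshold:         # data roughly centered on the origin
        return (0.0, 0.0, 0.0)
    return center                     # data centered away from the origin

near = [(-1, 0, 0), (1, 0, 0)]
far = [(9, 0, 0), (11, 0, 0)]
```

Calling `pick_center(near)` yields the origin, while `pick_center(far)` yields the centroid of the data at (10.0, 0.0, 0.0).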
Abstract:
The present disclosure is directed to a configurable extension space for a computer server or node blade that can add data storage or other functionality to a computer system while minimizing disruption to computers in a data center when the functionality of the computer server or node blade is extended. Apparatus consistent with the present disclosure may include multiple electronic assemblies where a first assembly resides deep within an enclosure to which an expansion module may be attached in an accessible expansion space.
Abstract:
A system and method provide a communications link having a plurality of lanes, and an in-band, real-time physical layer protocol that keeps all lanes on-line, while failing lanes are removed, for continuous service during fail over operations. Lane status is monitored in real time at the physical layer receiver, where link error rate, per lane error performance, and other channel metrics are known. If a lane failure is established, a single round trip request/acknowledge protocol exchange with the remote port completes the fail over. If a failing lane meets an acceptable performance level, it remains on-line during the round trip exchange, resulting in uninterrupted link service. Lanes may be brought in or out of service to meet reliability, availability, and power consumption goals.
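The request/acknowledge fail over can be illustrated with a toy model. This sketch is not the disclosed protocol: the `Port` class, the `fail_threshold` value, and modeling the round trip as a direct method call are all assumptions made for the example.

```python
class Port:
    """Toy model of one end of a multi-lane link (hypothetical)."""

    def __init__(self, lanes):
        self.active = set(lanes)
        self.peer = None

    def handle_request(self, lane):
        """Remote side: stop using the lane and acknowledge."""
        self.active.discard(lane)
        return True

    def check_lanes(self, error_rates, fail_threshold=1e-6):
        """Receiver-side check: retire lanes whose error rate is too high."""
        removed = []
        for lane in sorted(self.active):
            failing = error_rates.get(lane, 0.0) > fail_threshold
            if failing and len(self.active) > 1:
                # The lane stays in service until the peer acknowledges,
                # so the link itself is never taken off-line.
                if self.peer.handle_request(lane):
                    self.active.discard(lane)
                    removed.append(lane)
        return removed

a, b = Port(range(4)), Port(range(4))
a.peer, b.peer = b, a
removed = a.check_lanes({2: 1e-3})
```

After the single exchange, both ports carry traffic on lanes 0, 1, and 3 only.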
Abstract:
A computer system has a liquid cooling system with a main portion, a cold plate, and a closed fluid line extending between the main portion and the cold plate. The cold plate has an internal liquid chamber fluidly connected to the closed fluid line. The computer system also has a hot swappable computing module that is removably connectable with the cold plate. The cold plate and computing module are configured to maintain the closed fluid line between the main portion and the cold plate when the computing module is being connected to or removed from the cold plate.
Abstract:
The present technology provides a two-step process for providing a linearized dynamic storage pool. First, physical storage devices are abstracted. The physical storage devices used for the pool are divided into extents and grouped by storage class, and stripes are created from data chunks of similarly classified devices. Second, a virtual volume is provisioned from the pool and divided into virtual stripes. A volume map is created to map the virtual stripes with data to the physical stripes, linearly mapping the virtual layout to the physical capacity to maintain optimal performance.
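The two steps can be sketched as follows. This is an illustrative example only. It assumes a physical stripe takes one extent from each device of a class, and that physical stripes are handed out in order the first time a virtual stripe is written. The names `build_physical_stripes` and `VirtualVolume` are hypothetical.

```python
from collections import defaultdict

def build_physical_stripes(devices, extent_size):
    """Step 1: split devices into extents, group by class, form stripes.

    devices is a list of (name, storage_class, capacity) tuples.
    """
    by_class = defaultdict(list)
    for name, storage_class, capacity in devices:
        extents = [(name, i) for i in range(capacity // extent_size)]
        by_class[storage_class].append(extents)
    stripes = []
    for storage_class, per_device in sorted(by_class.items()):
        # One stripe = one extent from each device of the same class.
        for extents in zip(*per_device):
            stripes.append((storage_class, extents))
    return stripes

class VirtualVolume:
    """Step 2: virtual stripes mapped to physical stripes on first write."""

    def __init__(self, n_virtual_stripes, physical_stripes):
        self.n = n_virtual_stripes
        self.physical = list(physical_stripes)
        self.next_free = 0
        self.volume_map = {}

    def write(self, vstripe):
        if not 0 <= vstripe < self.n:
            raise IndexError("virtual stripe out of range")
        if vstripe not in self.volume_map:
            if self.next_free >= len(self.physical):
                raise RuntimeError("storage pool exhausted")
            # Linear allocation: the next physical stripe in order.
            self.volume_map[vstripe] = self.physical[self.next_free]
            self.next_free += 1
        return self.volume_map[vstripe]

devices = [("ssd0", "ssd", 4), ("ssd1", "ssd", 4), ("hdd0", "hdd", 2)]
stripes = build_physical_stripes(devices, extent_size=2)
vol = VirtualVolume(10, stripes)
first = vol.write(7)
second = vol.write(2)
```

Only virtual stripes that hold data consume physical stripes, so the virtual volume can be larger than the physical capacity behind it.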
Abstract:
A high performance computing (HPC) system includes computing blades having a first region that includes computing circuit boards having processors for performing a computation, and a second region that includes non-volatile memory for use in performing the computation. The regions are connected by a plurality of power connectors that convey power from the computing circuit boards to the memory, and a plurality of data connectors that convey data between the first and second regions. The power and data connectors are configured redundantly so that failure of a computing circuit board, a power connector, or a data connector does not interrupt the computation. A method of performing such a computation, and a computer program product implementing the method, are also disclosed.
Abstract:
A high performance computing system includes one or more blade enclosures configured to hold a plurality of computing blades, a connection interface, coupled to the one or more blade enclosures, having one or more connectors and a shared power bus that distributes power to the one or more blade enclosures, and at least one power shelf removably coupled to the one or more connectors and configured to hold one or more power supplies. The system may further include the computing blades and the power supplies. The power shelf may include a power distribution board configured to connect the power supplies together on the shared power bus.
Abstract:
The present disclosure is directed to monitoring power devices in a data center. The present disclosure describes systems, methods, and non-transitory computer readable storage media that provide increasing amounts of power monitoring and ultimately comprehensive power monitoring, power control, power failure forecasting, power event alerts, and power data collection, and that manage power corrective actions. Systems, methods, and non-transitory computer readable storage media of the present disclosure may also gather intelligence by analyzing data trends over time such that design weaknesses can be identified and addressed in next generation data center computing and power distribution system designs.