Invention Grant
- Patent Title: Link failure detection in a parallel computer
- Application No.: US11832940
- Application Date: 2007-08-02
- Publication No.: US07831866B2
- Publication Date: 2010-11-09
- Inventor: Charles J. Archer, Michael A. Blocksome, Mark G. Megerian, Brian E. Smith
- Applicant: Charles J. Archer, Michael A. Blocksome, Mark G. Megerian, Brian E. Smith
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Biggers & Ohanian, LLP
- Agent: James R. Nock
- Main IPC: G06F11/00
- IPC: G06F11/00

Abstract:
Methods, apparatus, and products are disclosed for link failure detection in a parallel computer including compute nodes connected in a rectangular mesh network, each pair of adjacent compute nodes in the rectangular mesh network connected together using a pair of links, that includes: assigning each compute node to either a first group or a second group such that adjacent compute nodes in the rectangular mesh network are assigned to different groups; sending, by each of the compute nodes assigned to the first group, a first test message to each adjacent compute node assigned to the second group; determining, by each of the compute nodes assigned to the second group, whether the first test message was received from each adjacent compute node assigned to the first group; and notifying a user, by each of the compute nodes assigned to the second group, whether the first test message was received.
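
The steps in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the patented implementation: it assumes the two groups are chosen by the parity of each node's coordinate sum (a checkerboard coloring, which guarantees that adjacent nodes in a rectangular mesh land in different groups), and it models a failed link as a directed (sender, receiver) pair that silently drops messages. The names `assign_group`, `neighbors`, and `detect_failed_links` are hypothetical.

```python
from itertools import product


def assign_group(coord):
    """Assumed rule: even coordinate sum -> first group (0), odd -> second group (1)."""
    return sum(coord) % 2


def neighbors(coord, dims):
    """Yield the adjacent nodes of coord in a rectangular mesh (no wraparound)."""
    for axis in range(len(dims)):
        for step in (-1, 1):
            candidate = list(coord)
            candidate[axis] += step
            if 0 <= candidate[axis] < dims[axis]:
                yield tuple(candidate)


def detect_failed_links(dims, broken_links):
    """Simulate one test round from the first group to the second group.

    broken_links is a set of directed (sender, receiver) pairs that drop
    messages. Returns, for each second-group node, the list of adjacent
    first-group nodes whose test message never arrived.
    """
    nodes = list(product(*(range(d) for d in dims)))

    # Each first-group node sends a test message to every adjacent node.
    received = set()
    for node in nodes:
        if assign_group(node) == 0:
            for nb in neighbors(node, dims):
                if (node, nb) not in broken_links:
                    received.add((node, nb))

    # Each second-group node checks which expected messages are missing;
    # the returned report stands in for "notifying a user".
    report = {}
    for node in nodes:
        if assign_group(node) == 1:
            report[node] = [nb for nb in neighbors(node, dims)
                            if (nb, node) not in received]
    return report


if __name__ == "__main__":
    # 3x3 mesh with one failed link from node (0, 0) to node (0, 1).
    result = detect_failed_links((3, 3), {((0, 0), (0, 1))})
    for node, missing in sorted(result.items()):
        print(node, "missing from:", missing)
```

Because each pair of adjacent nodes is connected by a pair of links, this round exercises only the links carrying traffic from the first group to the second; the abstract does not describe the reverse direction, so the sketch leaves it out.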
Public/Granted literature
- US20090037773A1: Link Failure Detection in a Parallel Computer (Public/Granted day: 2009-02-05)