-
Publication No.: CA2290289C
Publication Date: 2005-07-12
Application No.: CA2290289
Application Date: 1999-11-22
Applicant: IBM
Inventor: BLOCK TIMOTHY ROY , RABE RODNEY LEE
IPC: G06F15/177 , G06F11/00 , G06F11/07 , G06F11/30 , G06F15/16
Abstract: The preferred embodiment of the present invention provides a cluster node distress system and method that improves the reliability of a cluster. The cluster node distress system provides a cluster node distress signal when a node within the cluster is about to fail. This allows the cluster to better determine whether a non-communicating node has failed or has merely been partitioned from the cluster. The preferred cluster node distress system is embedded deeply into the operating system and provides a pre-built node distress signal that can be quickly sent to other nodes within the cluster when an imminent failure of any node is detected, improving the probability that the node distress signal will get out before the node totally fails. When the node distress signal is effectively sent to other nodes within the cluster, the cluster can accurately determine which node has failed and has not just been partitioned from the cluster. This allows the cluster to respond correctly, i.e., by assigning other nodes primary and backup responsibility, with less manual intervention and troubleshooting needed by administrators.
-
Publication No.: CA2290289A1
Publication Date: 2000-09-30
Application No.: CA2290289
Application Date: 1999-11-22
Applicant: IBM
Inventor: BLOCK TIMOTHY ROY , RABE RODNEY LEE
IPC: G06F15/177 , G06F11/00 , G06F11/07 , G06F11/30 , G06F15/16
Abstract: The preferred embodiment of the present invention provides a cluster node distress system and method that improves the reliability of a cluster. The cluster node distress system provides a cluster node distress signal when a node within the cluster is about to fail. This allows the cluster to better determine whether a non-communicating node has failed or has merely been partitioned from the cluster. The preferred cluster node distress system is embedded deeply into the operating system and provides a pre-built node distress signal that can be quickly sent to other nodes within the cluster when an imminent failure of any node is detected, improving the probability that the node distress signal will get out before the node totally fails. When the node distress signal is effectively sent to other nodes within the cluster, the cluster can accurately determine which node has failed and has not just been partitioned from the cluster. This allows the cluster to respond correctly, i.e., by assigning other nodes primary and backup responsibility, with less manual intervention and troubleshooting needed by administrators.
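Illustrative sketch (not taken from the patent): the idea in the abstract, that a distress signal built ahead of time lets peers tell a failed node from a merely partitioned one, might look roughly like the following. The names `NodeState`, `ClusterView`, `build_distress_signal`, the `DISTRESS:` message format, and the heartbeat-timeout hook are all assumptions made for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeState(Enum):
    ACTIVE = "active"
    FAILED = "failed"            # peer announced its own imminent failure
    UNREACHABLE = "unreachable"  # peer went silent; may only be partitioned


def build_distress_signal(node_id: str) -> bytes:
    """Build the message once, in advance, so nothing has to be
    assembled while the node is already failing (hypothetical format)."""
    return f"DISTRESS:{node_id}".encode("ascii")


@dataclass
class ClusterView:
    """One node's view of its peers (hypothetical structure)."""
    states: dict = field(default_factory=dict)

    def add_peer(self, node_id: str) -> None:
        self.states[node_id] = NodeState.ACTIVE

    def on_message(self, payload: bytes) -> None:
        # A distress signal is an explicit statement that the sender is failing.
        prefix = b"DISTRESS:"
        if payload.startswith(prefix):
            node_id = payload[len(prefix):].decode("ascii")
            if node_id in self.states:
                self.states[node_id] = NodeState.FAILED

    def on_heartbeat_timeout(self, node_id: str) -> None:
        # Silence with no prior distress signal is ambiguous, so the peer
        # is marked UNREACHABLE rather than FAILED.
        if self.states.get(node_id) is NodeState.ACTIVE:
            self.states[node_id] = NodeState.UNREACHABLE


view = ClusterView()
view.add_peer("node-b")
view.add_peer("node-c")

signal_b = build_distress_signal("node-b")  # prepared ahead of time on node-b
view.on_message(signal_b)                   # sent just before node-b fails
view.on_heartbeat_timeout("node-b")         # later silence keeps FAILED
view.on_heartbeat_timeout("node-c")         # node-c went silent with no signal
```

In this sketch only node-b, which got its signal out, is treated as failed and is therefore a safe candidate for reassigning primary and backup responsibility; node-c stays in an ambiguous state that may still need other checks.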
-