Invention Grant
- Patent Title: Proactive cluster compute node migration at next checkpoint of cluster upon predicted node failure
- Application No.: US16022990
- Application Date: 2018-06-29
- Publication No.: US10776225B2
- Publication Date: 2020-09-15
- Inventor: Cong Xu, Naveen Muralimanohar, Harumi Kuno
- Applicant: Hewlett Packard Enterprise Development LP
- Applicant Address: US TX Houston
- Assignee: Hewlett Packard Enterprise Development LP
- Current Assignee: Hewlett Packard Enterprise Development LP
- Current Assignee Address: US TX Houston
- Agent: Michael A. Dryja
- Main IPC: G06F11/20
- IPC: G06F11/20; G06F11/14; G06F11/07; G06F11/00; G06F11/36; G06F9/48; G06F9/52; G06F9/54; G06F9/455; G06N20/00

Abstract:
While scheduled checkpoints are being taken of a cluster of active compute nodes distributively executing an application in parallel, a likelihood of failure of the active compute nodes is periodically and independently predicted. Responsive to the likelihood of failure of a given active compute node exceeding a threshold, the given active compute node is proactively migrated to a spare compute node of the cluster at a next scheduled checkpoint. Another spare compute node of the cluster can perform prediction and migration. Prediction can be based on both hardware events and software events regarding the active compute nodes.
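
The control flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Cluster` class, its `observe` and `checkpoint` methods, the 0.8 default threshold, and the node names are all hypothetical, and the failure likelihoods are assumed to come from some external predictor. The point it shows is the deferral: a node whose predicted likelihood exceeds the threshold is only flagged, and the move to a spare happens at the next scheduled checkpoint.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of active and spare compute nodes (all names are hypothetical)."""

    active: list[str]
    spares: list[str]
    threshold: float = 0.8  # assumed value; the abstract does not specify one
    pending: set[str] = field(default_factory=set)

    def observe(self, likelihoods: dict[str, float]) -> None:
        """Flag active nodes whose predicted failure likelihood exceeds the threshold.

        Nothing is migrated here; flagged nodes wait for the next checkpoint.
        """
        for node, likelihood in likelihoods.items():
            if node in self.active and likelihood > self.threshold:
                self.pending.add(node)

    def checkpoint(self) -> dict[str, str]:
        """Run at each scheduled checkpoint: move flagged nodes onto spares.

        Returns a mapping from each migrated node to the spare that replaced it.
        Flagged nodes stay pending if no spare is available.
        """
        moved: dict[str, str] = {}
        for node in sorted(self.pending):
            if not self.spares:
                break
            spare = self.spares.pop(0)
            # The spare takes the failing node's slot in the active set.
            self.active[self.active.index(node)] = spare
            moved[node] = spare
        self.pending.difference_update(moved)
        return moved
```

In this sketch, calling `observe` with a likelihood above the threshold for one node leaves `active` unchanged, and the subsequent `checkpoint` call swaps that node for the first available spare.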