ORCHESTRATION OF AUTOMATED VIRTUAL MACHINE FAILURE REPLACEMENT IN A NODE CLUSTER
Abstract:
The technology described herein is directed towards automating the replacement of a virtual machine when the hardware underlying the virtual machine fails, including in a cloud computing environment in which nodes in a cluster map to virtual machines being deployed within that cloud provider. An automated workflow to perform cluster self-healing is started upon detection of an unrecoverable instance failure of a virtual machine, e.g., because of underlying hardware failure. The failed virtual machine is terminated, and a new, replacement virtual machine that matches characteristics of the failed virtual machine is created to join the cluster. Data of the failed node is re-protected, such as by restoring data maintained with a protection scheme to remaining virtual machines of the cluster. When the data is re-protected and the replacement virtual machine has joined the cluster, the data is rebalanced across the cluster nodes, including to the new virtual machine.
Information query
Patent Agency Ranking
0/0