Node failure source detection in distributed computing environments using machine learning
Abstract:
Sources of node failures in distributed computing environments can be determined using machine learning according to some aspects described herein. For example, prior to rebooting a node in a distributed computing environment, a computing system can execute a software agent to detect a failure with respect to the node. In response to detecting the failure, the computing system can input characteristics of the node into a trained machine learning model and receive, as output from the model, a source of the failure. The computing system can then automatically execute a recovery operation for the node based on that source.
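
The flow described in the abstract (detect a failure, classify its source with a trained model, then run a matching recovery operation) can be illustrated with a minimal Python sketch. Everything named here is a hypothetical stand-in and not taken from the patent: the `NodeCharacteristics` fields, the heartbeat threshold, the `StubFailureModel` rules that take the place of a trained model, and the recovery operation names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class NodeCharacteristics:
    # Hypothetical characteristics; the abstract does not list specific ones.
    missed_heartbeats: int
    disk_io_errors: int
    free_memory_mb: float


def agent_detects_failure(c: NodeCharacteristics) -> bool:
    # Assumed detection rule for the software agent: three or more
    # missed heartbeats count as a failure.
    return c.missed_heartbeats >= 3


class StubFailureModel:
    """Stand-in for a trained machine learning model (assumption).

    A real system would load a trained classifier; fixed rules are used
    here only so the sketch runs without external dependencies.
    """

    def predict(self, c: NodeCharacteristics) -> str:
        if c.disk_io_errors > 0:
            return "disk"
        if c.free_memory_mb < 100:
            return "memory"
        return "network"


# Hypothetical mapping from failure source to recovery operation.
RECOVERY_OPERATIONS: Dict[str, Callable[[str], str]] = {
    "disk": lambda node_id: f"remount-storage:{node_id}",
    "memory": lambda node_id: f"restart-services:{node_id}",
    "network": lambda node_id: f"reset-interface:{node_id}",
}


def handle_node(node_id: str, c: NodeCharacteristics,
                model: StubFailureModel) -> Optional[str]:
    """Return the recovery operation chosen for the node, or None if healthy."""
    if not agent_detects_failure(c):
        return None
    source = model.predict(c)  # model output: the source of the failure
    return RECOVERY_OPERATIONS[source](node_id)
```

Under these assumptions, calling `handle_node("node-1", NodeCharacteristics(4, 2, 2048.0), StubFailureModel())` selects the storage recovery operation, while a node with no missed heartbeats is left alone. Keeping the source-to-recovery mapping in a separate table means the model can be retrained or replaced without touching the recovery logic.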