Managing node failures in a computing environment
Abstract:
Node failures in a computing environment can be managed. For example, a computing device can determine a risk score for a node in the computing environment. The risk score can indicate a likelihood of the node failing. The computing device can also determine a risk-tolerance score for a job to be executed in the computing environment by analyzing job data associated with the job. The risk-tolerance score can indicate a susceptibility of the job to a failure of one or more nodes in the computing environment. The computing device can cause the job to be at least partially executed on the node based on the risk score for the node and the risk-tolerance score for the job.
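The placement step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say how either score is computed or how the two are compared. The names `Node`, `Job`, and `choose_node`, the 0.0 to 1.0 score range, the convention that a higher risk-tolerance score means the job can withstand riskier nodes, and the tie-breaking policy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    name: str
    # Assumed scale: 0.0 (failure very unlikely) to 1.0 (failure very likely).
    risk_score: float


@dataclass(frozen=True)
class Job:
    name: str
    # Assumed scale: 0.0 to 1.0, where a higher value means the job
    # can tolerate running on a riskier node.
    risk_tolerance: float


def choose_node(job: Job, nodes: Sequence[Node]) -> Optional[Node]:
    """Pick a node for the job using both scores (hypothetical policy).

    A node is eligible when its risk score does not exceed the job's
    risk-tolerance score. Among eligible nodes, the riskiest one is
    chosen so that safer nodes stay free for less tolerant jobs.
    Returns None when no node is eligible.
    """
    eligible = [n for n in nodes if n.risk_score <= job.risk_tolerance]
    if not eligible:
        return None
    return max(eligible, key=lambda n: n.risk_score)


if __name__ == "__main__":
    cluster = [Node("a", 0.1), Node("b", 0.5), Node("c", 0.9)]
    placement = choose_node(Job("batch-report", 0.6), cluster)
    print(placement.name if placement else "no eligible node")
```

Choosing the riskiest eligible node is only one possible policy. A scheduler could instead pick the safest eligible node, or split the job across several nodes, which would match the abstract's "at least partially executed" wording.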