Systems and methods to identify production incidents and provide automated preventive and corrective measures
Abstract:
Various methods, apparatuses/systems, and media for identifying production incidents and implementing automated preventive and corrective measures are disclosed. A processor automatically triggers a self-healing service in response to an incident generated by a job, process, or host failure. The processor identifies the application to which the generated event belongs by accessing a database that stores application and host details; fetches the functional identification (ID) of the application from the database; identifies the type of job failure or service degradation; automatically executes, by utilizing predefined microservices, the steps required for mitigation; records, in response to the executing, the outcome of the mitigation in the database along with the output at each stage of execution; evaluates the outcome of the mitigation by executing health checks using microservices to determine whether the failed job, process, or host is healthy; and closes the incident based on a determination that it is healthy.
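The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the in-memory `APP_DB` and `HOST_STATE` dictionaries stand in for the database and the monitored hosts, and every name (`Incident`, `restart_job`, `health_check`, `MITIGATIONS`, `self_heal`, the host and functional ID values) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Incident:
    host: str
    failure_type: str          # e.g. "job_failure" or "service_degradation"
    status: str = "open"
    log: List[str] = field(default_factory=list)  # output of each stage


# Hypothetical stand-in for the database of application and host details.
APP_DB: Dict[str, Dict[str, str]] = {
    "host-01": {"application": "billing", "functional_id": "svc_billing"},
}

# Hypothetical simulated host state, used instead of real infrastructure.
HOST_STATE: Dict[str, str] = {"host-01": "failed"}


def restart_job(host: str, functional_id: str) -> str:
    """Assumed mitigation microservice: restart the failed job."""
    HOST_STATE[host] = "running"
    return f"restarted job on {host} as {functional_id}"


def health_check(host: str) -> bool:
    """Assumed health-check microservice."""
    return HOST_STATE.get(host) == "running"


# Predefined mitigation steps per failure type (assumed mapping).
MITIGATIONS: Dict[str, List[Callable[[str, str], str]]] = {
    "job_failure": [restart_job],
}


def self_heal(incident: Incident) -> Incident:
    # Identify the application and fetch its functional ID.
    details = APP_DB[incident.host]
    functional_id = details["functional_id"]

    # Execute the mitigation steps and record the output of each stage.
    for step in MITIGATIONS.get(incident.failure_type, []):
        incident.log.append(step(incident.host, functional_id))

    # Evaluate the outcome with a health check; close only if healthy.
    healthy = health_check(incident.host)
    incident.log.append(f"health check: {'healthy' if healthy else 'unhealthy'}")
    if healthy:
        incident.status = "closed"
    return incident
```

In this sketch an incident whose failure type has no predefined mitigation stays open, which leaves it available for manual handling; the abstract does not say how such cases are treated.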