Thermal excursion detection in datacenter components
Abstract:
A fully out-of-band, automated process provides centralized, concurrent, in-system detection of CPU thermal excursion signatures at cluster level/hyper-scale, based on system event log (SEL) content. One chassis manager within a hyper-scale cluster acts as the master and initiates a cluster-level SEL collection and analysis operation. The master chassis manager issues a collect-and-analyze request, which is relayed to every chassis manager within the cluster. Each recipient chassis manager then propagates the request to each of the server blades hosted within its chassis. A Baseboard Management Controller (BMC) on each server in the chassis receives the cluster SEL read request from the chassis manager through out-of-band communication interfaces. The BMC returns the server's SEL data to the chassis manager, which analyzes the data received from all the servers in the chassis and then sends the results to the next manager in the hierarchy.
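The collect-and-analyze flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names (`SelEntry`, `Bmc`, `ChassisManager`), the `thermal_trip` event label, the `CPU` sensor-name prefix, and the per-server threshold are all hypothetical, and the out-of-band transport is replaced by direct method calls on in-memory objects.

```python
from dataclasses import dataclass


@dataclass
class SelEntry:
    """One simplified SEL record (hypothetical fields, not the IPMI layout)."""
    timestamp: int
    sensor: str
    event: str


@dataclass
class Bmc:
    """Stands in for a server's BMC; a real one is reached out-of-band."""
    server_id: str
    sel: list

    def read_sel(self):
        # Return a copy so the caller cannot mutate the stored log.
        return list(self.sel)


@dataclass
class ChassisManager:
    chassis_id: str
    bmcs: list

    def collect_and_analyze(self, threshold=2):
        """Read every blade's SEL and report blades with >= threshold CPU thermal events."""
        report = {}
        for bmc in self.bmcs:
            hits = sum(
                1
                for entry in bmc.read_sel()
                if entry.sensor.startswith("CPU") and entry.event == "thermal_trip"
            )
            if hits >= threshold:
                report[bmc.server_id] = hits
        return report


def cluster_collect_and_analyze(managers, threshold=2):
    """Master step: relay the request to every chassis manager and gather the results."""
    return {cm.chassis_id: cm.collect_and_analyze(threshold) for cm in managers}


# Simulated two-chassis cluster (all data invented for illustration).
cluster = [
    ChassisManager("chassis-0", [
        Bmc("blade-0", [SelEntry(100, "CPU0", "thermal_trip")]),
        Bmc("blade-1", [
            SelEntry(101, "CPU0", "thermal_trip"),
            SelEntry(102, "FAN3", "speed_low"),
            SelEntry(103, "CPU1", "thermal_trip"),
        ]),
    ]),
    ChassisManager("chassis-1", [Bmc("blade-2", [])]),
]

result = cluster_collect_and_analyze(cluster)
print(result)
```

In this sketch each chassis manager reduces its blades' logs to a small per-chassis report before passing it upward, so the master only aggregates summaries rather than raw SEL data, which mirrors the hierarchy described in the abstract.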