Abstract:
Usability of a cloud-based service is recovered from a system failure. A customer transaction associated with a customer experience is executed to simulate the customer experience in the cloud-based service. A failure associated with a subsystem of the cloud-based service is detected from an output of the customer transaction. A recovery action is determined to be associated with the failure. The recovery action is executed on the subsystem and monitored to determine a success status.
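The detect-and-recover loop described in this abstract can be sketched as follows. This is a minimal illustration only: the function name `run_synthetic_check`, the shape of the transaction output (a dictionary with `status` and `subsystem` keys), and the returned status strings are all assumptions, not details taken from the abstract.

```python
from typing import Callable, Dict, Optional

# Hypothetical type: a recovery action returns True when it succeeded.
RecoveryAction = Callable[[], bool]


def run_synthetic_check(
    transaction: Callable[[], Dict[str, str]],
    recovery_actions: Dict[str, RecoveryAction],
) -> Optional[str]:
    """Run one simulated customer transaction and try to recover on failure.

    Returns None when no failure is detected, otherwise one of
    "recovered", "recovery_failed", or "no_action".
    """
    # Execute the transaction that simulates the customer experience.
    output = transaction()
    if output.get("status") == "ok":
        return None

    # Detect which subsystem the failure is associated with (assumed field).
    subsystem = output.get("subsystem", "")

    # Determine the recovery action associated with the failure.
    action = recovery_actions.get(subsystem)
    if action is None:
        return "no_action"

    # Execute the action and monitor it to determine a success status.
    return "recovered" if action() else "recovery_failed"
```

A caller would typically run such a check on a schedule and escalate when the result is `"recovery_failed"` or `"no_action"`.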
Abstract:
A method for Byzantine fault-tolerant replication of data on a plurality of n servers includes performing, by a primary node (PN), a prepare procedure that includes computing a prepare message including a unique identifier and multicasting the prepare message to the replica nodes (REPN). The method further includes performing, by the PN, a commit procedure that includes receiving, from each of a portion of the REPN, a prepare message reply signature part and aggregating each of the prepare message reply signature parts to generate a prepare message reply aggregated signature, checking the validity of the prepare message reply aggregated signature, and upon determining that the prepare message reply aggregated signature is valid, computing a commit message including the prepare message reply aggregated signature and multicasting the commit message to the REPN. The method further includes transmitting, to the client, the commit message reply aggregated signature.
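The primary node's commit step can be illustrated with a toy sketch. Several things here are assumptions and not part of the abstract: HMAC tags stand in for signature parts, "aggregation" is simply a collection of the parts (a real system would use a multi-signature or threshold signature scheme), the primary is assumed to hold the replica keys, and the `quorum` parameter is hypothetical because the abstract only says "a portion" of the nodes reply.

```python
import hashlib
import hmac
from typing import Dict, Optional


def sign_part(key: bytes, message: bytes) -> bytes:
    """Stand-in for one replica's signature part (an HMAC tag, not a real signature)."""
    return hmac.new(key, message, hashlib.sha256).digest()


def make_prepare(uid: str, request: bytes) -> bytes:
    """Compute a prepare message that includes a unique identifier."""
    return b"PREPARE|" + uid.encode() + b"|" + request


def commit_if_valid(
    prepare: bytes,
    reply_parts: Dict[str, bytes],
    replica_keys: Dict[str, bytes],
    quorum: int,
) -> Optional[dict]:
    """Aggregate the reply parts, check them, and build a commit message.

    Returns None when fewer than `quorum` parts verify.
    """
    # Assumption: aggregation is modeled as collecting the parts.
    aggregated = dict(reply_parts)

    # Check validity: count the parts that verify against the prepare message.
    valid = sum(
        1
        for node, part in aggregated.items()
        if node in replica_keys
        and hmac.compare_digest(part, sign_part(replica_keys[node], prepare))
    )
    if valid < quorum:
        return None

    # The commit message carries the aggregated reply signature.
    return {"type": "COMMIT", "prepare": prepare, "aggregated_signature": aggregated}
```

The sketch covers only the check-then-commit decision; multicasting, replica-side handling, and the reply to the client are left out.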
Abstract:
Technologies for virtual multipath access include a computing device configured to sequester a recovery partition from a host partition while allowing the recovery partition to access one or more resources of the host partition such as host memory or data storage. A remote computing device determines whether the host partition is responsive. The recovery partition receives a request for host state data of the host partition from the remote computing device in response to a determination that the host partition is not responsive. The recovery partition retrieves the requested host state data using a host state index maintained by the host partition and transmits the requested host state data to the remote computing device. The host state index may identify the location of the requested host state data. The remote computing device may perform a recovery operation based on the received host state data. Other embodiments are described and claimed.
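As a rough illustration of the index lookup in this abstract, the sketch below treats host memory as a byte string and the host state index as a mapping from a state name to an (offset, length) pair. Both representations, and the function name `read_host_state`, are assumptions made for the example; the abstract says only that the index may identify the location of the requested data.

```python
from typing import Dict, Optional, Tuple


def read_host_state(
    host_memory: bytes,
    host_state_index: Dict[str, Tuple[int, int]],
    key: str,
) -> Optional[bytes]:
    """Return the requested host state data, or None if the index has no entry.

    The recovery partition consults the index maintained by the host
    partition, then reads the data at the location the index names.
    """
    location = host_state_index.get(key)
    if location is None:
        return None
    offset, length = location
    return host_memory[offset:offset + length]
```

In this model the recovery partition never needs the host partition to be responsive; it needs only read access to host memory and the index.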
Abstract:
A program is reliably rewritten within a short period of time to improve work efficiency. A controller includes a communication controller (2) having a communication area and a normal control controller (3) having a normal control area, and is provided in a vehicle. The controller stores an update program, which is transferred via an external communication means (N) from an external server (1) to the vehicle, in the communication controller (2). If it is determined, based on manipulation of a key switch (13) to a stop position, that updating can be performed, the controller transfers the update program stored in the communication controller (2) to the normal control controller (3) and performs rewriting.
Abstract:
Embodiments of the present invention disclose a cluster arbitration method and a multi-cluster cooperation system. The method in the embodiments of the present invention includes: detecting whether a fault has occurred in a first cluster group or a second cluster group, where the first cluster group includes one portion of a first cluster and one portion of a second cluster, and the second cluster group includes another portion of the first cluster and another portion of the second cluster, and the first cluster and the second cluster cooperate with each other; when detecting that a fault has occurred, determining, by the first cluster group and the second cluster group, respective preemption representatives, where both the preemption representative of the first cluster group and the preemption representative of the second cluster group perform the following steps: determining whether a fault has occurred in the respective cluster group, and if no fault has occurred in the respective cluster group, attempting to preempt an arbitration device, where a cluster group whose preemption representative has successfully preempted the arbitration device according to a preset arbitration mechanism survives. The present invention can reduce a probability of interruption of service access.
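The preemption step can be sketched in a few lines. This is a single-threaded illustration under stated assumptions: the class `ArbitrationDevice`, the function `arbitrate`, and the first-come-first-served rule are hypothetical stand-ins for the "preset arbitration mechanism", and dictionary order stands in for the order in which representatives reach the device.

```python
from typing import Dict, Optional


class ArbitrationDevice:
    """Hypothetical arbitration device: the first group to preempt it wins."""

    def __init__(self) -> None:
        self.owner: Optional[str] = None

    def try_preempt(self, group: str) -> bool:
        # Assumed mechanism: first come, first served.
        if self.owner is None:
            self.owner = group
        return self.owner == group


def arbitrate(group_healthy: Dict[str, bool], device: ArbitrationDevice) -> Optional[str]:
    """Return the surviving cluster group, or None if no group could preempt.

    Each group's preemption representative first checks its own group for
    a fault and attempts preemption only if the group is healthy.
    """
    for group, healthy in group_healthy.items():
        if healthy:
            device.try_preempt(group)
    return device.owner
```

A group whose representative finds a local fault never contends for the device, so the healthy group survives without a race.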
Abstract:
Systems, methods, and non-transitory computer-readable storage media for smart power clamping of a redundant power supply. A system configured according to this disclosure can measure, at a baseboard management controller, a system power consumption which indicates total power being delivered by a first power supply unit and a second power supply unit. The system can determine that the system power consumption exceeds a system power consumption capacity and, in response to the determination, communicate a power clamping signal to a processor, resulting in a reduced system power consumption. The system can further identify that the reduced system power consumption still exceeds the system power consumption capacity and initiate a hardware throttling of at least one of the first power supply unit and the second power supply unit.
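The two-stage escalation in this abstract can be sketched as a small control function. The callables `read_power_watts`, `send_clamp_signal`, and `throttle_psu` are hypothetical hooks introduced for the example, as are the returned state names; a real baseboard management controller would use its own interfaces.

```python
from typing import Callable


def clamp_power(
    read_power_watts: Callable[[], float],
    capacity_watts: float,
    send_clamp_signal: Callable[[], None],
    throttle_psu: Callable[[], None],
) -> str:
    """Apply power clamping, escalating to hardware throttling if needed.

    Returns "normal", "clamped", or "throttled".
    """
    # Measure total power delivered by both power supply units.
    if read_power_watts() <= capacity_watts:
        return "normal"

    # Over capacity: signal the processor to clamp its power draw.
    send_clamp_signal()

    # Measure again; if the reduced consumption fits, clamping was enough.
    if read_power_watts() <= capacity_watts:
        return "clamped"

    # Still over capacity: fall back to hardware throttling of a supply unit.
    throttle_psu()
    return "throttled"
```

Separating the software clamp from the hardware throttle keeps the more disruptive action as a last resort.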