-
Publication No.: US11429466B2
Publication date: 2022-08-30
Application No.: US16900909
Application date: 2020-06-13
Applicant: Stratus Technologies Bermuda, Ltd.
Inventor: Charles J. Horvath, Lei Cao, Steven Michael Haid, John R. MacLeod, Angel L. Pagan, Nathaniel Horwitch Dailey, Wendy J. McNaughton, Stephen J. Wark
Abstract: A method and apparatus for performing fault tolerance in a fault-tolerant computer system comprising a primary node having a primary node processor and a secondary node having a secondary node processor, each node further comprising a respective memory and a respective checkpoint shim. Each of the primary and secondary nodes further comprises a respective non-virtual operating system (OS), the non-virtual OS comprising a respective network driver, storage driver, and checkpoint engine. The method comprises the steps of: acting upon a request from a client by the respective OS of the primary node and of the secondary node; comparing, by the network driver of the primary node, the results obtained by the OS of the primary node and the OS of the secondary node for similarity; and, if the comparison indicates similarity less than a predetermined amount, informing, by the primary node network driver, the primary node checkpoint engine to begin a checkpoint process.
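The compare-then-checkpoint step in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how similarity is measured, so the use of `difflib.SequenceMatcher` here is an assumption, and the names `similarity`, `PrimaryNetworkDriver`, `compare_results`, and `start_checkpoint` are hypothetical and do not come from the patent.

```python
from difflib import SequenceMatcher
from typing import Callable


def similarity(primary_result: bytes, secondary_result: bytes) -> float:
    """Return a score in [0.0, 1.0]; 1.0 means the two results are identical.

    Assumption: the abstract does not name a metric, so a generic
    sequence-similarity ratio stands in for it.
    """
    return SequenceMatcher(None, primary_result, secondary_result).ratio()


class PrimaryNetworkDriver:
    """Hypothetical stand-in for the primary node's network driver."""

    def __init__(self, threshold: float, start_checkpoint: Callable[[], None]):
        self.threshold = threshold                # the "predetermined amount"
        self.start_checkpoint = start_checkpoint  # hook into the checkpoint engine

    def compare_results(self, primary_result: bytes, secondary_result: bytes) -> bool:
        """Inform the checkpoint engine when similarity falls below the threshold.

        Returns True if a checkpoint was requested, False otherwise.
        """
        if similarity(primary_result, secondary_result) < self.threshold:
            self.start_checkpoint()
            return True
        return False
```

With `threshold=1.0`, any difference between the two results requests a checkpoint; a lower threshold tolerates some divergence before the nodes are resynchronized.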
-
Publication No.: US10360117B2
Publication date: 2019-07-23
Application No.: US15626374
Application date: 2017-06-19
Applicant: STRATUS TECHNOLOGIES BERMUDA LTD.
Inventor: Steven Michael Haid, Lei Cao, Aaron Tyrone Smith
Abstract: A checkpointing method in a network device fault-tolerant system using virtual machines. In one embodiment, the network device has an input port, an output port, an active virtual machine, and a standby virtual machine; a network application on the active virtual machine that manipulates data present on the input port and transmits the manipulated data from the output port; a checkpoint engine on the active virtual machine; and an interface agent, on the active virtual machine, having callable functions to move data from the input port to the output port. The method includes the steps of determining, by the checkpoint engine, that a checkpoint is required; requesting, by the checkpoint engine, that the interface agent quiesce itself; and returning, by the interface agent to the network application, an indicator that no packets are available regardless of whether or not packets are arriving at the input port.
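The quiesce behavior described in this abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the patented implementation: the class `InterfaceAgent` and its methods `packet_arrived`, `quiesce`, `resume`, and `poll` are hypothetical names, and `None` is assumed to be the "no packets available" indicator.

```python
from collections import deque
from typing import Deque, Optional


class InterfaceAgent:
    """Hypothetical interface agent sitting between the input port and the application."""

    def __init__(self) -> None:
        self._rx: Deque[bytes] = deque()  # packets that have arrived at the input port
        self._quiesced = False

    def packet_arrived(self, packet: bytes) -> None:
        """Called when a packet arrives; packets are buffered even while quiesced."""
        self._rx.append(packet)

    def quiesce(self) -> None:
        """Requested by the checkpoint engine before a checkpoint is taken."""
        self._quiesced = True

    def resume(self) -> None:
        """Assumed counterpart to quiesce(), called once the checkpoint completes."""
        self._quiesced = False

    def poll(self) -> Optional[bytes]:
        """Return the next packet, or None as the 'no packets available' indicator.

        While quiesced, None is returned even if packets are waiting, so the
        network application stops changing state during the checkpoint.
        """
        if self._quiesced or not self._rx:
            return None
        return self._rx.popleft()
```

Because the application only ever sees an empty input, it needs no checkpoint-specific logic of its own; buffered packets are delivered after `resume()`.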
-
Publication No.: US20170364425A1
Publication date: 2017-12-21
Application No.: US15626374
Application date: 2017-06-19
Applicant: STRATUS TECHNOLOGIES BERMUDA LTD.
Inventor: Steven Michael Haid, Lei Cao, Aaron Tyrone Smith
CPC classification number: G06F11/2028, G06F9/45558, G06F11/1407, G06F11/1466, G06F11/1484, G06F11/2097, G06F2009/45575, G06F2009/45579, G06F2009/45591, G06F2009/45595, G06F2201/815
Abstract: A checkpointing method in a network device fault-tolerant system using virtual machines. In one embodiment, the network device has an input port, an output port, an active virtual machine, and a standby virtual machine; a network application on the active virtual machine that manipulates data present on the input port and transmits the manipulated data from the output port; a checkpoint engine on the active virtual machine; and an interface agent, on the active virtual machine, having callable functions to move data from the input port to the output port. The method includes the steps of determining, by the checkpoint engine, that a checkpoint is required; requesting, by the checkpoint engine, that the interface agent quiesce itself; and returning, by the interface agent to the network application, an indicator that no packets are available regardless of whether or not packets are arriving at the input port.
-
Publication No.: US11641395B2
Publication date: 2023-05-02
Application No.: US16900910
Application date: 2020-06-13
Applicant: Stratus Technologies Bermuda, Ltd.
Inventor: Lei Cao
IPC: G06F15/16, G06F9/54, H04L29/06, H04L67/1095, H04L43/16, G06F9/46, G06Q10/0631
Abstract: In part, the disclosure relates to a method of regulating checkpointing in an active-active fault-tolerant system. The method includes receiving a request from a client through a network at a primary computer; copying, by the primary computer, the request from the client to a secondary computer; processing the request from the client, using the primary computer, to generate a primary computer result; processing the copy of the request from the client, using the secondary computer, to generate a secondary computer result; comparing the primary computer result and the secondary computer result to obtain a comparison metric; determining whether a minimum checkpoint interval has been met or exceeded; and, if the minimum checkpoint interval has not been met or exceeded, delaying initiation of a checkpoint process from the primary computer to the secondary computer.
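The interval check at the end of this abstract can be sketched as a small rate limiter. This is a minimal illustration only: `CheckpointRegulator`, `request_checkpoint`, and the `pending` flag are hypothetical names, and the sketch assumes that a checkpoint is requested whenever the comparison metric indicates the two results differ, which the abstract does not spell out.

```python
from typing import Optional


class CheckpointRegulator:
    """Hypothetical regulator that enforces a minimum interval between checkpoints."""

    def __init__(self, min_interval_s: float) -> None:
        self.min_interval_s = min_interval_s
        self.last_checkpoint_at: Optional[float] = None  # time of the last checkpoint
        self.pending = False  # True while a requested checkpoint is being delayed

    def request_checkpoint(self, now: float) -> bool:
        """Return True if a checkpoint may start now, False if it must be delayed.

        `now` is a timestamp in seconds, passed in explicitly so the logic
        does not depend on a particular clock.
        """
        too_soon = (
            self.last_checkpoint_at is not None
            and now - self.last_checkpoint_at < self.min_interval_s
        )
        if too_soon:
            self.pending = True  # remember the request and retry later
            return False
        self.last_checkpoint_at = now
        self.pending = False
        return True
```

A caller would typically retry a pending request once the interval has elapsed, for example by calling `request_checkpoint` again from a timer.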
-
Publication No.: US20220066887A1
Publication date: 2022-03-03
Application No.: US17003808
Application date: 2020-08-26
Applicant: Stratus Technologies Bermuda, Ltd.
Inventor: Charles J. Horvath, Lei Cao
Abstract: In part, the disclosure relates to a real-time fault-tolerant system. The system may include a first computing device, a second computing device, and a hardware interconnect. The first computing device may include one or more memory devices, one or more processors, a first network interface operable to receive device data and transmit output data over a time-slot-based bus, wherein the output data is generated from processing the device data, and a first real-time checkpoint engine. The second computing device may include similar components or the same components as the first computing device. The hardware interconnect is operable to permit data exchange between the first computing device and the second computing device. Checkpoints may be generated by the checkpoint engines during lower-priority communication time slots allocated on the time-slot-based bus to avoid interfering with any real-time communications to or from the first and second computing devices.
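The slot-selection idea in this abstract can be illustrated with a short Python sketch. The two-level `SlotKind` enumeration and the function `checkpoint_slots` are hypothetical; the abstract only distinguishes real-time traffic from lower-priority time slots and does not describe a bus schedule format.

```python
from enum import Enum
from typing import List, Sequence


class SlotKind(Enum):
    """Assumed two-level classification of time slots in one bus cycle."""
    REAL_TIME = "real-time"
    LOW_PRIORITY = "low-priority"


def checkpoint_slots(schedule: Sequence[SlotKind]) -> List[int]:
    """Return the indices of slots in which checkpoint traffic may be sent.

    Only lower-priority slots are selected, so checkpoint data never
    competes with real-time communication on the bus.
    """
    return [i for i, kind in enumerate(schedule) if kind is SlotKind.LOW_PRIORITY]
```

For a cycle laid out as real-time, real-time, low-priority, real-time, low-priority, the function selects slots 2 and 4 for checkpoint transfer.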
-
Publication No.: US20210037092A1
Publication date: 2021-02-04
Application No.: US16900910
Application date: 2020-06-13
Applicant: Stratus Technologies Bermuda, Ltd.
Inventor: Lei Cao
Abstract: In part, the disclosure relates to a method of regulating checkpointing in an active-active fault-tolerant system. The method includes receiving a request from a client through a network at a primary computer; copying, by the primary computer, the request from the client to a secondary computer; processing the request from the client, using the primary computer, to generate a primary computer result; processing the copy of the request from the client, using the secondary computer, to generate a secondary computer result; comparing the primary computer result and the secondary computer result to obtain a comparison metric; determining whether a minimum checkpoint interval has been met or exceeded; and, if the minimum checkpoint interval has not been met or exceeded, delaying initiation of a checkpoint process from the primary computer to the secondary computer.
-
Publication No.: US20210034447A1
Publication date: 2021-02-04
Application No.: US16900909
Application date: 2020-06-13
Applicant: Stratus Technologies Bermuda, Ltd.
Inventor: Charles J. Horvath, Lei Cao, Steven Michael Haid, John R. MacLeod, Angel L. Pagan, Nathaniel Horwitch Dailey, Wendy J. McNaughton, Stephen J. Wark
IPC: G06F11/07, G06F11/14, G06F11/16, G06F11/30, G06F9/4401
Abstract: A method and apparatus for performing fault tolerance in a fault-tolerant computer system comprising a primary node having a primary node processor and a secondary node having a secondary node processor, each node further comprising a respective memory and a respective checkpoint shim. Each of the primary and secondary nodes further comprises a respective non-virtual operating system (OS), the non-virtual OS comprising a respective network driver, storage driver, and checkpoint engine. The method comprises the steps of: acting upon a request from a client by the respective OS of the primary node and of the secondary node; comparing, by the network driver of the primary node, the results obtained by the OS of the primary node and the OS of the secondary node for similarity; and, if the comparison indicates similarity less than a predetermined amount, informing, by the primary node network driver, the primary node checkpoint engine to begin a checkpoint process.