Systems and methods for applying checkpoints on a secondary computer in parallel with transmission

    Publication No.: US11288123B2

    Publication date: 2022-03-29

    Application No.: US16900913

    Filing date: 2020-06-13

    Abstract: The disclosure relates to a method of checkpointing. The method may include determining, by the primary computer, when to initiate a checkpoint operation; dividing, at the primary computer, checkpoint data into two or more groups, wherein each group includes one or more pages of memory; transmitting a first group to the secondary computer; upon receiving, by the secondary computer, the first group, correlating memory pages in the first group with pages in memory on the secondary computer; determining, at the secondary computer, which bytes of memory pages of the first group differ from the correlated pages stored in memory in the secondary computer; and applying data from the first group by swapping differences between the memory pages of the first group and the correlated memory pages stored in the secondary computer. At least some of these operations are performed in parallel during a subset of the overall checkpoint operation. Performing the memory management checkpoint operations simultaneously is advantageous in various fault tolerant systems. The differences may be N-byte differences, such as 8-byte differences.
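As a rough illustration of the N-byte difference step described in this abstract, the following Python sketch correlates incoming pages with resident pages by page number and copies over only the 8-byte words that differ. The names `diff_words` and `apply_group`, the dictionary-based memory model, and the choice to overwrite rather than exchange the differing words are assumptions made for illustration. The sketch runs sequentially and does not model the parallel transmission the abstract mentions.

```python
WORD = 8  # assumed N for the N-byte comparison


def diff_words(incoming: bytes, resident: bytes, word: int = WORD) -> list:
    """Return the byte offsets of the word-sized chunks where two pages differ."""
    if len(incoming) != len(resident):
        raise ValueError("pages must have the same size")
    return [
        off
        for off in range(0, len(incoming), word)
        if incoming[off:off + word] != resident[off:off + word]
    ]


def apply_group(group: dict, memory: dict, word: int = WORD) -> int:
    """Apply one received group to secondary memory.

    group  maps page number -> bytes received from the primary (hypothetical layout)
    memory maps page number -> bytearray resident on the secondary
    Returns the number of words that were rewritten.
    """
    changed = 0
    for page_no, incoming in group.items():
        resident = memory[page_no]  # correlate by page number
        for off in diff_words(incoming, bytes(resident), word):
            resident[off:off + word] = incoming[off:off + word]
            changed += 1
    return changed
```

Comparing whole words keeps the write traffic on the secondary proportional to what actually changed, which is the apparent motivation for the N-byte granularity.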

    Method and apparatus for performing checkpointing on a network device

    Publication No.: US10360117B2

    Publication date: 2019-07-23

    Application No.: US15626374

    Filing date: 2017-06-19

    Abstract: A checkpointing method in a network device fault tolerant system using virtual machines. In one embodiment, the network device has an input port, an output port, an active virtual machine and a standby virtual machine; a network application on the active virtual machine which manipulates data present on the input port and transmits the manipulated data from the output port; a checkpoint engine on the active virtual machine; and an interface agent, on the active virtual machine, having callable functions to move data from the input port to the output port. The method includes the steps of determining, by the checkpoint engine, that a checkpoint is required; requesting, by the checkpoint engine, that the interface agent quiesce itself; and returning, by the interface agent to the network application, an indicator that no packets are available regardless of whether or not packets are arriving at the input port.
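The quiescing behavior in this abstract can be sketched in a few lines of Python. The class `InterfaceAgent` and its methods are hypothetical names chosen for illustration, not an API from the patent. The point shown is only that, once quiesced, the agent reports "no packets" to the application even while packets keep arriving.

```python
from collections import deque


class InterfaceAgent:
    """Toy interface agent sitting between the input port and the application."""

    def __init__(self):
        self._rx = deque()       # packets that have arrived at the input port
        self._quiesced = False   # set by the checkpoint engine

    def packet_arrived(self, pkt: bytes) -> None:
        # Arrivals are always queued, quiesced or not.
        self._rx.append(pkt)

    def quiesce(self) -> None:
        self._quiesced = True

    def resume(self) -> None:
        self._quiesced = False

    def poll(self):
        """Return the next packet, or None as the 'no packets available' indicator."""
        if self._quiesced or not self._rx:
            return None
        return self._rx.popleft()
```

Because the application sees an ordinary "nothing to do" result, it needs no checkpoint-specific logic of its own; queued packets are delivered once the agent resumes.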

    Method for dirty-page tracking and full memory mirroring redundancy in a fault-tolerant server

    Publication No.: US10216598B2

    Publication date: 2019-02-26

    Application No.: US15646769

    Filing date: 2017-07-11

    Abstract: A method of transferring memory from an active to a standby memory in an FT Server system. The method includes the steps of: reserving a portion of memory using BIOS; loading and initializing an FT Kernel Mode Driver; and loading and initializing an FT Virtual Machine Manager (FTVMM), including the Second Level Address Translation (SLAT) table, into the reserved memory. In another embodiment, the method includes tracking memory accesses using the FTVMM's SLAT in Reserved Memory and tracking “L2” Guest memory accesses by tracking the current Guest's SLAT and intercepting the Hypervisor's writes to the SLAT. In yet another embodiment, the method includes entering Brownout by collecting the D-Bits, invalidating the processor's cached SLAT translation entries, and copying the dirtied pages from the active memory to memory in the second Subsystem. In one embodiment, the method includes entering Blackout and moving the final dirty pages from the active memory to the mirror memory.
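The Brownout and Blackout phases named in this abstract follow the general shape of an iterative dirty-page copy. The Python sketch below is a simplified software model of that shape only: the `Memory` class, the `mirror` function, the `threshold` and `max_passes` parameters, and the `workload` callback are all assumptions for illustration, and a plain set stands in for the hardware D-bits in the SLAT.

```python
class Memory:
    """Toy active memory: each page is one integer, dirtiness is tracked in a set."""

    def __init__(self, n_pages: int):
        self.pages = [0] * n_pages
        self.dirty = set(range(n_pages))  # everything needs copying at first

    def write(self, page: int, value: int) -> None:
        self.pages[page] = value
        self.dirty.add(page)

    def collect_dirty(self) -> set:
        """Return and clear the dirty set (stands in for harvesting the D-bits)."""
        harvested, self.dirty = self.dirty, set()
        return harvested


def mirror(active: Memory, standby: list, workload, threshold: int = 1,
           max_passes: int = 10) -> int:
    """Copy active memory to standby; return the number of Brownout passes."""
    passes = 0
    # Brownout: the guest keeps running, so pages may be dirtied again.
    while len(active.dirty) > threshold and passes < max_passes:
        for page in active.collect_dirty():
            standby[page] = active.pages[page]
        workload(active, passes)  # simulated guest writes during the pass
        passes += 1
    # Blackout: the guest is paused (no workload call), copy the final dirty pages.
    for page in active.collect_dirty():
        standby[page] = active.pages[page]
    return passes
```

The Brownout loop shrinks the dirty set while the system stays responsive, so the paused Blackout phase only has to move the small remainder.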

    Method for Migrating Memory and Checkpoints in a Fault Tolerant System
    Type: Invention application (status: under examination, published)

    Publication No.: US20150205688A1

    Publication date: 2015-07-23

    Application No.: US14571405

    Filing date: 2014-12-16

    CPC classification number: G06F11/1484 G06F11/1438 G06F11/203 G06F11/2097

    Abstract: A method of migrating memory from a primary computer to a secondary computer. In one embodiment, the method includes the steps of: (a) waiting for a checkpoint on the primary computer; (b) pausing the primary computer; (c) selecting a group of pages of memory to be transferred to the secondary computer; (d) transferring the selected group of pages of memory and checkpointed data; (e) restarting the primary computer; (f) waiting for a checkpoint on the primary computer; (g) pausing the primary computer; (h) selecting another group of pages of memory to be transferred; (i) transferring the other selected group of pages of memory and data checkpointed since the previous checkpoint to the secondary computer; (j) restarting the primary computer; and (k) repeating steps (f) through (j) until all the memory of the primary computer is transferred.
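Steps (a) through (k) describe a loop that moves a fresh group of pages at each checkpoint while also forwarding whatever was checkpointed since the previous one. A minimal Python sketch of that loop follows, assuming memory is a dictionary of page number to value, that writes only touch pages that already exist, and that `writes_between_checkpoints` (a hypothetical parameter) lists the writes the primary makes while running before each checkpoint.

```python
def migrate(primary: dict, group_size: int, writes_between_checkpoints: list):
    """Migrate all pages of `primary`; return (secondary, number_of_checkpoints)."""
    secondary = {}
    pending = sorted(primary)  # pages not yet migrated
    cycle = 0
    while pending:
        # Primary runs until the next checkpoint; these writes dirty some pages.
        if cycle < len(writes_between_checkpoints):
            dirty = writes_between_checkpoints[cycle]
        else:
            dirty = {}
        primary.update(dirty)

        # Primary is paused here: select and transfer the next group of pages.
        group, pending = pending[:group_size], pending[group_size:]
        for page in group:
            secondary[page] = primary[page]

        # Transfer checkpointed data: pages already migrated but dirtied since
        # the previous checkpoint.
        for page in dirty:
            if page in secondary:
                secondary[page] = primary[page]

        cycle += 1  # primary restarts and runs until the next checkpoint
    return secondary, cycle
```

Dirty pages that have not been migrated yet need no special handling, since they are picked up with their current contents when their group is eventually selected.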

    Fault tolerant systems and methods incorporating a minimum checkpoint interval

    Publication No.: US11641395B2

    Publication date: 2023-05-02

    Application No.: US16900910

    Filing date: 2020-06-13

    Inventor: Lei Cao

    Abstract: In part, the disclosure relates to a method of regulating checkpointing in an active-active fault tolerant system. The method includes receiving a request from a client through a network at a primary computer; copying, by the primary computer, the request from the client to a secondary computer; processing the request from the client, using the primary computer, to generate a primary computer result; processing the copy of the request from the client, using the secondary computer, to generate a secondary computer result; comparing the primary computer result and the secondary computer result to obtain a comparison metric; determining whether a minimum checkpoint interval has been met or exceeded; and if the minimum checkpoint interval has not been met or exceeded, delaying initiation of a checkpoint process from the primary computer to the secondary computer.
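The gating logic in this abstract can be sketched as a small decision function. The Python below assumes, purely for illustration, that the comparison metric is simple equality of the two results and that a mismatch is what makes a checkpoint desirable; the abstract does not state either. `CheckpointRegulator` and its fields are hypothetical names, and times are integer milliseconds supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class CheckpointRegulator:
    min_interval_ms: int          # minimum time between checkpoints
    last_checkpoint_ms: int = 0   # time of the most recent checkpoint

    def decide(self, primary_result, secondary_result, now_ms: int) -> str:
        """Return 'none', 'delay', or 'checkpoint' for one compared request."""
        if primary_result == secondary_result:
            return "none"  # results agree: no checkpoint wanted (assumed policy)
        if now_ms - self.last_checkpoint_ms < self.min_interval_ms:
            return "delay"  # minimum interval not yet met: hold the checkpoint
        self.last_checkpoint_ms = now_ms
        return "checkpoint"
```

Enforcing a floor on the interval bounds how much time the pair can spend checkpointing when results diverge in quick succession.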

    Computer duplication and configuration management systems and methods

    Publication No.: US11620196B2

    Publication date: 2023-04-04

    Application No.: US16900914

    Filing date: 2020-06-13

    Abstract: In part, the disclosure relates to systems and methods to rapidly copy the computer operating system, drivers and applications from a source computer to a target computer using a duplication engine. Once the copy is complete, the source computer will resume execution, and the target computer will first alter its configuration (also referred to as a role or personality) and then resume execution conforming to its new configuration as indicated by a profile stored in protected or specialized memory. The profile can be a value, a file, or other memory structure and is protected in the sense that the profile (and/or the region of memory where it is stored) must not be overwritten by a state transfer from the source computer to the target computer.

    Checkpointing systems and methods using data forwarding
    Type: Granted invention patent (status: in force)

    Publication No.: US09588844B2

    Publication date: 2017-03-07

    Application No.: US14571408

    Filing date: 2014-12-16

    Abstract: In one aspect, the invention relates to a fault tolerant computing system. The system includes a primary virtual machine and a secondary virtual machine, wherein the primary and secondary virtual machines are in communication, wherein the primary virtual machine comprises a first checkpointing engine and a first network interface, wherein the secondary virtual machine comprises a second network interface, and wherein the first checkpointing engine forwards a page of memory of the primary virtual machine to the secondary virtual machine such that the first checkpointing engine can checkpoint the page of memory without pausing the primary virtual machine.

    Systems and methods for checkpointing in a fault tolerant system

    Publication No.: US11281538B2

    Publication date: 2022-03-22

    Application No.: US16900912

    Filing date: 2020-06-13

    Abstract: A method and system of checkpointing in a computing system having a primary node and a secondary node is disclosed. In one embodiment, the method includes the steps of determining, by the primary node, to initiate a checkpoint process; sending a notification to the secondary node, by the primary node, of an impending checkpoint process; blocking, by the primary node, I/O requests from the Operating System (OS) that arrive at the primary node after the determination to initiate the checkpoint process; completing, by the primary node, active I/O requests for data received from the OS prior to the determination to initiate the checkpoint process, by accessing the primary node data storage; and upon receiving, by the primary node, a notice of checkpoint readiness from the secondary node, initiating a checkpoint process to move state and data from the primary node to the secondary node.
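The ordering of steps in this abstract (notify, block new I/O, drain in-flight I/O, wait for readiness, transfer) can be sketched as a small state machine. In the Python below, `PrimaryNode`, its queues, and the two callbacks are hypothetical names for illustration. Releasing the blocked requests once the transfer finishes is an assumption, since the abstract does not say what happens to them.

```python
from collections import deque


class PrimaryNode:
    def __init__(self):
        self.active = deque()    # I/O accepted before the checkpoint decision
        self.blocked = deque()   # I/O that arrived after the decision
        self.storage = []        # completed I/O, in completion order
        self.checkpoint_pending = False

    def submit_io(self, request) -> None:
        """Accept an I/O request from the OS, blocking it if a checkpoint is pending."""
        queue = self.blocked if self.checkpoint_pending else self.active
        queue.append(request)

    def begin_checkpoint(self, notify_secondary) -> None:
        """Decide to checkpoint: notify the secondary, then drain in-flight I/O."""
        self.checkpoint_pending = True
        notify_secondary()
        while self.active:
            self.storage.append(self.active.popleft())  # complete against storage

    def on_secondary_ready(self, transfer) -> None:
        """Secondary reported readiness: move state, then release blocked I/O."""
        transfer(list(self.storage))
        self.checkpoint_pending = False
        self.active.extend(self.blocked)  # assumed: blocked I/O resumes afterwards
        self.blocked.clear()
```

Draining the in-flight requests before the transfer means the state sent to the secondary reflects a storage image with no partially completed I/O.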

    Method for Dirty-Page Tracking and Full Memory Mirroring Redundancy in a Fault-Tolerant Server

    Publication No.: US20190018746A1

    Publication date: 2019-01-17

    Application No.: US15646769

    Filing date: 2017-07-11

    Abstract: A method of transferring memory from an active to a standby memory in an FT Server system. The method includes the steps of: reserving a portion of memory using BIOS; loading and initializing an FT Kernel Mode Driver; and loading and initializing an FT Virtual Machine Manager (FTVMM), including the Second Level Address Translation (SLAT) table, into the reserved memory. In another embodiment, the method includes tracking memory accesses using the FTVMM's SLAT in Reserved Memory and tracking “L2” Guest memory accesses by tracking the current Guest's SLAT and intercepting the Hypervisor's writes to the SLAT. In yet another embodiment, the method includes entering Brownout by collecting the D-Bits, invalidating the processor's cached SLAT translation entries, and copying the dirtied pages from the active memory to memory in the second Subsystem. In one embodiment, the method includes entering Blackout and moving the final dirty pages from the active memory to the mirror memory.
