Abstract:
A system and method thereof for performing loss-less migration of an application group. In an exemplary embodiment, the system may include a high-availability services module structured for execution in conjunction with an operating system, and one or more computer nodes of a distributed system upon which at least one independent application can be executed. The high-availability services module may be structured to be executable on the one or more computer nodes for loss-less migration of the one or more independent applications, and is operable to perform checkpointing of all state in a transport connection.
Abstract:
A system, method, and computer readable medium for consistent and transparent replication of multi-process, multi-threaded applications. The computer readable medium includes computer-executable instructions for execution by a processing system. Primary applications run on primary hosts and one or more replicated instances of each primary application run on one or more backup hosts. Replica consistency between a primary application and its replicas is provided by imposing the execution ordering of the primary on all its replicas. The execution ordering on a primary is captured by intercepting calls to the operating system and libraries, sending replication messages to its replicas, and using interception on the replicas to enforce said captured primary execution order. Replication consistency is provided without requiring modifications to the application, operating system or libraries.
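The replica-side enforcement described in this abstract can be illustrated with a minimal sketch. It assumes the primary has already sent a recorded order of thread identifiers as a plain list; the `OrderEnforcer` class, its method names, and the thread labels are hypothetical and are not taken from the patent.

```python
import threading


class OrderEnforcer:
    """Hypothetical replica-side gate: threads run their intercepted
    action only when the order recorded on the primary says it is their turn."""

    def __init__(self, recorded_order):
        self.order = list(recorded_order)  # order captured on the primary (assumed input)
        self.pos = 0                       # index of the next entry to replay
        self.cond = threading.Condition()

    def run_in_order(self, thread_id, action):
        with self.cond:
            # Block until the recorded order names this thread next.
            while self.pos < len(self.order) and self.order[self.pos] != thread_id:
                self.cond.wait()
            action()
            self.pos += 1
            self.cond.notify_all()


recorded = ["T2", "T0", "T1"]   # assumed replication message contents
enforcer = OrderEnforcer(recorded)
result = []

threads = [
    threading.Thread(target=enforcer.run_in_order,
                     args=(tid, lambda tid=tid: result.append(tid)))
    for tid in ["T0", "T1", "T2"]   # started in a different order on purpose
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever order the replica's scheduler picks, `result` ends up matching `recorded`, which is the property the abstract calls imposing the primary's execution ordering on the replica.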
Abstract:
A system, method, and computer readable medium for reliable messaging between two or more servers. The computer readable medium includes computer-executable instructions for execution by a processing system. Primary applications run on primary hosts and one or more replicated instances of each primary application run on one or more backup hosts. The reliable messaging ensures consistent ordered delivery of messages in the event that messages are lost, arrive out of order, or arrive in duplicate. The messaging layer operates over TCP or UDP with or without multicast and broadcast and requires no modification to applications, operating system or libraries.
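One common way to obtain ordered delivery despite loss, reordering, and duplication is to tag each message with a sequence number and buffer on the receiving side. The sketch below assumes that approach; the abstract does not state the actual mechanism, and `OrderedReceiver` and its methods are hypothetical names.

```python
class OrderedReceiver:
    """Hypothetical receiver: delivers messages in sequence-number order,
    drops duplicates, and reports gaps that would need retransmission."""

    def __init__(self):
        self.next_seq = 0    # next sequence number owed to the application
        self.pending = {}    # out-of-order messages held until the gap closes
        self.delivered = []  # payloads handed to the application, in order

    def receive(self, seq, payload):
        """Accept one message and return the sequence numbers still missing."""
        if seq < self.next_seq or seq in self.pending:
            return self.missing()  # duplicate: ignore it
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with what was delivered.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return self.missing()

    def missing(self):
        """Sequence numbers below the highest buffered one that have not arrived."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending))
                if s not in self.pending]


rx = OrderedReceiver()
for seq, payload in [(0, "a"), (2, "c"), (2, "c"), (1, "b"), (0, "a")]:
    rx.receive(seq, payload)
```

In a real messaging layer the list returned by `missing()` would drive retransmission requests to the sender; here it is only returned to the caller.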
Abstract:
A system and method for storage checkpointing of a group of independent computer applications. The system has a storage disk that stores files; a storage access interface to access the storage disk; and a computer. The computer runs the group of independent computer applications and utilizes the files stored on the storage disk. A file system on the server accesses the files stored on the storage disk. An operating system and at least one device driver can be called by the file system, and at least one buffer buffers first data written to the storage disk and second data read from the storage disk.
Abstract:
A computer readable medium and method for providing checkpointing to Windows application groups. The checkpointing may be triggered asynchronously using Asynchronous Procedure Calls. The computer readable medium includes computer-executable instructions for execution by a processing system. The computer-executable instructions may be for reviewing one or more command line arguments to determine whether to start at least one of the application groups, and when determining to start the at least one of the application groups, creating a process table in a shared memory to store information about each process of the at least one of the application groups. Further, the instructions may be for registering with a kernel module to create an application group barrier, creating a named pipe for applications of the application group to register and unregister, triggering a checkpoint thread to initiate an application group checkpoint, and launching an initial application of the applications of the application group.
Abstract:
A method and system for storage checkpointing of an independent computer application. The independent computer application is launched by a coordinator, and the coordinator installs at least one of an exec interceptor and a fork interceptor. The coordinator also installs at least one file operations interceptor for all file operations and registers the independent computer application with the coordinator. The independent computer application is run and the at least one file operations interceptor is called upon encountering a file operation. The file operations interceptor logs a file event in a file operations database and passes the operation to at least one of a file system, an operating system, one or more device drivers, and a storage disk via a storage interface. The file operations interceptor also verifies that the file operation has been issued.
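The log, pass-through, and verify sequence in this abstract can be sketched in user space as follows. This is only an illustration under assumptions: an in-memory list stands in for the file operations database, only an append-style write is covered, and `FileOpsInterceptor` is a hypothetical name. An actual interceptor would hook the operating system's file calls rather than wrap them in a class.

```python
import os


class FileOpsInterceptor:
    """Hypothetical interceptor: logs each write, forwards it to the
    file system, then marks the logged event as issued."""

    def __init__(self):
        self.log = []  # stand-in for the file operations database

    def write(self, path, data):
        # 1. Log the file event before touching storage.
        event = {"op": "write", "path": path, "size": len(data), "issued": False}
        self.log.append(event)
        # 2. Pass the operation through to the file system and force it to disk.
        with open(path, "ab") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # 3. Record that the operation has been issued.
        event["issued"] = True
        return event
```

A checkpointer could then consult `log` to find events whose `issued` flag is still false and treat them as operations that had not reached storage when the checkpoint was taken.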
Abstract:
A system includes a multi-process application that runs on primary hosts and is checkpointed by a checkpointer comprised of a kernel-mode checkpointer module and one or more user-space interceptors providing at least one of barrier synchronization, checkpointing thread, resource flushing, and an application virtualization space. Checkpoints may be written to storage and the application restored from said stored checkpoint at a later time. Checkpointing may be incremental using Page Table Entry (PTE) pages and Virtual Memory Areas (VMA) information. Checkpointing is transparent to the application and requires no modification to the application, operating system, networking stack or libraries. In an alternate embodiment the kernel-mode checkpointer is built into the kernel.
Abstract:
A system, method, and computer readable medium for statistical application-agnostic fault detection of multi-process applications. The computer readable medium includes computer-executable instructions for execution by a processing system. A multi-process application runs on a host. Interceptors collect statistical events and send said events to a statistical fault detector. The statistical fault detector creates one or more distributions, compares recent statistical event data to historical statistical event data, and uses deviation from the historical norm for fault detection. The present invention detects faults both within the application and within the environment wherein the application executes, if conditions within the environment cause impaired application performance. The invention also teaches consensus fault detection and elimination of cascading fault notifications based on a hierarchy of events and event groups. Interception and fault detection are transparent to the application, operating system, networking stack and libraries.
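The comparison of recent event data against a historical norm can be sketched with a simple z-score test. The abstract does not specify the statistic, so the choice of mean and standard deviation, the threshold of 3.0, the function name `is_faulty`, and the sample values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def is_faulty(history, recent, threshold=3.0):
    """Hypothetical detector: flag a fault when the mean of recent events
    deviates from the historical mean by more than `threshold` standard deviations."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least two points
    if sigma == 0:
        # A perfectly flat history: any change at all counts as a deviation.
        return mean(recent) != mu
    z = abs(mean(recent) - mu) / sigma
    return z > threshold


# Assumed example data: per-call latencies in milliseconds.
history = [10, 11, 9, 10, 12, 10, 11, 9]
normal_window = [10, 11, 10]
slow_window = [25, 27, 26]
```

With these values, `is_faulty(history, normal_window)` evaluates to `False` and `is_faulty(history, slow_window)` to `True`, because the slow window sits many standard deviations above the historical mean.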
Abstract:
Trend estimation for application-agnostic statistical fault detection of multi-process applications in environments with data trends. A multi-process application runs on a host. Statistical events are collected and sent to a statistical fault detector. The statistical fault detector creates one or more distributions, compares recent statistical event data to historical statistical event data, and uses deviation from the historical norm for fault detection. Trend is estimated and, if needed, removed from event data prior to the creation of distributions. Trend is estimated using spectral techniques, filter banks and Maximum Entropy Spectral Estimation, and dominant frequencies are estimated and utilized to adapt to the environment.
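Removing a trend before building distributions, and then estimating a dominant frequency, can be sketched as below. This is a deliberately simplified stand-in: it uses a least-squares straight-line fit and a plain discrete Fourier transform, not the filter banks or Maximum Entropy Spectral Estimation named in the abstract. The function names and the synthetic signal are assumptions.

```python
import cmath
import math


def remove_linear_trend(samples):
    """Fit a least-squares line to the samples and subtract it.
    Returns the residuals and the estimated slope."""
    n = len(samples)
    t_mean = (n - 1) / 2
    y_mean = sum(samples) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(samples)) / denom
    intercept = y_mean - slope * t_mean
    residuals = [y - (intercept + slope * t) for t, y in enumerate(samples)]
    return residuals, slope


def dominant_frequency(samples):
    """Return the DFT bin (cycles per window, excluding DC) with the largest magnitude."""
    n = len(samples)
    best_k, best_mag = 0, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(y * cmath.exp(-2j * math.pi * k * t / n)
                    for t, y in enumerate(samples))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k


# Assumed synthetic event data: a rising trend plus a periodic component
# that completes 4 cycles over the 64-sample window.
samples = [0.5 * t + math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
residuals, slope = remove_linear_trend(samples)
peak = dominant_frequency(residuals)
```

The residuals, rather than the raw samples, would then feed the distributions used for fault detection, so that a steadily growing load is not itself reported as a deviation.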