Abstract:
A recovery system and method for performing site recovery utilizes recovery-specific metadata and files of protected clients at a primary site to recreate the protected clients at a secondary site. The recovery-specific metadata is collected from at least one component at the primary site, and stored with the files of protected clients at the primary site. The recovery-specific metadata and the files of the protected clients are replicated to the secondary site so that the protected clients can be recreated at the secondary site using the replicated information.
Abstract:
A system for maintaining a two-site configuration for continuous availability over long distances may include: a first computing site configured to execute a first instance of a priority workload, the first instance being designated as an active instance; a second computing site configured to execute a second instance of the priority workload, the second instance being designated as a standby instance; a software replication module configured to replicate a unit of work data associated with the priority workload from a first data object associated with the active instance to a second data object associated with the standby instance; and a hardware replication module configured to replicate an image from a first storage volume to a copy on a second storage volume, wherein the first storage volume is associated with the first computing site, and the second storage volume is associated with a third computing site.
Abstract:
Systems and methods for performing data recovery are disclosed. A controller of a memory system may detect an error at a first page of memory and identify a data keep cache associated with the first page, the data keep cache associated with a primary XOR sum. The controller may further sense data stored at a second page and move the data to a first latch of the memory; sense data stored at a third page such that the data is present in a second latch of the memory; and calculate a restoration XOR sum based on the data of the second page and the data of the third page. The controller may further calculate the data of the first page based on the primary XOR sum and the restoration XOR sum, and restore the data of the first page.
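The XOR relationship described above can be illustrated with a minimal sketch. It assumes that the primary XOR sum covers exactly the three pages named in the abstract and that all pages have equal length; the function names are hypothetical, and the latches and data keep cache of the memory system are not modeled.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("pages must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def xor_sum(pages):
    """XOR a non-empty sequence of pages together."""
    return reduce(xor_bytes, pages)


def recover_page(primary_xor_sum: bytes, surviving_pages) -> bytes:
    """Rebuild the failed page from the primary XOR sum and the surviving pages."""
    # Restoration XOR sum: XOR of the pages that can still be read.
    restoration_xor_sum = xor_sum(surviving_pages)
    # XOR-ing it with the primary sum cancels the surviving pages,
    # leaving only the contents of the failed page.
    return xor_bytes(primary_xor_sum, restoration_xor_sum)


# Illustrative data only: three small pages protected by one XOR sum.
pages = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
primary = xor_sum(pages)
restored = recover_page(primary, pages[1:])
```

The recovery works because XOR is its own inverse: every surviving page appears once in the primary sum and once in the restoration sum, so it cancels.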
Abstract:
A resiliency system detects and corrects memory errors reported by a memory system of a computing system using previously stored error correction information. When a program stores data into a memory location, the resiliency system executing on the computing system generates and stores error correction information. When the program then executes a load instruction to retrieve the data from the memory location, the load instruction completes normally if there is no memory error. If, however, there is a memory error, the computing system passes control to the resiliency system (e.g., via a trap) to handle the memory error. The resiliency system retrieves the error correction information for the memory location and re-creates the data of the memory location. The resiliency system stores the data as if the load instruction had completed normally and passes control to the next instruction of the program.
Abstract:
A technique is provided for accumulating failures. A failure of a first row is detected in a group of array macros, the first row having first row address values. A mask has mask bits corresponding to each of the first row address values. The mask bits are initially in active status. A failure of a second row, having second row address values, is detected. When none of the first row address values matches the second row address values, and when mask bits are all in the active status, the array macros are determined to be bad. When at least one of the first row address values matches the second row address values, mask bits that correspond to at least one of the first row address values that match are kept in active status, and mask bits that correspond to non-matching first address values are set to inactive status.
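One possible reading of the mask update is sketched below. It assumes that the row address values are compared position by position, that a mask bit already inactive stays inactive, and that the "bad" verdict applies only when no position matches while every mask bit is still active. The function name and the list representation are hypothetical; the abstract does not say what happens when nothing matches and some mask bits are already inactive, so the sketch simply clears the mask in that case.

```python
def accumulate_failure(first_row, second_row, mask):
    """Compare a new failing row address against the first failing row.

    Returns (new_mask, is_bad). Each mask entry is True for active status.
    """
    if not (len(first_row) == len(second_row) == len(mask)):
        raise ValueError("address and mask lengths must agree")

    matches = [a == b for a, b in zip(first_row, second_row)]

    # No address value matches and the mask is untouched:
    # the array macros are determined to be bad.
    if not any(matches) and all(mask):
        return list(mask), True

    # Matching positions keep their status; non-matching ones go inactive.
    new_mask = [active and same for active, same in zip(mask, matches)]
    return new_mask, False


# Illustrative addresses only.
mask, bad = accumulate_failure([1, 0, 1, 1], [1, 0, 0, 1], [True] * 4)
```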
Abstract:
A method and system for asynchronously dispersing Disaster Recovery (DR) enabling data among a plurality of storage sites. The method comprises receiving, at a primary storage site, a written block and a write frequency counter associated with the written block. If the write frequency counter is below a threshold, the method further comprises: receiving information dispersal parameters, including a number indicative of the size difference between the written block and the DR enabling data based on the written block, the number of slices into which the DR enabling data is to be sliced, and data indicative of the DR storage sites, among the plurality of storage sites, that are to store the slices; calculating the DR enabling data based on the written block, wherein the DR enabling data is larger than the written block by the size difference; slicing the DR enabling data in accordance with the number of slices; and dispersing the slices in accordance with the data indicative of the DR storage sites.
Abstract:
Capturing post-snapshot quiescence writes in an image backup. In one example embodiment, a method for capturing post-snapshot quiescence writes in an image backup may include taking a first snapshot of a source storage at a first point in time using a Volume Shadow Copy Service (VSS), identifying a first set of block positions of blocks that are allocated in the source storage at the first point in time, identifying a second set of block positions of blocks that are written to the first snapshot during post-snapshot quiescence of the first snapshot by the VSS or by one or more VSS writers, resulting in a first quiesced snapshot, calculating a third set of block positions by performing a Boolean OR operation on the first and second sets of block positions, and copying blocks in the third set of block positions from the first snapshot to a full image backup.
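The Boolean OR step can be sketched with block bitmaps. This is an illustration under stated assumptions: block positions are small non-negative integers, each set is held as a Python integer bitmap, and the snapshot is modeled as a plain list of blocks. All function names are hypothetical, and no VSS API is called.

```python
def positions_to_bitmap(positions):
    """Set bit i for every block position i in the input."""
    bitmap = 0
    for pos in positions:
        bitmap |= 1 << pos
    return bitmap


def bitmap_to_positions(bitmap):
    """Return the sorted positions of the set bits."""
    positions = []
    pos = 0
    while bitmap:
        if bitmap & 1:
            positions.append(pos)
        bitmap >>= 1
        pos += 1
    return positions


def blocks_to_back_up(allocated_positions, quiescence_write_positions):
    """Third set = first set OR second set."""
    first = positions_to_bitmap(allocated_positions)
    second = positions_to_bitmap(quiescence_write_positions)
    return bitmap_to_positions(first | second)


def copy_blocks(snapshot, positions):
    """Copy the selected blocks from the snapshot into a backup mapping."""
    return {pos: snapshot[pos] for pos in positions}


# Illustrative data: blocks 0, 2 and 5 were allocated at snapshot time,
# and blocks 2 and 3 were written during post-snapshot quiescence.
snapshot = [b"A", b"", b"C", b"D", b"", b"F"]
third_set = blocks_to_back_up([0, 2, 5], [2, 3])
backup = copy_blocks(snapshot, third_set)
```

Block 3 would be missed by a backup that relied on the allocation map alone, which is why the quiescence writes are OR-ed in.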
Abstract:
Disclosed in some examples is a method, the method including detecting that an RDMS is recovering from a failure; sending a request for a last committed transaction on a replication component to the replication component; receiving, from the replication component, the last committed transaction which identifies a transaction that was the last committed transaction at a replication component at a time of RDMS failure; determining that a transaction log on the RDMS includes a transaction that had not yet been replicated at the time of RDMS failure which was committed on the transaction log subsequent to the last committed transaction received from the replication component; and based on that determination rolling back the transaction that had not yet been replicated at the time of RDMS failure.
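The determination and rollback steps can be sketched as follows. The sketch assumes that the transaction log is an ordered list of unique transaction identifiers in commit order, that the identifier reported by the replication component appears in that log, and that rollback proceeds newest first. The function names are hypothetical, and the actual undo work is reduced to returning the identifiers to roll back.

```python
def find_unreplicated(transaction_log, last_replicated_id):
    """Return transactions committed locally after the last replicated one."""
    # list.index raises ValueError if the identifier is not in the log.
    position = transaction_log.index(last_replicated_id)
    return transaction_log[position + 1:]


def roll_back_unreplicated(transaction_log, last_replicated_id):
    """Return (log after rollback, transactions rolled back newest first)."""
    to_undo = find_unreplicated(transaction_log, last_replicated_id)
    kept = transaction_log[:len(transaction_log) - len(to_undo)]
    return kept, list(reversed(to_undo))


# Illustrative log: t3 and t4 were committed locally but never replicated.
log = ["t1", "t2", "t3", "t4"]
kept, rolled_back = roll_back_unreplicated(log, "t2")
```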
Abstract:
A method of recovering a registry includes accessing a plurality of registry zone files for the registry and archiving, on a first periodic basis, the plurality of registry zone files. Each of the registry zone files includes at least domain names, registrar IDs, and status information represented in a first predetermined format. The method also includes accessing bulk WHOIS data for the registry and archiving, on a second periodic basis, the bulk WHOIS data. The bulk WHOIS data includes at least nameserver server names, IP addresses, and status information represented in a second predetermined format. The method further includes validating one of the plurality of archived registry zone files based on a comparison between the plurality of registry zone files and the bulk WHOIS data, publishing the validated registry zone file to a second registry's nameservers, initiating a root zone change request, and updating authoritative nameservers.
Abstract:
A non-transitory computer-readable storage medium stores an investigation program that causes an information processing apparatus to execute processing. The processing includes: creating, in a storage medium, a first dump file for writing out data in a memory of the information processing apparatus when an operating system detects a first abnormality; rebooting the information processing apparatus, after the detection of the first abnormality and after the creation of the first dump file, without erasing the data stored in the memory; creating, during the reboot, a first table that associates a plurality of page areas in the memory with a plurality of dump file areas in the first dump file that correspond to the page areas; and writing out, when a page area in the memory is released, the data stored in the page area to the first dump file.