Abstract:
Various embodiments are generally directed to techniques for handling errors affecting the at least partially parallel performance of data access commands between nodes of a storage cluster system. An apparatus may include a processor component of a first node; an access component to perform a command, received from a client device via a network, to alter client device data stored in a first storage device coupled to the first node; a replication component to transmit a replica of the command to a second node via the network to enable performance of the replica by the second node at least partially in parallel; an error component to retry transmission of the replica based on a failure indicated by the second node; and a status component to select a status indication to transmit to the client device based on the indication of failure and the results of retrying transmission of the replica.
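A minimal sketch of the retry-and-status flow this abstract describes, assuming a simple synchronous interface between the two nodes; the class, method, and parameter names (FirstNode, perform_replica, retry_limit) are illustrative assumptions, not identifiers from the embodiments.

```python
import time


class ReplicaError(Exception):
    """Raised when the second node indicates a failure for a replica command."""


class FirstNode:
    def __init__(self, local_store, second_node, retry_limit=3, backoff_s=0.1):
        self.local_store = local_store    # first storage device with client device data
        self.second_node = second_node    # partner node reachable via the network
        self.retry_limit = retry_limit
        self.backoff_s = backoff_s

    def handle_client_command(self, command):
        # Perform the command against the locally stored client device data.
        self.local_store.apply(command)

        # Transmit a replica of the command to the second node, retrying the
        # transmission if the second node indicates a failure.
        failure = None
        for attempt in range(self.retry_limit):
            try:
                self.second_node.perform_replica(command)
                return "success"                     # both nodes performed the command
            except ReplicaError as err:
                failure = err
                time.sleep(self.backoff_s * (attempt + 1))

        # Select a status indication for the client based on the failure
        # indication and the outcome of the retries.
        return f"completed locally; replication failed: {failure}"
```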
Abstract:
In the field of forensic analysis of databases, a method of performing database rollback to a previous state of a database using a write-ahead log (WAL) includes: selecting, in the set of frames recorded to the WAL, a specific frame representing a specific revised content of a corresponding specific page; identifying, in the set of frames, a first subset of frames containing the specific frame and zero or more frames chronologically preceding it; extracting, from the set of pages of the database, a first subset of pages; extracting, from the corresponding pages of the first subset of frames, a second subset of pages; and performing, based on the contents of the first subset of pages and the revised contents of the second subset of pages, a rollback of the database to a previous state containing the revised content of the specific page.
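The frame-selection and page-merging steps can be illustrated with a small sketch, assuming a simplified WAL model in which each frame is a (page number, revised page content) pair recorded in chronological order; the function and variable names are assumptions.

```python
def rollback_to_frame(db_pages, wal_frames, target_index):
    """Reconstruct the database state as of the frame at target_index.

    db_pages    : {page_no: bytes} -- pages of the main database
    wal_frames  : [(page_no, bytes), ...] in chronological order
    target_index: index of the specific frame chosen for the rollback
    """
    # First subset of frames: the specific frame and all frames that
    # chronologically precede it.
    first_subset = wal_frames[: target_index + 1]

    # Second subset of pages: for each page touched in the first subset,
    # keep the most recent revised content at or before the target frame.
    revised = {}
    for page_no, content in first_subset:
        revised[page_no] = content   # later frames supersede earlier ones

    # First subset of pages: database pages not superseded by any frame
    # in the first subset.
    untouched = {p: c for p, c in db_pages.items() if p not in revised}

    # The previous state combines the untouched pages with the revised
    # contents, including the revised content of the specific page.
    return {**untouched, **revised}
```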
Abstract:
Subsequent to a storage operation performed on the source instance by a source component, a synchronization message is sent to a replicated component for the replicated instance. The synchronization message is stored locally in a persistent storage location associated with the source component, along with an indicator representative of the time the storage operation was performed. Upon receipt of the synchronization message by the replicated component, the replicated component is updated to a dirty state to indicate a lack of full synchronization between the source and replicated instances.
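A hedged sketch of this synchronization handshake, assuming a line-oriented JSON journal as the persistent storage location and direct method calls between the components; all names here are illustrative.

```python
import json
import time


class SourceComponent:
    def __init__(self, journal_path, replica):
        self.journal_path = journal_path   # persistent storage for sync messages
        self.replica = replica             # replicated component for the replicated instance

    def perform_storage_operation(self, operation):
        # ... apply the operation to the source instance here ...
        message = {"operation": operation, "performed_at": time.time()}

        # Store the synchronization message locally, together with an
        # indicator of the time the storage operation was performed.
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps(message) + "\n")

        # Send the synchronization message to the replicated component.
        self.replica.receive_sync(message)


class ReplicatedComponent:
    def __init__(self):
        self.dirty = False   # True => not fully synchronized with the source

    def receive_sync(self, message):
        # On receipt of the synchronization message, mark the replicated
        # instance dirty until the operation has actually been replayed.
        self.dirty = True
```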
Abstract:
A distributed data warehouse system may maintain data blocks on behalf of clients, and may store primary and secondary copies of each data block on different disks or nodes in a cluster. The warehouse system may back up data blocks in a remote key-value backup storage system. In response to a failure, or to a query targeting data that was lost or corrupted, a restore operation may retrieve data blocks from backup storage using their unique identifiers as keys while incoming queries continue to be serviced. The order in which data blocks are restored may depend on the relative likelihood that they will be accessed in the near future (e.g., based on how recently or frequently they were accessed, written, or backed up; the values of one or more access counters associated with each data block; or how recently a database table containing data in each data block was loaded).
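The prioritized restore order can be sketched as follows, assuming each lost block carries simple recency and frequency counters; the scoring weights and field names are assumptions chosen only for illustration.

```python
def restore_priority(meta):
    """Higher score => block is more likely to be accessed in the near future."""
    return (
        meta.get("access_count", 0)
        + 2 * meta.get("recently_queried", 0)
        + meta.get("recently_written", 0)
    )


def restore_blocks(block_metadata, backup_store, local_store):
    """Restore lost blocks in order of their likelihood of near-term access.

    block_metadata: {block_id: {...counters...}} for blocks that were lost
    backup_store  : key-value backup with get(key) -> block bytes
    local_store   : cluster storage with put(block_id, data)
    """
    ordered = sorted(
        block_metadata,
        key=lambda b: restore_priority(block_metadata[b]),
        reverse=True,
    )
    for block_id in ordered:
        # The block's unique identifier serves directly as the backup key.
        data = backup_store.get(block_id)
        local_store.put(block_id, data)
```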
Abstract:
Exemplary embodiments for optimizing disaster recovery systems during takeover operations are provided. In one embodiment, by way of example only, a flag is set in a replication grid manager to identify the replication grid members to consult in a reconciliation process for resolving intersecting and non-intersecting data among the disaster recovery systems for a takeover operation. This includes indicating those replication grid members that acquired ownership over cartridges belonging to source systems.
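As a rough illustration, the flagging step might look like the following sketch, which assumes each grid member records the origin system of the cartridges it owns; the data structures and names are assumptions.

```python
class ReplicationGridManager:
    def __init__(self, members):
        self.members = members                   # member_id -> member metadata
        self.consult_for_reconciliation = set()  # flagged member ids

    def flag_members_for_takeover(self, failed_source_id):
        """Flag the grid members to consult when reconciling data for a takeover."""
        for member_id, meta in self.members.items():
            # Indicate members that acquired ownership over cartridges that
            # belonged to the failed source system.
            owned = meta.get("owned_cartridges", {})   # cartridge_id -> origin system
            if any(origin == failed_source_id for origin in owned.values()):
                self.consult_for_reconciliation.add(member_id)
        return self.consult_for_reconciliation
```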
Abstract:
A cluster system includes a plurality of computing nodes connected to a network, each node including one or more storage devices. The cluster system stores data and at least one of data replicas or erasure-coded segments across the plurality of nodes based on a redundancy policy. Further, configuration information, which may be indicative of the placement of the data and the data replicas or erasure-coded segments on the plurality of nodes, is provided to each of the plurality of nodes. Additionally, each of the nodes may act as a first node that is configured to determine, upon a change of the redundancy policy, updated configuration information based on the change and to send a message including information indicating the change of the redundancy policy to the other nodes of the plurality of nodes.
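A minimal sketch of propagating such a redundancy-policy change, assuming direct message passing between node objects and leaving the actual placement computation as a placeholder; all names are illustrative assumptions.

```python
class ClusterNode:
    def __init__(self, node_id, redundancy_policy):
        self.node_id = node_id
        self.peers = []                           # other nodes of the plurality of nodes
        self.redundancy_policy = redundancy_policy
        self.configuration = self._compute_configuration(redundancy_policy)

    def _compute_configuration(self, policy):
        # Configuration information: placement of data, replicas, or
        # erasure-coded segments under the given policy (placeholder mapping).
        return {"policy": policy, "placement": {}}

    def change_redundancy_policy(self, new_policy):
        # Acting as the "first node": determine updated configuration
        # information and notify the other nodes of the policy change.
        self.redundancy_policy = new_policy
        self.configuration = self._compute_configuration(new_policy)
        for peer in self.peers:
            peer.on_policy_change(new_policy)

    def on_policy_change(self, new_policy):
        # Each receiving node updates its own configuration information.
        self.redundancy_policy = new_policy
        self.configuration = self._compute_configuration(new_policy)
```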
Abstract:
According to embodiments of the present invention, a metadata file is transferred from the first system to the second system, and a database on the second system is initialized based on the metadata file. An image, including information of the first system to be restored, is transferred from the first system to the second system, and restoration of the information to the second system based on the image is initiated. Prior to completion of the restoration, one or more log files indicating actions performed on the first system relating to the information to be restored are transferred from the first system to the initialized database on the second system. In response to completion of the restoration, the actions indicated in the log files are performed to synchronize the restored data on the second system with the first system.
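The restore-and-catch-up sequence might be sketched as below, assuming the two systems expose simple export, restore, and replay hooks; every method name here is an assumption made for illustration.

```python
def restore_with_log_catchup(first_system, second_system):
    # 1. Transfer the metadata file and initialize a database on the second system.
    metadata = first_system.export_metadata()
    database = second_system.init_database(metadata)

    # 2. Transfer the image of the information to be restored and initiate restoration.
    image = first_system.export_image()
    restore = second_system.start_restore(image)

    # 3. Prior to completion of the restoration, transfer log files recording
    #    actions performed on the first system into the initialized database.
    while not restore.done():
        for log_file in first_system.new_log_files():
            database.ingest(log_file)

    # 4. In response to completion of the restoration, perform the logged
    #    actions to synchronize the restored data with the first system.
    for action in database.pending_actions():
        second_system.apply(action)
```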
Abstract:
A distributed data store may provide volume recovery access to multiple recovery agents. A data volume may be maintained for a storage client at the distributed data store. Write access to the data volume may be granted according to a single-writer consistency scheme. When a recovery event is detected for the data volume, the data volume may be made available to multiple recovery agents, each of which may perform its own recovery operation. Upon the first completion of a recovery operation for the data volume, access to the data volume may again be granted according to the single-writer consistency scheme. In some embodiments, the distributed data store may be a log-structured data store.
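A small sketch of racing multiple recovery agents and resuming single-writer access upon the first completion, using Python threads as a stand-in for distributed agents; the interfaces and names are assumptions.

```python
import threading


class DataVolume:
    def __init__(self):
        self._writer_lock = threading.Lock()     # enforces single-writer access
        self._recovered = threading.Event()

    def recover(self, recovery_agents):
        """Make the volume available to several recovery agents after a recovery event."""
        def run(agent):
            agent.recover(self)                  # each agent's own recovery operation
            self._recovered.set()                # first agent to finish signals completion

        for agent in recovery_agents:
            threading.Thread(target=run, args=(agent,), daemon=True).start()

        # Upon the first completed recovery operation, resume granting write
        # access under the single-writer consistency scheme.
        self._recovered.wait()

    def write(self, data):
        with self._writer_lock:
            pass                                 # apply the write to the volume here
```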
Abstract:
A technique is provided for accumulating failures. A failure of a first row is detected in a group of array macros, the first row having first row address values. A mask has mask bits corresponding to each of the first row address values, and the mask bits are initially in active status. A failure of a second row, having second row address values, is detected. When none of the first row address values matches the second row address values, and when the mask bits are all in active status, the array macros are determined to be bad. When at least one of the first row address values matches the second row address values, the mask bits that correspond to the matching first row address values are kept in active status, and the mask bits that correspond to the non-matching first row address values are set to inactive status.
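The mask-update rules can be expressed compactly, modeling a row address as a tuple of address values and the mask as one boolean per value; this is an illustrative reconstruction of the comparison logic, not the embodiments' exact repair flow.

```python
class FailureAccumulator:
    def __init__(self, first_row_address):
        self.first_values = list(first_row_address)   # first row address values
        self.mask = [True] * len(self.first_values)   # mask bits start in active status

    def record_failure(self, second_row_address):
        """Fold a newly detected failing row into the accumulated mask.

        Returns True if the group of array macros is determined to be bad.
        """
        matches = [a == b for a, b in zip(self.first_values, second_row_address)]

        if not any(matches) and all(self.mask):
            # No first row address value matches the second row's values while
            # every mask bit is still active: the macros are determined bad.
            return True

        if any(matches):
            # Keep mask bits for matching address values in active status and
            # set mask bits for non-matching values to inactive status.
            self.mask = [bit and match for bit, match in zip(self.mask, matches)]
        return False
```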
Abstract:
Disclosed herein are a system, non-transitory computer readable medium, and method for recovering from an abnormal failure of a program. Changes made by a plurality of threads of the program are undone in the reverse of the order in which the changes were made.
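A minimal sketch of reverse-order undo across threads, assuming each thread appends an undo callback to a shared, ordered log as it applies a change; the log structure and names are assumptions.

```python
import threading


class UndoLog:
    def __init__(self):
        self._lock = threading.Lock()
        self._entries = []                 # undo callables in the order changes were made

    def record(self, undo_fn):
        # Called by any thread immediately after it applies a change.
        with self._lock:
            self._entries.append(undo_fn)

    def recover(self):
        # After an abnormal failure, undo all recorded changes in the reverse
        # of the order in which they were made, regardless of which thread
        # made each change.
        with self._lock:
            for undo_fn in reversed(self._entries):
                undo_fn()
            self._entries.clear()
```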