Abstract:
Various embodiments are generally directed to techniques for handling errors affecting the at least partially parallel performance of data access commands between nodes of a storage cluster system. An apparatus may include a processor component of a first node, an access component to perform a command received from a client device via a network to alter client device data stored in a first storage device coupled to the first node, a replication component to transmit a replica of the command to a second node via the network to enable performance of the replica by the second node at least partially in parallel, an error component to retry transmission of the replica based on a failure indicated by the second node, and a status component to select a status indication to transmit to the client device based on the indication of failure and the results of retrying transmission of the replica.
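The retry-then-report flow described above can be sketched as follows. This is a minimal illustration, not the claimed apparatus: the names `retry_send` and `select_status`, the two-value `Status` enum, and the policy of reporting success only when both the local command and the replica succeeded are all assumptions made for the example.

```python
from enum import Enum
from typing import Callable


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


def retry_send(send: Callable[[], bool], max_retries: int = 3) -> bool:
    """Re-send the replica until send() reports success or retries run out."""
    for _ in range(max_retries):
        if send():
            return True
    return False


def select_status(local_ok: bool, first_send_ok: bool, retry_ok: bool) -> Status:
    """Pick the status reported to the client.

    Assumed policy: the replica counts as delivered if either the first
    transmission or a retry succeeded, and the client sees SUCCESS only
    when the local command and the replica both succeeded.
    """
    replica_ok = first_send_ok or retry_ok
    return Status.SUCCESS if local_ok and replica_ok else Status.FAILURE
```

A caller would invoke `retry_send` only after the second node indicates a failure, then pass the outcome to `select_status`.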
Abstract:
Machines, systems and methods for recovering data objects in a distributed data storage system, the method comprising storing one or more replicas of a first data object on one or more clusters in one or more data centers connected over a data communications network; recording health information about said one or more replicas, wherein the health information comprises data about availability of a replica to participate in a restoration process; calculating a query-priority for the first data object; querying, based on the calculated query-priority, the health information for the one or more replicas to determine which of the one or more replicas is available for restoration of the object data; calculating a restoration-priority for the first data object based on the health information for the one or more replicas; and restoring the first data object from the one or more of the available replicas, based on the calculated restoration-priority.
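One way to picture a restoration-priority derived from replica health is sketched below. The abstract does not say how the priority is calculated, so the rule used here (objects with fewer available replicas are restored first) and the names `available_count` and `restoration_order` are assumptions for illustration only.

```python
import heapq
from typing import Dict, List

# Health information: object name -> {replica id -> available for restoration?}
Health = Dict[str, Dict[str, bool]]


def available_count(replicas: Dict[str, bool]) -> int:
    """Number of replicas able to take part in a restoration."""
    return sum(1 for ok in replicas.values() if ok)


def restoration_order(health: Health) -> List[str]:
    """Return objects needing restoration, most at-risk first.

    Assumed priority rule: fewer available replicas means higher urgency.
    Objects with no available replica cannot be restored from replicas and
    are skipped, as are objects whose replicas are all healthy.
    """
    heap = []
    for name, replicas in health.items():
        available = available_count(replicas)
        if available == 0 or available == len(replicas):
            continue
        heapq.heappush(heap, (available, name))
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]
```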
Abstract:
Embodiments relate to data shuffling by logically rotating processing nodes. The nodes are logically arranged in a two or three dimensional matrix. Every time two of the nodes in adjacent rows of the matrix are positionally aligned, these adjacent nodes exchange data. The positional alignment is a logical alignment of the nodes. The nodes are logically arranged and rotated, and data is exchanged in response to the logical rotation.
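The logical rotation can be illustrated with two adjacent rows of equal length. In this sketch, which assumes a simple cyclic shift of one row per step (the abstract does not specify the rotation scheme), each step pairs the nodes that are currently aligned; after as many steps as there are nodes in a row, every node in one row has been aligned with every node in the other. The function name `rotation_schedule` is hypothetical.

```python
from typing import List, Tuple


def rotation_schedule(row_a: List[str], row_b: List[str]) -> List[List[Tuple[str, str]]]:
    """Return, for each rotation step, the pairs of aligned nodes.

    Assumes both rows have the same length and that row_b is shifted by
    one logical position per step while row_a stays fixed.
    """
    if len(row_a) != len(row_b):
        raise ValueError("rows must have the same number of nodes")
    n = len(row_a)
    steps = []
    for shift in range(n):
        # Node i of row_a is aligned with node (i + shift) mod n of row_b.
        steps.append([(row_a[i], row_b[(i + shift) % n]) for i in range(n)])
    return steps
```

Each pair in a step stands for one data exchange between two positionally aligned nodes.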
Abstract:
Several different embodiments of a segmented object storage system are described. The object storage system divides files into a number of object segments, each segment corresponding to a portion of the object, and stores each segment individually in the cloud storage system. The system also generates and stores a manifest file describing the relationship of the various segments to the original data file. Requests to retrieve the segmented file are fulfilled by consulting the manifest file and using the information from the manifest to reconstitute the original data file from the constituent segments. Modifying, appending to, or truncating the object is accomplished by manipulating individual segments and the manifest file. In further embodiments, manipulation of the individual object segments and/or the manifest is used to implement copy-on-write, snapshotting, software transactional memory, and peer-to-peer transmission of the large file.
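The segment-plus-manifest idea can be sketched in a few lines. This is an illustration under stated assumptions, not the described system: segments are keyed by their SHA-256 digest, the manifest is a plain ordered list of keys, and a dictionary stands in for the cloud store. The names `segment_object`, `reconstitute`, and `append_data` are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def segment_object(data: bytes, segment_size: int) -> Tuple[List[str], Dict[str, bytes]]:
    """Split data into segments; return (manifest, store)."""
    store: Dict[str, bytes] = {}
    manifest: List[str] = []
    for offset in range(0, len(data), segment_size):
        chunk = data[offset:offset + segment_size]
        key = hashlib.sha256(chunk).hexdigest()  # assumed content-addressed key
        store[key] = chunk
        manifest.append(key)
    return manifest, store


def reconstitute(manifest: List[str], store: Dict[str, bytes]) -> bytes:
    """Rebuild the original bytes by following the manifest in order."""
    return b"".join(store[key] for key in manifest)


def append_data(manifest: List[str], store: Dict[str, bytes],
                extra: bytes, segment_size: int = 4) -> List[str]:
    """Append by storing new segments and returning a new manifest.

    The old manifest is left untouched, so it still describes the object
    as it was before the append (a snapshot-like property).
    """
    new_keys, new_store = segment_object(extra, segment_size)
    store.update(new_store)
    return manifest + new_keys
```

Because existing segments are never rewritten, an append touches only the new segments and the manifest.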
Abstract:
One embodiment of the present invention provides a fault-management system. During operation, the system identifies a failure at a remote location associated with a communication service. The system then determines a local port used for the communication service, and suspends the local port, thereby allowing the failure to be detected by a device coupled to the local port.
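The sequence of identifying a remote failure, finding the local port, and suspending it might look like the sketch below. The lookup table from service to port, the string port states, and the function name `suspend_port_for_failure` are assumptions for the example; a real switch would act on its own port-management interface.

```python
from typing import Dict, Optional


def suspend_port_for_failure(failed_service: str,
                             service_to_port: Dict[str, str],
                             port_state: Dict[str, str]) -> Optional[str]:
    """Suspend the local port that carries the failed service.

    Returns the suspended port, or None if the service uses no local port.
    Suspending the port lets the device attached to it observe a link
    failure and react to the remote fault.
    """
    port = service_to_port.get(failed_service)
    if port is None:
        return None
    port_state[port] = "suspended"
    return port
```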
Abstract:
Systems and methods which provide mount catalogs to facilitate rapid volume mount are shown. A mount catalog of embodiments may be provided for each aggregate containing volumes to be mounted by a takeover node of a storage system. The mount catalog may comprise a direct storage level, such as a DBN level, based mount catalog. Such mount catalogs may be maintained in a reserved portion of the storage devices containing a corresponding aggregate and volumes, wherein the storage device reserved portion is known to a takeover node. In operation according to embodiments, an HA pair takeover node uses a mount catalog to access the blocks used to mount volumes of an HA pair partner node prior to a final determination that the partner node is in fact a failed node and prior to onlining the aggregate containing the volumes.
Abstract:
A method, non-transitory computer-readable medium, and apparatus that monitor a plurality of active storage controllers with a passive storage controller. Based on the monitoring, the passive storage controller determines when one of the active storage controllers has failed. Storage device(s) previously assigned to the failed active storage controller are remapped to the passive storage controller. The passive storage controller retrieves a transaction log associated with the failed active storage controller from a transaction log database and, once the failure is determined to have occurred, replays the transaction(s) in the transaction log.
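The monitor, remap, and replay steps can be sketched as a small class. Everything concrete here is an assumption made for illustration: failure detection by heartbeat timeout, the `(op, key, value)` log record shape, and the names `PassiveController`, `heartbeat`, `failed`, and `take_over` do not come from the abstract.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

LogEntry = Tuple[str, str, object]  # assumed shape: (op, key, value)


@dataclass
class PassiveController:
    log_db: Dict[str, List[LogEntry]]   # controller id -> transaction log
    device_map: Dict[str, str]          # device id -> owning controller id
    name: str = "passive"
    timeout: float = 3.0
    last_seen: Dict[str, float] = field(default_factory=dict)
    state: Dict[str, object] = field(default_factory=dict)

    def heartbeat(self, controller_id: str, now: float) -> None:
        """Record that an active controller was seen at time `now`."""
        self.last_seen[controller_id] = now

    def failed(self, now: float) -> List[str]:
        """Controllers whose last heartbeat is older than the timeout."""
        return [c for c, t in self.last_seen.items() if now - t > self.timeout]

    def take_over(self, controller_id: str) -> None:
        """Remap the failed controller's devices, then replay its log."""
        for device, owner in self.device_map.items():
            if owner == controller_id:
                self.device_map[device] = self.name
        for op, key, value in self.log_db.get(controller_id, []):
            if op == "put":
                self.state[key] = value
            elif op == "delete":
                self.state.pop(key, None)
```

A monitoring loop would call `failed(now)` periodically and `take_over` for each controller it returns.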
Abstract:
An intercluster repository synchronizer and method for synchronizing objects are disclosed. An example intercluster repository synchronizer includes an information processing system, including a processor, computer-readable medium, and network device. The intercluster repository synchronizer includes a structured information repository on the computer-readable medium. The structured information repository contains a plurality of records corresponding to a selected group of stored information objects. The intercluster repository synchronizer further includes a synchronization indicator that stores an address associated with a remote replication target. The intercluster repository synchronizer also includes a replicator, operable to send a message using the network device to the replication target responsive to changes in the structured information repository, and further operable to receive a message that a plurality of stored information objects have been duplicated at the remote replication target. The duplicated stored information objects are selected based on a shared metadata indicator stored in the structured information repository.
Abstract:
The present invention aims to provide a storage system capable of shortening the time to recover from a failure while ensuring data reliability when a failure occurs in a storage device. When a storage device fails, recovery processing corresponding to the content of the failure is executed for the blocked storage device. The storage device recovered by the recovery processing is then subjected to a check corresponding to the operation status of the storage system or the failure history of the storage device.
Abstract:
A printed circuit card (1) comprising a first connection interface (11) configured to manage a first interconnection (10) with said card (1), said first interconnection (10) including a plurality of links; a second connection interface (13) configured to manage a second interconnection (20) with said card (1); the first connection interface (11) being further configured to detect the occurrence of a breakdown in a link of the first interconnection (10); the second connection interface being further configured to share the information of the occurrence of the breakdown; to select a fallback solution from among a list of fallback solutions; to delete the selected fallback solution once it is applied; the processor being further configured to apply the selected fallback solution to the first interconnection; to reinitialize the first interconnection.
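The select, apply, and delete cycle for fallback solutions can be sketched as follows. The abstract does not say how a fallback is chosen, so this example assumes the first entry in the list is selected; the function name `handle_breakdown` and the fallback names in the usage note are hypothetical.

```python
from typing import Callable, List, Optional


def handle_breakdown(fallbacks: List[str],
                     apply: Callable[[str], None]) -> Optional[str]:
    """Apply one fallback solution after a link breakdown.

    Assumed selection rule: take the first remaining fallback. The chosen
    fallback is deleted from the list only after it has been applied, so
    it is not tried again on a later breakdown. Returns the applied
    fallback, or None if the list is exhausted.
    """
    if not fallbacks:
        return None
    chosen = fallbacks[0]
    apply(chosen)             # reconfigure the first interconnection
    fallbacks.remove(chosen)  # delete once applied
    return chosen
```

For example, with a list such as `["reduce_link_width", "reduce_speed"]`, two successive breakdowns would consume both entries in order, and a third call would return `None`. Reinitializing the interconnection would follow each applied fallback.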