Abstract:
Methods, non-transitory computer readable media, and computing devices are disclosed that receive data from a primary storage node. The data is stored in a primary volume within a primary composite aggregate hosted by the primary storage node. A determination is made as to whether the data is tagged to indicate that it is stored in the primary volume on a remote data storage device of the primary composite aggregate. When the data is so tagged, it is stored on another remote data storage device without being stored in a local data storage device. Accordingly, this technology allows data placement to remain consistent across primary and secondary volumes and facilitates efficient operation of secondary storage nodes by eliminating two-phase writes for data stored on cloud storage devices.
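The tag-driven placement described above can be illustrated with a minimal sketch. It assumes a hypothetical boolean tag `on_remote_tier` carried with each block and models the secondary aggregate's local and remote devices as plain dictionaries; none of these names come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    block_id: int
    payload: bytes
    on_remote_tier: bool  # hypothetical tag set by the primary storage node


@dataclass
class SecondaryAggregate:
    # Dictionaries stand in for the local and remote data storage devices.
    local: dict = field(default_factory=dict)
    remote: dict = field(default_factory=dict)

    def store(self, block: Block) -> str:
        """Place the block on the tier indicated by its tag."""
        if block.on_remote_tier:
            # Written straight to the remote device, never staged locally,
            # so there is no second write phase to move it later.
            self.remote[block.block_id] = block.payload
            return "remote"
        self.local[block.block_id] = block.payload
        return "local"
```

Under these assumptions, a tagged block never appears in `local`, which is the property that removes the two-phase write.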
Abstract:
With a forever incremental snapshot configuration and a typical caching policy (e.g., least recently used), a storage appliance may evict stable data blocks of an older snapshot, perhaps unchanged data blocks of the snapshot baseline. If stable data blocks have been evicted, restore of a recent snapshot will suffer the time penalty of downloading the stable blocks for restoring the recent snapshot. Creating synthetic baseline snapshots and refreshing eviction data of stable data blocks can avoid eviction of stable data blocks and reduce the risk of violating a recovery time objective.
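To make the eviction problem concrete, the following sketch models a least-recently-used block cache with an explicit refresh step. The class `BlockCache`, its `refresh` method, and the block names are illustrative assumptions, not the appliance's actual interface.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny LRU cache; the oldest entry is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._blocks = OrderedDict()

    def put(self, block_id, data):
        self._blocks[block_id] = data
        self._blocks.move_to_end(block_id)  # mark as most recently used
        while len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # evict least recently used

    def refresh(self, block_ids):
        """Refresh eviction data so stable blocks look recently used."""
        for block_id in block_ids:
            if block_id in self._blocks:
                self._blocks.move_to_end(block_id)

    def __contains__(self, block_id):
        return block_id in self._blocks


# Two baseline blocks followed by incremental snapshot blocks.
cache = BlockCache(capacity=3)
for name in ("base-0", "base-1", "incr-1"):
    cache.put(name, b"...")
cache.refresh(["base-0", "base-1"])  # done when a synthetic baseline is created
cache.put("incr-2", b"...")          # evicts "incr-1" instead of a baseline block
```

Without the `refresh` call, the final `put` would evict `base-0`, and a later restore would have to download it again.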
Abstract:
One or more techniques and/or computing devices are provided for resilient replication of storage operations. For example, a first storage controller may host first storage having a replication relationship with second storage hosted by a second storage controller. To improve resiliency against transient network issues of a network between the storage controllers, the first storage controller may implement a queue and retry mechanism to retry replication operations not acknowledged by the second storage controller within a threshold time. The second storage controller may maintain a cumulative sequence number of a latest replication operation performed in order, an operation response map of replication operations performed out of order, and an operation finder map identifying currently implemented replication operations, which may be used to process incoming replication operations. Single write semantics, write order consistency, and reduction of write amplification may be provided.
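The receiver-side bookkeeping can be sketched as follows. This is a simplified, single-threaded illustration: the class and field names are hypothetical, the operation finder map (which tracks operations still executing) is omitted because nothing runs concurrently here, and only duplicate suppression for retried operations is shown.

```python
class ReplicaReceiver:
    """Hypothetical second-controller state for handling retried operations."""

    def __init__(self):
        self.cumulative_seq = 0  # latest operation completed in order
        self.responses = {}      # seq -> result for operations done out of order
        self.storage = {}
        self.apply_count = 0     # how many writes actually reached storage

    def handle(self, seq, key, value):
        # A retry of an operation that already completed is acknowledged
        # again without being applied a second time (single write semantics).
        if seq <= self.cumulative_seq or seq in self.responses:
            return "duplicate-ack"
        self.storage[key] = value
        self.apply_count += 1
        self.responses[seq] = "ok"
        # Fold contiguous completions into the cumulative sequence number
        # so the response map only holds out-of-order operations.
        while self.cumulative_seq + 1 in self.responses:
            self.cumulative_seq += 1
            del self.responses[self.cumulative_seq]
        return "ack"


rx = ReplicaReceiver()
rx.handle(1, "a", b"1")  # in order: cumulative_seq becomes 1
rx.handle(3, "c", b"3")  # out of order: remembered in the response map
rx.handle(3, "c", b"3")  # retry after a lost acknowledgement: not reapplied
rx.handle(2, "b", b"2")  # fills the gap: cumulative_seq advances to 3
```

Keeping one cumulative number plus a small map of out-of-order completions bounds the state the receiver must retain, since every contiguous run of completed operations collapses into a single integer.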
Abstract:
One or more techniques and/or computing devices are provided for synchronous replication. For example, synchronous replication relationships are established between a first storage object (e.g., a file, a logical unit number (LUN), a consistency group, etc.), hosted by a first storage controller, and a plurality of replication storage objects hosted by other storage controllers. In this way, a write operation to the first storage object is implemented in parallel upon the first storage object and the replication storage objects in a synchronous manner, such as using a zero-copy operation to reduce overhead otherwise introduced by performing copy operations. Reconciliation is performed in response to a failure so that the first storage object and the replication storage objects comprise consistent data. Failed write operations and replication write operations are retried, while enforcing a single write semantic. Dependent write order consistency is enforced for dependent write operations, such as overlapping write operations.
Abstract:
One or more techniques and/or computing devices are provided for implementing synchronous replication. For example, a synchronous replication relationship may be established between a first storage controller hosting local storage and a second storage controller hosting remote storage (e.g., replication may be specified at a file, logical unit number (LUN), or any other level of granularity). Data operations and offloaded operations may be implemented in parallel upon the local storage and the remote storage. Error handling operations may be implemented upon the local storage and implemented in parallel, on a best-effort basis, upon the remote storage, and a reconciliation may be performed to identify any data divergence resulting from the best-effort parallel implementation. Storage area network (SAN) operations may be implemented upon the local storage and, upon local completion, may be remotely implemented upon the remote storage.
Abstract:
Disclosed are systems, computer-readable media, and methods for transforming data in a file system. As part of a recycling process, a determination is made that transformations should be attempted. A data block is determined to be in use by at least one user of the storage system. A determination is then made as to whether a transformation should be attempted on the data block. Parameters associated with the performance of the file system can be used in this determination. A type of transformation to be done is determined. The data block is transformed based upon the selected transformation. The transformed data block is written to the storage system. Because the transformation is performed as part of the recycling process, it requires no additional input/output requests.
Abstract:
A system, method, and machine-readable storage medium for recovering data in a distributed storage system are provided. In some embodiments, the method includes identifying a failing storage device of a first storage node having an inaccessible data segment. When it is determined that the inaccessible data segment cannot be recovered using a first data protection scheme, a first chunk of data associated with the inaccessible data segment is identified and a group associated with the first chunk of data is identified. A second chunk of data associated with the group is selectively retrieved from a second storage node such that data associated with an accessible data segment of the first storage node is not retrieved. The inaccessible data segment is recovered by recovering the first chunk of data using a second data protection scheme and the second chunk of data.
Abstract:
First partial baseline data of a first storage system is identified. First changed data of the first storage system is identified. The first changed data comprises data that has changed since a previous point in time. First backup data is written to a second storage system. The first backup data comprises the first partial baseline data and the first changed data. After writing the first backup data to the second storage system, second partial baseline data of the first storage system is identified. The second partial baseline data does not include the first partial baseline data. Second changed data of the first storage system is identified. The second changed data comprises data that has changed since writing the first backup data. Second backup data is written to the second storage system. The second backup data comprises the second partial baseline data and the second changed data.
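The sequence above amounts to spreading the baseline across successive backups. A minimal sketch, assuming blocks are identified by integers and that a fixed-size slice of the not-yet-sent baseline accompanies each backup (the function name and the `slice_size` parameter are illustrative assumptions):

```python
def plan_backup(all_blocks, changed, baseline_sent, slice_size):
    """Return the partial baseline and changed blocks for one backup.

    baseline_sent is updated in place so that the next call picks a
    partial baseline that excludes everything already written.
    """
    remaining = [b for b in sorted(all_blocks) if b not in baseline_sent]
    partial_baseline = remaining[:slice_size]
    baseline_sent.update(partial_baseline)
    return {"baseline": partial_baseline, "changed": sorted(changed)}


blocks = range(6)
sent = set()
first = plan_backup(blocks, changed={4}, baseline_sent=sent, slice_size=3)
second = plan_backup(blocks, changed={1}, baseline_sent=sent, slice_size=3)
```

Here `first` carries baseline blocks 0 to 2 plus changed block 4, and `second` carries baseline blocks 3 to 5 plus changed block 1, so the two partial baselines never overlap.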
Abstract:
Methods and systems for managing caching mechanisms in storage systems are provided where a global cache management function manages multiple independent cache pools and a global cache pool. As an example, the method includes: splitting a cache storage into a plurality of independently operating cache pools, each cache pool comprising storage space for storing a plurality of cache blocks for storing data related to an input/output ("I/O") request and metadata associated with each cache pool; receiving the I/O request for writing data; operating a hash function on the I/O request to assign the I/O request to one of the plurality of cache pools; and writing the data of the I/O request to one or more of the cache blocks associated with the assigned cache pool. In an aspect, this allows efficient I/O processing across multiple processors simultaneously.
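The pool-assignment step can be sketched as below. The abstract does not say which fields of the I/O request are hashed or which hash function is used; this example assumes a volume identifier plus logical block address as the key and uses CRC-32 purely for illustration.

```python
import zlib


class CachePool:
    """One independently operating pool with its own blocks and metadata."""

    def __init__(self, pool_id):
        self.pool_id = pool_id
        self.blocks = {}
        self.metadata = {"writes": 0}


class PooledCache:
    def __init__(self, num_pools):
        self.pools = [CachePool(i) for i in range(num_pools)]

    def pool_for(self, volume_id, lba):
        # Hash the request's address so the same block always maps
        # to the same pool (assumed key: volume id plus LBA).
        key = f"{volume_id}:{lba}".encode()
        return self.pools[zlib.crc32(key) % len(self.pools)]

    def write(self, volume_id, lba, data):
        pool = self.pool_for(volume_id, lba)
        pool.blocks[(volume_id, lba)] = data
        pool.metadata["writes"] += 1
        return pool.pool_id
```

Because each request touches exactly one pool, separate processors can service requests for different pools without sharing that pool's metadata.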
Abstract:
A configuration for a component of a primary node is synchronized with a configuration for a component of a partner node in a different cluster by replicating the primary node configuration to the partner node. A baseline configuration replication comprises a snapshot of a component configuration on the primary node. The baseline configuration can be generated by traversing the configuration objects, capturing their attributes, and encapsulating them in a package. The baseline package can then be transferred to the partner node. The configuration objects can be applied on the partner node in the order in which they were captured on the primary node. Attributes of the configuration objects that are to be transformed are identified. Values for the identified attributes are transformed from a name space in the primary node to a name space in the partner node.
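The capture, transfer, and apply steps can be sketched as follows, assuming configuration objects are flat dictionaries of string attributes and the name-space mapping is a simple lookup table. The function names, attribute names, and node names are all hypothetical.

```python
def capture_baseline(config_objects):
    """Snapshot each configuration object's attributes, preserving order."""
    return [dict(obj) for obj in config_objects]


def apply_baseline(package, name_map, transform_attrs):
    """Apply objects in capture order, translating selected attributes."""
    applied = []
    for obj in package:
        translated = {
            # Only attributes identified for transformation are mapped
            # from the primary name space to the partner name space.
            attr: name_map.get(value, value) if attr in transform_attrs else value
            for attr, value in obj.items()
        }
        applied.append(translated)
    return applied


primary = [
    {"type": "volume", "name": "vol1", "owner": "nodeA"},
    {"type": "share", "name": "share1", "path": "/vol1", "owner": "nodeA"},
]
package = capture_baseline(primary)
partner = apply_baseline(package, {"nodeA": "nodeB"}, {"owner"})
```

Applying objects in capture order matters in this example because the share refers to a volume that must already exist on the partner.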