Abstract:
A storage appliance arranges snapshot data and snapshot metadata into different structures, and arranges the snapshot metadata to facilitate efficient snapshot manipulation, which may be for snapshot management or snapshot restore. The storage appliance receives snapshots according to a forever incremental configuration and arranges snapshot metadata into different types of records. The storage appliance stores these records in key-value stores maintained for each defined data collection (e.g., volume). The storage appliance arranges the snapshot metadata into records for inode information, records for directory information, and records that map source descriptors of data blocks to snapshot file descriptors. The storage appliance uses a locally generated snapshot identifier as a key prefix for the records to conform to a sort constraint of the key-value store, which allows the efficiency of the key-value store to be leveraged. The snapshot metadata arrangement facilitates efficient snapshot restore, file restore, and snapshot reclamation.
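The key-prefix arrangement can be illustrated with a minimal sketch. It assumes a toy in-memory sorted store and tuple keys; the names `SortedKV`, `inode_key`, `dir_key`, and `map_key` are hypothetical and are not taken from the abstract. Because every record key starts with the local snapshot identifier, all metadata for one snapshot is contiguous in sort order and can be scanned or reclaimed with a single prefix operation.

```python
import bisect


class SortedKV:
    """Toy ordered key-value store (a stand-in for a real sorted store)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._vals = {}   # key -> value

    def put(self, key, value):
        if key not in self._vals:
            bisect.insort(self._keys, key)
        self._vals[key] = value

    def scan_prefix(self, prefix):
        """Return (key, value) pairs whose key tuple starts with prefix."""
        i = bisect.bisect_left(self._keys, prefix)
        out = []
        while i < len(self._keys) and self._keys[i][:len(prefix)] == prefix:
            out.append((self._keys[i], self._vals[self._keys[i]]))
            i += 1
        return out

    def delete_prefix(self, prefix):
        """Remove every record under prefix, e.g. to reclaim a snapshot."""
        for key, _ in self.scan_prefix(prefix):
            del self._vals[key]
            self._keys.remove(key)


# Hypothetical key layouts: the local snapshot ID always comes first.
def inode_key(snap_id, inode):
    return (snap_id, "inode", inode)


def dir_key(snap_id, parent_inode, name):
    return (snap_id, "dir", parent_inode, name)


def map_key(snap_id, inode, source_block):
    return (snap_id, "map", inode, source_block)


kv = SortedKV()
kv.put(inode_key(1, 64), {"size": 8192})
kv.put(dir_key(1, 2, "report.txt"), 64)
kv.put(map_key(1, 64, 1000), ("snapfile-1", 0))
kv.put(inode_key(2, 64), {"size": 12288})

snapshot_1_records = kv.scan_prefix((1,))
kv.delete_prefix((1,))
remaining = kv.scan_prefix((2,))
```

A production store would typically encode keys as byte strings rather than tuples; the tuple form is used here only to keep the ordering visible.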
Abstract:
With a forever incremental snapshot configuration and a typical caching policy (e.g., least recently used), a storage appliance may evict stable data blocks of an older snapshot, perhaps unchanged data blocks of the snapshot baseline. If stable data blocks have been evicted, restoring a recent snapshot incurs the time penalty of downloading those blocks. Creating synthetic baseline snapshots and refreshing eviction data of stable data blocks can avoid eviction of stable data blocks and reduce the risk of violating a recovery time objective.
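The effect of refreshing eviction data can be sketched with a small least-recently-used cache. This is an illustration under stated assumptions, not the appliance's implementation: the class `BlockCache`, its `refresh` method, and the block names are hypothetical. Refreshing moves stable blocks to the most recently used position so that newer, less stable blocks are evicted first.

```python
from collections import OrderedDict


class BlockCache:
    """Minimal LRU cache of data blocks; the oldest entry is evicted first."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._blocks = OrderedDict()  # least recently used entry is first

    def put(self, block_id, data):
        self._blocks[block_id] = data
        self._blocks.move_to_end(block_id)
        while len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # evict least recently used

    def refresh(self, block_ids):
        """Mark still-valid (stable) blocks as recently used."""
        for block_id in block_ids:
            if block_id in self._blocks:
                self._blocks.move_to_end(block_id)

    def __contains__(self, block_id):
        return block_id in self._blocks


cache = BlockCache(capacity=3)
cache.put("base-0", b"unchanged baseline block")
cache.put("s1-a", b"block written in snapshot 1")
cache.put("s2-a", b"block written in snapshot 2")

# The baseline block is still referenced by the latest snapshot, so refresh it.
cache.refresh(["base-0"])

# The next insert evicts "s1-a" instead of the stable baseline block.
cache.put("s3-a", b"block written in snapshot 3")
```

Without the `refresh` call, `base-0` would be the oldest entry and would be evicted by the final `put`.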
Abstract:
A method, non-transitory computer readable medium, and device that assist with managing cloud storage include identifying a portion of data in a data unit that is identified for deletion in the metadata. The identified portion is compared to a threshold amount. Deletion of the data unit from a first storage object is deferred when the portion identified for deletion is less than the threshold amount. A second storage object containing the portion of data in the data unit that is not marked for deletion is generated when the portion marked for deletion is equal to the threshold amount, wherein the second storage object has the same identifier as the first storage object.
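The threshold decision can be sketched as follows. The function `process_deletions`, the dictionary-based object store, and the use of a block-count fraction as the "portion" are assumptions made for illustration. The abstract only names the less-than and equal-to cases; treating "greater than" the same as "equal to" is a further assumption here.

```python
def process_deletions(store, object_id, deleted_offsets, threshold):
    """Defer deletion or rewrite the object under the same identifier.

    store:           dict mapping object_id -> {offset: data}
    deleted_offsets: set of offsets that metadata marks for deletion
    threshold:       fraction of the object's blocks (0.0 to 1.0)
    """
    blocks = store[object_id]
    if not blocks:
        return "deferred"

    marked = sum(1 for offset in blocks if offset in deleted_offsets)
    fraction = marked / len(blocks)

    if fraction < threshold:
        # Too little garbage to justify rewriting the object yet.
        return "deferred"

    # Assumption: at or above the threshold, build a second object holding
    # only the data not marked for deletion, and store it under the same ID.
    store[object_id] = {
        offset: data
        for offset, data in blocks.items()
        if offset not in deleted_offsets
    }
    return "rewritten"


store = {"obj-1": {0: b"a", 1: b"b", 2: b"c", 3: b"d"}}

first = process_deletions(store, "obj-1", {0}, threshold=0.5)
size_after_first = len(store["obj-1"])

second = process_deletions(store, "obj-1", {0, 1}, threshold=0.5)
```

Deferring small deletions avoids rewriting a cloud object for every freed block, at the cost of temporarily retaining dead data.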
Abstract:
A storage appliance can be designed to facilitate efficient restore of multiple backed up files in a system that allows files to share data blocks. A data management application or storage OS names data blocks and communicates those names to the storage appliance when backing up to or through the storage appliance. The storage appliance can leverage the data block names when restoring a group of files by restoring at data block granularity instead of file granularity. Restoring at the granularity of the data blocks by their names allows the storage appliance to avoid repeatedly sending the same data block to the restore requestor (e.g., a storage OS or data management application) while still instructing the restore requestor how to reconstruct the corresponding file(s) with mappings between valid data ranges and the named data blocks.
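Restoring at block granularity can be sketched as a planning step that separates "which named blocks to send" from "how each file is reassembled". The function `plan_restore`, the extent tuple layout, and the block names are hypothetical and are used only to illustrate the idea.

```python
def plan_restore(files):
    """Plan a multi-file restore that sends each named block only once.

    files: {file_name: [(file_offset, length, block_name), ...]}
    Returns (blocks_to_send, layout) where layout tells the requestor how to
    rebuild each file from the named blocks.
    """
    blocks_to_send = []
    seen = set()
    layout = {}
    for file_name, extents in files.items():
        layout[file_name] = list(extents)
        for _offset, _length, block_name in extents:
            if block_name not in seen:
                seen.add(block_name)
                blocks_to_send.append(block_name)
    return blocks_to_send, layout


files = {
    "a.txt": [(0, 4096, "blk-1"), (4096, 4096, "blk-7")],
    "b.txt": [(0, 4096, "blk-7"), (4096, 4096, "blk-2")],  # shares blk-7
}

blocks_to_send, layout = plan_restore(files)
```

Here four file extents reference only three distinct blocks, so the shared block `blk-7` is transferred once while both files still receive a mapping that references it.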
Abstract:
Embodiments address the problem of disk fragmentation by using the heuristics of write operations to assign block sizes. As write requests are received, a storage system may register a size of the write request. Using the registered sizes, the storage system may identify one or more clusters of sizes at which write requests are particularly prevalent. The storage system may calculate a distribution or variance for block sizes centered on each cluster. The distribution or variance may be used to distribute the block sizes such that the block sizes change by a small amount in the vicinity of the cluster, and by a larger amount as the blocks move away from the center of the cluster. When it comes time to allocate new blocks, the clusters and distribution may be consulted to determine what sizes of blocks to allocate, and how many blocks of each size.
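A simplified sketch of the idea follows. It makes two loud assumptions that the abstract does not specify: clusters are approximated by the most frequently observed exact write sizes, and the spacing between candidate block sizes doubles with each step away from the cluster center. The functions `find_clusters` and `block_sizes_around` and their parameters are hypothetical.

```python
from collections import Counter


def find_clusters(write_sizes, top_n=2):
    """Approximate clusters as the most frequently registered write sizes."""
    return [size for size, _count in Counter(write_sizes).most_common(top_n)]


def block_sizes_around(center, base_step, levels):
    """Candidate block sizes: finely spaced near center, coarser farther out.

    Offsets from the center are base_step * (2**k - 1), so the gap between
    neighbouring sizes doubles at each level (an assumed distribution).
    """
    sizes = {center}
    for k in range(1, levels + 1):
        offset = base_step * (2 ** k - 1)
        sizes.add(center + offset)
        if center - offset > 0:
            sizes.add(center - offset)
    return sorted(sizes)


# Registered write request sizes in bytes (made-up sample for illustration).
writes = [4096] * 5 + [65536] * 3 + [8192]

clusters = find_clusters(writes)
sizes_near_4k = block_sizes_around(4096, base_step=512, levels=3)
```

In this sample the gaps around the 4096-byte cluster are 512 bytes next to the center, then 1024, then 2048, which matches the described pattern of small changes near a cluster and larger changes farther away. Deciding how many blocks of each size to allocate is left out of the sketch.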