Abstract:
Opportunistic repair of fragmentation in a synthetic backup is disclosed. In various embodiments, data generated to perform processing other than fragmentation repair is received. At least a portion of the received data is used to compute a locality measure with respect to a group of segments comprising a portion of a file. A decision whether to repair fragmentation of segments comprising the group is made based at least in part on the computed locality measure.
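As a rough illustration of the decision described above, the sketch below assumes the "data generated to perform processing other than fragmentation repair" is a list of container IDs already looked up while assembling the synthetic backup. The locality measure used here (the count of distinct containers touched by a segment group) and the names `locality_measure`, `should_repair`, and `max_containers` are assumptions for illustration, not details taken from the abstract.

```python
def locality_measure(container_ids):
    """Hypothetical measure: number of distinct containers a segment group touches."""
    return len(set(container_ids))


def should_repair(container_ids, max_containers=4):
    """Repair when the group is spread over more containers than tolerated.

    `max_containers` is an assumed, fixed threshold chosen for this sketch.
    """
    return locality_measure(container_ids) > max_containers


# Container IDs reused from lookups that were performed for another purpose.
group = [7, 7, 9, 12, 15, 21]
needs_repair = should_repair(group)
```

A lower count means the group's segments sit close together on disk, so a read of that portion of the file needs fewer container fetches.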
Abstract:
Selective repair of fragmentation in a synthetic backup, based at least in part on a dynamically determined repair criterion, is disclosed. In various embodiments, a locality measure is computed with respect to a group of segments comprising a portion of a file. The computed locality measure is compared to a fragmentation repair criterion that is at least partly dynamically determined, and a repair decision is made based at least in part on the comparison.
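One way to picture a dynamically determined criterion is a threshold that moves with current system load. The sketch below is only an assumption about what "dynamic" might mean: the abstract does not say which inputs drive the criterion, and the names `dynamic_threshold`, `repair_decision`, `load`, and `max_extra` are hypothetical.

```python
def dynamic_threshold(base_threshold, load, max_extra=8):
    """Tolerate more fragmentation as load (clamped to 0.0..1.0) rises.

    Assumption for this sketch: repair is deferred when the system is busy.
    """
    load = min(max(load, 0.0), 1.0)
    return base_threshold + int(load * max_extra)


def repair_decision(distinct_containers, base_threshold, load):
    """Compare the locality measure against the current threshold."""
    return distinct_containers > dynamic_threshold(base_threshold, load)
```

With a base threshold of 4, a group spread over 6 containers would be repaired on an idle system (`load=0.0`) but left alone at half load (`load=0.5`), where the threshold rises to 8.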
Abstract:
An indication is received that a data object is to be deleted, wherein the data object comprises data stored in a segment within a container. It is determined that no currently alive data object references any segment within the container. The container is placed in a delete-ready but not yet reclaimable state.
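The state transition described above can be sketched as follows. The in-memory dictionaries, the `ContainerState` names, and the `delete_object` helper are assumptions made for illustration; the abstract does not describe how references are tracked or when a delete-ready container later becomes reclaimable.

```python
from enum import Enum


class ContainerState(Enum):
    ACTIVE = "active"
    DELETE_READY = "delete-ready"   # no live references, not yet reclaimable
    RECLAIMABLE = "reclaimable"     # later transition, not modeled here


def delete_object(obj, live_objects, containers, states):
    """Delete `obj` and mark containers that no live object references.

    live_objects: object id -> set of segment ids it references
    containers:   container id -> set of segment ids it stores
    states:       container id -> ContainerState (updated in place)
    """
    segments = live_objects.pop(obj)
    still_referenced = set().union(*live_objects.values())
    for cid, stored in containers.items():
        held_deleted_data = bool(stored & segments)
        if held_deleted_data and not (stored & still_referenced):
            states[cid] = ContainerState.DELETE_READY
```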
Abstract:
Techniques for sanitizing a storage system are described herein. In one embodiment, for each fingerprint representing a data chunk stored in a first container of the storage system, a lookup operation is performed in a live bit vector based on the fingerprint to determine whether the corresponding data chunk is live. A bit corresponding to the data chunk in a copy bit vector (CBV) is populated based on the lookup operation. After all of the bits corresponding to the data chunks of the first container have been populated in the CBV, the data chunks represented by the CBV are copied from the first container to a second container, and records of the data chunks in the first container are erased.
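The copy-forward step above can be sketched as below. The container is modeled as a list of `(fingerprint, chunk)` pairs, and a plain dictionary `bit_index` stands in for whatever function maps a fingerprint to its position in the live bit vector; both are assumptions for illustration, as is the helper name `sanitize_container`.

```python
def sanitize_container(first_container, live_bits, bit_index):
    """Build a copy bit vector, copy live chunks forward, erase the source.

    first_container: list of (fingerprint, chunk) pairs, cleared in place
    live_bits:       list of booleans, the live bit vector
    bit_index:       fingerprint -> index into live_bits (assumed mapping)
    """
    cbv = [False] * len(first_container)
    # Pass 1: populate one CBV bit per chunk from the live bit vector.
    for pos, (fingerprint, _chunk) in enumerate(first_container):
        idx = bit_index.get(fingerprint)
        cbv[pos] = idx is not None and bool(live_bits[idx])
    # Pass 2: only after the CBV is complete, copy the marked chunks.
    second_container = [
        entry for entry, keep in zip(first_container, cbv) if keep
    ]
    first_container.clear()  # erase the records in the first container
    return cbv, second_container
```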
Abstract:
A computer-implemented method is disclosed. The method starts with determining that a first container of a storage system is invalid. The method continues with the storage system setting a data recovery state for the first container to en-queue, which indicates that at least one data segment needs to be recovered from the first container, and executing a process that, for each container having an en-queue data recovery state, recovers any valid data segments from that container. The process includes scanning the data segments of the first container to find valid data segments, moving or replicating the valid data segments to a second container, and setting the data recovery state for the first container to complete once all the valid data segments have been moved or replicated to the second container.
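The recovery flow above amounts to a small state machine per container. In the sketch below, the state strings other than "en-queue" and "complete", the `is_valid` callback, and the helper names are assumptions for illustration; the abstract does not say how validity of a segment is checked.

```python
def mark_invalid(container_id, states):
    """Flag an invalid container so the recovery process will pick it up."""
    states[container_id] = "en-queue"


def run_recovery(containers, states, is_valid, second_container):
    """Recover valid segments from every container in the en-queue state.

    containers:       container id -> list of segments
    states:           container id -> recovery state (updated in place)
    is_valid:         callable deciding whether a segment is still usable
    second_container: list that receives the recovered segments
    """
    for cid, state in list(states.items()):
        if state != "en-queue":
            continue
        for segment in containers[cid]:      # scan the container
            if is_valid(segment):
                second_container.append(segment)
        states[cid] = "complete"             # set only after the full scan
```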
Abstract:
A garbage collector of a storage system traverses a namespace of a file system of the storage system in a breadth-first manner to identify segments that are alive. The namespace includes information identifying files that are represented by segments arranged in a plurality of levels in a hierarchy, where an upper level segment includes one or more references to one or more lower level segments, and at least one segment is referenced by multiple files. All live segments of an upper level are identified before any live segments of a lower level are identified. Once all live segments of all levels have been identified, the live segments are copied from their original storage locations to a new storage location, and the storage space associated with the original storage locations is reclaimed.
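The level-by-level enumeration above can be sketched as a breadth-first walk over segment references. The `references` dictionary, the helper names, and the treatment of "level" as distance from a file's top segment are assumptions for illustration; a real system would read references from on-disk metadata segments.

```python
def find_live_segments(file_roots, references):
    """Return live segments grouped by level, top level first.

    file_roots: top-level segment ids, one per file in the namespace
    references: segment id -> list of lower-level segment ids it points to
    """
    seen = set(file_roots)
    level = set(file_roots)
    levels = []
    while level:
        levels.append(level)
        next_level = set()
        for segment in level:
            for child in references.get(segment, ()):
                if child not in seen:        # shared segments are visited once
                    seen.add(child)
                    next_level.add(child)
        level = next_level                   # descend only after the level is done
    return levels


def copy_forward(storage, file_roots, references):
    """Copy live segments to a new location; everything else is reclaimable."""
    live = set().union(*find_live_segments(file_roots, references))
    return {seg: data for seg, data in storage.items() if seg in live}
```

Finishing each level before descending means a segment shared by several files is enumerated once, no matter how many parents reference it.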
Abstract:
Stream locality delta compression is disclosed. A previous stream-indicated locale of data segments is selected. A first data segment is then determined to be similar to a data segment in the stream-indicated locale.
Abstract:
In one embodiment, in response to a request received from a client for retrieving a data object stored in a storage system, a root key is obtained from the request. The data object is represented by metadata in a hierarchical structure having a plurality of levels. Each level includes a plurality of nodes, and each node is one of a root node, a leaf node, and an intermediate node. The hierarchical structure of metadata associated with the data object is traversed in a top-down approach to decrypt each of the nodes in the hierarchical structure using a key provided by its parent node, starting from the root node and proceeding to the leaf nodes, including decrypting the root node using the root key. Decrypted data associated with the nodes is transmitted to the client.
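The top-down key chain above can be sketched with a toy cipher. Everything in this example is an assumption for illustration: `xor_cipher` is a deliberately simple stand-in that is NOT secure, and the node layout (an encrypted JSON blob holding a payload plus the keys of its children) is hypothetical rather than the layout the abstract has in mind.

```python
import hashlib
import json


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (NOT secure): XOR with a SHA-256 keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_node(key, payload, children):
    """children: list of (child_key, already_encrypted_child_node) pairs."""
    plain = json.dumps(
        {"payload": payload, "child_keys": [k.hex() for k, _ in children]}
    ).encode()
    return {"blob": xor_cipher(key, plain), "children": [c for _, c in children]}


def decrypt_tree(node, key):
    """Decrypt a node with its parent's key, then descend with the child keys."""
    plain = json.loads(xor_cipher(key, node["blob"]))
    out = [plain["payload"]]
    for child_key, child in zip(plain["child_keys"], node["children"]):
        out.extend(decrypt_tree(child, bytes.fromhex(child_key)))
    return out
```

Because each child key is stored inside its parent's encrypted blob, holding the root key alone is enough to unlock the whole structure, and no node below the root can be read without first decrypting its parent.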