Abstract:
Exemplary methods, apparatuses, and systems include a file system process obtaining locks on a first node and a second node in a tree structure, with the second node being a child node of the first node. The file system process determines a quantity of child nodes of the second. While holding the locks on the first and second nodes, the file system determines whether to proactively split or merge the second node. In response to determining that the quantity of child nodes is within a first range, the file system process splits the second node. If the file system process determines that the quantity of child nodes is within a second range, the file system process merges the second node.
Abstract:
A buffer tree structure includes, at each internal node, a buffer having a compacted portion and an uncompacted portion. Insertion of data elements to the buffer tree can occur units called packets. A packet is initially stored in the uncompacted portion of a receiving node's buffer. When a compaction trigger condition exists, packet compaction is performed including a data element compaction operation. A buffer-emptying (flush) operation pushes the compacted packets to children nodes.
Abstract:
Techniques for enabling deduplication-aware load balancing in a distributed storage system are provided. In one set of embodiments, a node of the distributed storage system can receive an I/O (Input/Output) request pertaining to a data block of a storage object stored on a local storage component of the node. The node can further determine whether the I/O request requires insertion of a new entry into a deduplication hash table associated with the local storage component or deletion of an existing entry from the deduplication hash table. If the I/O request requires insertion of a new hash table entry, the node can add an identifier of the data block into a probabilistic data structure associated with the local storage component, where the probabilistic data structure is configured to maintain information regarding distinct data blocks that are likely present in the local storage component. Alternatively, if the I/O request requires deletion of an existing hash table entry, the node can remove the identifier of the data block from the probabilistic data structure.
Abstract:
In accordance with the present disclosure, files may be deduplicated in a distributed storage system having a plurality of storage volumes. A uniqueness metric for each file may indicate a degree of deduplication of the respective data files in the given storage volume. The uniqueness metric may be used to identify files for rebalancing in the distributed storage system. The uniqueness metric may be efficiently calculated with enough accuracy using a sampling methodology.
Abstract:
Techniques for enabling deduplication-aware load balancing in a distributed storage system are provided. In one set of embodiments, a node of the distributed storage system can receive an I/O (Input/Output) request pertaining to a data block of a storage object stored on a local storage component of the node. The node can further determine whether the I/O request requires insertion of a new entry into a deduplication hash table associated with the local storage component or deletion of an existing entry from the deduplication hash table. If the I/O request requires insertion of a new hash table entry, the node can add an identifier of the data block into a probabilistic data structure associated with the local storage component, where the probabilistic data structure is configured to maintain information regarding distinct data blocks that are likely present in the local storage component. Alternatively, if the I/O request requires deletion of an existing hash table entry, the node can remove the identifier of the data block from the probabilistic data structure.
Abstract:
Embodiments described herein involve improved management of snapshots of a file system. Embodiments include copying a first root node of a first snapshot to a second snapshot, the second snapshot referencing other nodes of the first snapshot. Embodiments further include incrementing reference counts of the other nodes of the first snapshot. Embodiments further include adding a storage address of the first root node to a list. Embodiments further include, each time that a copy on write operation is performed for a node of the other nodes, adding a storage address of the node to the list and decrementing the reference count of the node. Embodiments further include iterating through the list and, for each storage address in the list, decrementing the reference count of the node corresponding to the storage address and, if the reference count of the node reaches zero, freeing storage space at the storage address.
Abstract:
The subject matter described herein is generally directed towards tracking and recovering a disk allocation state. An on-disk log of operations is maintained to describe operations performed to an in-memory partial reference count map. Upon a crash of a host computing device during a checkpoint operation to an on-disk complete reference count map, the on-disk log of operations is used to undo and then redo the operations, or just redo the operations. In this manner, a disk allocation state prior to the crash is recreated in the on-disk complete reference count map with atomicity and crash consistency.
Abstract:
Techniques for implementing data deduplication in conjunction with thick and thin provisioning of storage objects are provided. In one embodiment, a system can receive a write request directed to a storage object stored by the system and can determine whether the storage object is a thin or thick object. If the storage object is a thin object, the system can calculate a usage value by adding a total amount of physical storage space used in the system to a total amount of storage space reserved for thick storage objects in the system and further subtracting a total amount of reserved storage space for the thick storage objects that are filled with unique data. The system can then reject the write request if the usage value is not less than the total storage capacity of the system.
Abstract:
A deduplication storage system with snapshot and clone capability includes storing logical pointer objects and organizing a first set of the logical pointer objects into a hierarchical structure. A second set of the logical pointer objects may be associated with corresponding logical data blocks of a client data object. The second set of the logical pointer objects may point to physical data blocks having deduplicated data that comprise data of the corresponding logical data blocks. Some of the logical pointer objects in the first set may point to the logical pointer objects in the second set, so that the hierarchical structure represents the client data object. A root of the hierarchical structure may be associated with the client data object. A snapshot or clone may be created by making a copy of the root and associating the copied root with the snapshot or clone.
Abstract:
A method includes obtaining a plurality of representations corresponding respectively to a plurality of blocks of data stored on a source node. A plurality of data pairs are sent to a destination node, where each data pair includes a logical address associated with a block of data from the plurality of blocks of data and the corresponding representation of the block of data. A determination is made whether the blocks of data associated with the respective logical addresses are duplicates of data stored on the destination node. In accordance with an affirmative determination, a reference to a physical address of the block of data stored on the destination node is stored. In accordance with a negative determination, an indication that the data corresponding to the respective logical address is not a duplicate is stored. The data indicated as not being a duplicate is written to the destination node.