Abstract:
Change tracking for multiphase deduplication. In one example embodiment, a method of tracking changes to a source storage for multiphase deduplication includes a change tracking phase. The change tracking phase includes performing a hash function on each allocated block in a source storage that is changed between a prior point in time and a subsequent point in time, and tracking, in a change log, the location in the source storage of each changed block and the corresponding hash value. The hash function calculates a hash value corresponding to the changed block.
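The change tracking phase described above can be illustrated with a minimal sketch. The function name `track_changes`, the list-of-blocks representation of the storage, and the choice of SHA-256 are all assumptions made for illustration; the abstract only specifies "a hash function" and "a change log".

```python
import hashlib


def track_changes(prior, current, allocated):
    """Return a change log mapping block location -> hash of the changed block.

    prior / current: lists of bytes objects, one per block position, holding
    the storage contents at the prior and subsequent points in time.
    allocated: set of block positions that are allocated in `current`.
    """
    change_log = {}
    for location in sorted(allocated):
        block = current[location]
        if block != prior[location]:
            # SHA-256 is an assumption; the abstract does not name a hash.
            change_log[location] = hashlib.sha256(block).hexdigest()
    return change_log


# Hypothetical four-block storage; block 3 changes but is not allocated.
prior = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
current = [b"aaaa", b"XXXX", b"cccc", b"YYYY"]
log = track_changes(prior, current, allocated={0, 1, 2})
```

Only block 1 ends up in the log here: block 3 also changed, but it is unallocated and so is skipped.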
Abstract:
Hash collision recovery in a deduplication vault. In one example embodiment, a method for hash collision recovery in a deduplication vault includes creating first parity data for all unique blocks of a source storage at a point in time. The first parity data includes both the unique blocks and an order of block positions of the unique blocks as stored in the source storage. Next, a hash value is generated for each of the unique blocks. Then, a backup is stored in a deduplication vault including each of the unique blocks together with its corresponding hash value. Next, second parity data is created for all of the unique blocks of the backup. Then, the first parity data is compared to the second parity data to determine whether one or more hash collisions occurred resulting in one or more missing unique blocks.
Abstract:
Change tracking for multiphase deduplication. In one example embodiment, a method of tracking changes to a source storage of a source system for multiphase deduplication includes a change tracking phase that includes performing various steps for only allocated blocks in the source storage that are changed between a prior point in time and a subsequent point in time. These steps include temporarily storing a copy of the changed block in a volatile memory of the source system prior to writing the changed block to the source storage, performing a hash function only once on the copy of the changed block, while the copy is temporarily stored in the volatile memory of the source system, to calculate a hash value, writing the changed block to the source storage, and tracking, in a change log, a location in the source storage of the changed block and the corresponding hash value.
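The ordering of steps above (buffer in memory, hash once, write, log) can be sketched as follows. The class `TrackedStorage`, its attribute names, and the use of SHA-256 are hypothetical, and the sketch treats every write as a change rather than comparing against the previous contents.

```python
import hashlib


class TrackedStorage:
    """Toy source storage that hashes each written block once, while the
    block is still buffered in memory, before writing it to storage."""

    def __init__(self, num_blocks, block_size=4):
        self.blocks = [bytes(block_size)] * num_blocks
        self.change_log = {}   # location -> hash value
        self.hash_calls = 0    # counts how often the hash function runs

    def write_block(self, location, data):
        buffered = bytes(data)  # copy held in volatile memory
        # Hash the in-memory copy once (SHA-256 is an assumption).
        digest = hashlib.sha256(buffered).hexdigest()
        self.hash_calls += 1
        self.blocks[location] = buffered    # write to the source storage
        self.change_log[location] = digest  # track location and hash


storage = TrackedStorage(num_blocks=4)
storage.write_block(2, b"abcd")
```

Because the hash is computed from the buffered copy, the block never has to be read back from the source storage just to be hashed.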
Abstract:
Local seeding of a restore storage for restoring a backup from a remote deduplication vault storage. In one example embodiment, a method of local seeding of a restore storage for restoring a backup from a remote deduplication vault storage includes determining which blocks included in a backup of a source storage at a point in time, which is stored in the remote vault storage, are available in a local seeded storage containing common blocks, reading the locally available blocks from the local seeded storage, reading the non-locally available blocks from the remote vault storage, and storing the read blocks in the restore storage in the same position as stored in the source storage at the point in time. The remote vault storage is remote from the restore storage and the local seeded storage is local to the restore storage.
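A minimal sketch of the restore flow described above, assuming the backup is represented as a list of block hashes indexed by source position and that both the local seeded storage and the remote vault storage can be modeled as hash-to-block dictionaries. The function name `restore_backup` and these representations are illustrative assumptions, not details from the abstract.

```python
def restore_backup(backup_layout, local_seeded, remote_vault):
    """Rebuild a restore storage from a backup description.

    backup_layout: list where the index is the block position in the source
                   storage and the value is the hash of the block stored
                   there at the point in time (None means unallocated).
    local_seeded / remote_vault: dicts mapping hash -> block bytes.
    Returns (restore_storage, remote_reads).
    """
    restore_storage = [None] * len(backup_layout)
    remote_reads = 0
    for position, block_hash in enumerate(backup_layout):
        if block_hash is None:
            continue
        if block_hash in local_seeded:
            block = local_seeded[block_hash]   # locally available block
        else:
            block = remote_vault[block_hash]   # fall back to the remote vault
            remote_reads += 1
        restore_storage[position] = block      # same position as in the source
    return restore_storage, remote_reads


# Hypothetical data: "h1" and "h2" are common blocks seeded locally.
remote_vault = {"h1": b"os-1", "h2": b"os-2", "h3": b"user"}
local_seeded = {"h1": b"os-1", "h2": b"os-2"}
restored, remote_reads = restore_backup(["h2", None, "h3", "h1"],
                                        local_seeded, remote_vault)
```

In this example only one of the three allocated blocks has to cross the network, which is the benefit the local seeded storage is meant to provide.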
Abstract:
Defragmentation during multiphase deduplication. In one example embodiment, a method of defragmentation during multiphase deduplication includes an analysis phase that includes analyzing each allocated block stored in a source storage at a point in time to determine if the block is duplicated in a vault storage, a defragmentation phase that includes reordering the duplicate blocks stored in the source storage to match the order of the duplicate blocks as stored in the vault storage, and a backup phase that is performed after completion of the defragmentation phase and that includes storing, in the vault storage, each unique nonduplicate block from the source storage.
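The three phases above can be sketched in a few lines, assuming the source and vault storages are plain Python lists of blocks and that duplicates are detected by SHA-256 hash. The function name `multiphase_backup` is hypothetical, and a real defragmentation would also have to update file system metadata, which this sketch ignores.

```python
import hashlib


def block_hash(block):
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(block).hexdigest()


def multiphase_backup(source, vault):
    """Mutates `source` (list of blocks) and `vault` (list of blocks)."""
    # Analysis phase: find source blocks that are duplicated in the vault.
    vault_order = {block_hash(b): i for i, b in enumerate(vault)}
    dup_positions = [i for i, b in enumerate(source)
                     if block_hash(b) in vault_order]

    # Defragmentation phase: reuse the same positions, but place the
    # duplicate blocks in the order in which they appear in the vault.
    dup_blocks = sorted((source[i] for i in dup_positions),
                        key=lambda b: vault_order[block_hash(b)])
    for position, block in zip(dup_positions, dup_blocks):
        source[position] = block

    # Backup phase: store each unique, nonduplicate block in the vault.
    for block in source:
        digest = block_hash(block)
        if digest not in vault_order:
            vault_order[digest] = len(vault)
            vault.append(block)


source = [b"C", b"x", b"A", b"B"]
vault = [b"A", b"B", b"C"]
multiphase_backup(source, vault)
```

After the call, the duplicate blocks in the source appear in the vault's order (A, B, C), while the unique block `x` keeps its position and is appended to the vault.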
Abstract:
Avoiding encryption in a deduplication vault. In one example embodiment, a method for avoiding encryption during a backup of a source storage into a deduplication storage may include analyzing an allocated plain text block stored in the source storage at a point in time to determine if the allocated plain text block is already duplicated in the deduplication storage and, in response to the allocated plain text block already being duplicated in the deduplication storage, skipping encryption of the allocated plain text block and instead associating the location of the allocated plain text block in the source storage with the location of the duplicate block in the deduplication storage.
Abstract:
Avoiding encryption in a deduplication vault. In one example embodiment, a method may include analyzing a first allocated plain text block stored in a source storage to determine if the block is already stored in a deduplication storage; in response to the first block not being stored, encrypting the first block and analyzing the encrypted block to determine if the encrypted block is already stored in the deduplication storage; analyzing a second allocated plain text block stored in the source storage to determine if the block is already stored in the deduplication storage; and, in response to the second block already being stored, skipping encryption of the second block and instead associating the location of the second block in the source storage with the location of the duplicate block already stored in the deduplication storage.
Abstract:
Avoiding encryption of certain blocks in a deduplication vault. In one example embodiment, a method of avoiding encryption of certain blocks during a backup of a source storage into a deduplication vault storage may include analyzing each allocated plain text block stored in a source storage at a point in time to determine if the allocated plain text block is already stored in the deduplication vault storage. If the allocated plain text block is not stored in the deduplication vault storage, the block may be encrypted and the encrypted block may be analyzed to determine if the encrypted block is already stored in the deduplication vault storage. If neither the allocated plain text block nor the encrypted block is already stored in the deduplication vault storage, the encrypted block may be stored in the deduplication vault storage.
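The decision flow described above can be sketched as follows. The vault is modeled as a hash-to-block dictionary, `toy_encrypt` is a deliberately trivial XOR placeholder rather than real encryption, and the names `backup_block` and `block_map` are hypothetical. The sketch assumes deterministic encryption, since otherwise identical plain text blocks would not produce identical encrypted blocks.

```python
import hashlib


def sha(block):
    return hashlib.sha256(block).hexdigest()


def toy_encrypt(block, key=0x5A):
    # Placeholder only: a single-byte XOR is NOT real encryption.
    return bytes(b ^ key for b in block)


def backup_block(location, block, vault, block_map):
    """Back up one allocated plain text block; return what happened."""
    plain_hash = sha(block)
    if plain_hash in vault:
        # Plain text block already stored: skip encryption entirely.
        block_map[location] = plain_hash
        return "plain-duplicate"
    encrypted = toy_encrypt(block)
    encrypted_hash = sha(encrypted)
    if encrypted_hash in vault:
        block_map[location] = encrypted_hash
        return "encrypted-duplicate"
    # Neither form is stored yet: store the encrypted block.
    vault[encrypted_hash] = encrypted
    block_map[location] = encrypted_hash
    return "stored"


# Hypothetical vault pre-seeded with one common plain text block.
vault = {sha(b"os-block"): b"os-block"}
block_map = {}
results = [
    backup_block(0, b"os-block", vault, block_map),
    backup_block(1, b"secret", vault, block_map),
    backup_block(2, b"secret", vault, block_map),
]
```

The first block is found in plain text form and never encrypted; the second is encrypted and stored; the third is encrypted, found to match the second, and only referenced.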
Abstract:
Hash collision recovery in a deduplication vault. In one example embodiment, a method for hash collision recovery in a deduplication vault includes creating first parity data for all unique blocks of a source storage at a point in time. The first parity data includes both the unique blocks and an order of block positions of the unique blocks as stored in the source storage. Next, a hash value is generated for each of the unique blocks. Then, a backup is stored in a deduplication vault including each of the unique blocks together with its corresponding hash value. Next, second parity data is created for all of the unique blocks of the backup. Then, the first parity data is compared to the second parity data to determine whether one or more hash collisions occurred resulting in one or more missing unique blocks. Next, responsive to the one or more hash collisions occurring, the first parity data is used to recover the one or more missing unique blocks. Then, the backup is restored.
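One way to picture the parity comparison above is with simple XOR parity, which can recover exactly one missing block. This is a simplification chosen for illustration: the abstract does not say which parity scheme is used, and the sketch omits the block-position ordering that the first parity data also carries. The function `weak_hash` is deliberately poor so that a collision is easy to trigger.

```python
def xor_blocks(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def xor_parity(blocks, size):
    parity = bytes(size)
    for block in blocks:
        parity = xor_blocks(parity, block)
    return parity


def weak_hash(block):
    # Deliberately weak stand-in hash (first byte only) to force a collision.
    return block[0]


BLOCK_SIZE = 4
source_unique = [b"AAAA", b"BBBB", b"ACCC"]  # 1st and 3rd collide on weak_hash

# First parity data, computed over the unique blocks of the source storage.
first_parity = xor_parity(source_unique, BLOCK_SIZE)

# Store the backup: a colliding block is mistaken for a duplicate and dropped.
vault = {}
for block in source_unique:
    vault.setdefault(weak_hash(block), block)

# Second parity data, computed over the blocks actually present in the backup.
second_parity = xor_parity(vault.values(), BLOCK_SIZE)

collision_detected = first_parity != second_parity
# With exactly one missing block, XOR of the two parities yields that block.
recovered = xor_blocks(first_parity, second_parity)
```

Recovering more than one missing block would need a stronger erasure code than single XOR parity.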