Block level deduplication with block similarity
Abstract:
Methods and apparatus are provided for block similarity based block level deduplication of data. An exemplary method comprises obtaining a deduplicated dataset comprising a plurality of unique data chunks; determining a number of differences between two of the unique data chunks; evaluating whether the number of differences satisfies a predefined similarity criteria (e.g., that the number of bit differences cannot exceed a specified limit); and storing metadata for a first one of the two unique data chunks if the predefined similarity criteria is satisfied for the two unique data chunks, wherein the metadata comprises a pointer to a second one of the two unique data chunks and bit differences between the two unique data chunks. The bit differences comprise an executable code and/or a bit mask. The predefined similarity threshold is optionally a tunable parameter. The first one of the two unique data chunks can be restored by processing the metadata.
Public/Granted literature
Information query
Patent Agency Ranking
0/0