Invention Grant
- Patent Title: Disk-image deduplication with hash subset in memory
- Application No.: US15877566
- Application Date: 2018-01-23
- Publication No.: US10552075B2
- Publication Date: 2020-02-04
- Inventor: Oleg Zaydman
- Applicant: VMWARE, INC.
- Applicant Address: US CA Palo Alto
- Assignee: VMware, Inc.
- Current Assignee: VMware, Inc.
- Current Assignee Address: US CA Palo Alto
- Agency: Fish & Richardson P.C.
- Main IPC: G06F3/06
- IPC: G06F3/06 ; G06F16/11 ; G06F9/455

Abstract:
Deduplication of virtual-machine disk images and other disk images can involve identifying the first clusters in a file. The clusters are hashed. The first-in-file hashes (generated from first-in-file clusters) are stored in an in-memory index, while the full set of hashes is streamed in order to find matches with the hashes stored in the in-memory index. First-in-file hashes in the stream are compared, while other hashes in the stream are compared only if the immediately preceding hash resulted in a match. Comparing non-first-in-file hashes requires disk accesses, but since such comparisons are conditioned on first-in-file matches, they are relatively likely to result in sequences of matches. The net effect is a relatively fast deduplication with compression approaching that resulting from a full comparison of all hashes.
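The matching rule described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The cluster size, the use of SHA-256, the function names, and the dictionary `stored_files` (which stands in for hash lists that would really live on disk) are all assumptions made for illustration.

```python
import hashlib

CLUSTER_SIZE = 4096  # assumed cluster size; the abstract does not specify one


def cluster_hashes(data, size=CLUSTER_SIZE):
    """Split a file's bytes into clusters and hash each one (SHA-256 is an assumption)."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]


def build_index(stored_files):
    """Build the in-memory index holding only first-in-file hashes.

    stored_files maps a file name to its ordered list of cluster hashes
    and models the full hash set that would be kept on disk.
    """
    index = {}
    for name, hashes in stored_files.items():
        if hashes:
            # Map the first cluster's hash to its location (file, cluster 0).
            index.setdefault(hashes[0], (name, 0))
    return index


def find_matches(stream, index, stored_files):
    """Scan a stream of (hash, is_first_in_file) pairs for duplicates.

    First-in-file hashes are looked up in the in-memory index. Any other
    hash is compared only if the immediately preceding hash matched, and
    then only against the stored hash that follows the previous match.
    """
    matches = []
    prev = None  # stored location of the previous match, if any
    for pos, (h, is_first) in enumerate(stream):
        loc = None
        if is_first:
            loc = index.get(h)  # memory-only lookup
        elif prev is not None:
            name, idx = prev
            stored = stored_files[name]  # models the disk access
            if idx + 1 < len(stored) and stored[idx + 1] == h:
                loc = (name, idx + 1)
        if loc is not None:
            matches.append((pos, loc[0], loc[1]))
        prev = loc
    return matches
```

In this sketch, a mismatch ends the current run of matches: later hashes from the same file are skipped until the next first-in-file hash arrives in the stream, which is what keeps disk accesses limited to positions where a match is likely.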
Public/Granted literature
- US20190227726A1 DISK-IMAGE DEDUPLICATION WITH HASH SUBSET IN MEMORY Public/Granted day: 2019-07-25