Invention Grant
- Patent Title: System and method for data deduplication for disk storage subsystems
- Application No.: US13182669
- Application Date: 2011-07-14
- Publication No.: US09678688B2
- Publication Date: 2017-06-13
- Inventor: John W. Bates
- Applicant: John W. Bates
- Applicant Address: US MA Hopkinton
- Assignee: EMC IP Holding Company LLC
- Current Assignee: EMC IP Holding Company LLC
- Current Assignee Address: US MA Hopkinton
- Agency: Muirhead and Saturnelli, LLC
- Main IPC: G06F3/06
- IPC: G06F3/06 ; G06F17/30 ; H04N19/63 ; G06F11/14

Abstract:
A method for data deduplication includes the following steps. First, an original data set is segmented into a plurality of data segments. Next, the data in each data segment is transformed into a transformed data representation that has a band-type structure, which includes a plurality of bands. A first set of bands, which includes non-identical transformed data for each data segment, is then selected, grouped together and stored with the original data set. A second set of bands, which includes identical transformed data for each data segment, is also selected and grouped together. A hash function is then applied to the transformed data of the second set of bands, generating transformed data segments indexed by hash function indices. Finally, the hash function indices and the transformed data representation of one representative data segment are stored in a deduplication database.
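
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not name the transform, the segment size, the hash function, or which bands fall into which set, so everything below is an assumption made for illustration: fixed-size segments, a one-level integer Haar-style transform that yields a coarse band and a detail band, the coarse band treated as the "second set" that is hashed, the detail band treated as the "first set" kept per segment, and SHA-256 as the hash. The function names `segment`, `to_bands`, and `deduplicate` are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple

Band = Tuple[int, ...]


def segment(data: bytes, size: int) -> List[bytes]:
    """Split data into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def to_bands(seg: bytes) -> Tuple[Band, Band]:
    """Assumed transform: one-level integer Haar-style split into two bands.

    Returns (coarse, detail), where coarse holds floored pairwise averages
    and detail holds pairwise differences.
    """
    if len(seg) % 2:
        seg += seg[-1:]  # pad odd-length segments by repeating the last byte
    pairs = list(zip(seg[0::2], seg[1::2]))
    coarse = tuple((a + b) // 2 for a, b in pairs)
    detail = tuple(a - b for a, b in pairs)
    return coarse, detail


def deduplicate(
    data: bytes, size: int = 8
) -> Tuple[Dict[str, Band], List[Tuple[str, Band]]]:
    """Return (dedup_db, per_segment).

    dedup_db maps a hash index to one representative coarse band.
    per_segment keeps, for each segment, its hash index and its own
    detail band (the part assumed to differ between segments).
    """
    dedup_db: Dict[str, Band] = {}
    per_segment: List[Tuple[str, Band]] = []
    for seg in segment(data, size):
        coarse, detail = to_bands(seg)
        # Coarse values are floored averages of bytes, so they fit in 0..255.
        key = hashlib.sha256(bytes(coarse)).hexdigest()
        dedup_db.setdefault(key, coarse)  # store only the first representative
        per_segment.append((key, detail))
    return dedup_db, per_segment


# Two 4-byte segments that differ slightly but share the same coarse band.
db, segs = deduplicate(bytes([10, 10, 20, 20, 11, 10, 21, 20]), size=4)
print(len(db), len(segs))  # 1 2
```

In this example both segments hash to the same index, so the database holds a single representative coarse band, while each segment keeps its own detail band. A real system would also need an inverse transform and collision handling, neither of which is shown here.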
Public/Granted literature
- US20120016845A1 SYSTEM AND METHOD FOR DATA DEDUPLICATION FOR DISK STORAGE SUBSYSTEMS Public/Granted day: 2012-01-19