Invention Grant
- Patent Title: Digest based data matching in similarity based deduplication
- Application No.: US13941694
- Application Date: 2013-07-15
- Publication No.: US10296598B2
- Publication Date: 2019-05-21
- Inventor: Lior Aronovich
- Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Applicant Address: US NY Armonk
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee Address: US NY Armonk
- Agency: Griffiths & Seaton PLLC
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Data matches are calculated between input data and repository data via a digest-based matching algorithm, in which the reference digests corresponding to a repository interval of data identified as similar to an input interval of data are loaded into a sequential array and into a search structure. Each of the matching digests found using the search structure is extended using the sequential array of reference digests. Repository data intervals are determined as similar to an input data interval. Reference digests corresponding to the similar repository data interval are loaded into a sequential representation and into a search structure. Matches between the input digests and the reference digests are found using the search structure. Each of the found matches between the input digests and the repository digests is extended using the sequential representation. Data matches between the input data and the repository data are determined using the extended matches of digests.
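
The matching flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `find_digest_matches`, the use of a dictionary as the search structure, forward-only extension, and the choice of the longest candidate when a digest occurs more than once are all assumptions made for illustration. Digests are represented as plain hashable values, and the similarity search that selects the repository interval is assumed to have already run.

```python
from collections import defaultdict


def find_digest_matches(input_digests, reference_digests):
    """Return (input_start, reference_start, length) runs of matching digests.

    Hypothetical helper: `reference_digests` plays the role of the
    sequential array, and `index` plays the role of the search structure.
    """
    # Search structure: digest value -> positions in the sequential array.
    index = defaultdict(list)
    for pos, digest in enumerate(reference_digests):
        index[digest].append(pos)

    matches = []
    i = 0
    while i < len(input_digests):
        candidates = index.get(input_digests[i])
        if not candidates:
            i += 1  # no reference digest equals this input digest
            continue
        # Extend each candidate forward along the sequential array and
        # keep the longest run (an assumption, not stated in the abstract).
        best_len, best_ref = 0, -1
        for ref in candidates:
            n = 0
            while (i + n < len(input_digests)
                   and ref + n < len(reference_digests)
                   and input_digests[i + n] == reference_digests[ref + n]):
                n += 1
            if n > best_len:
                best_len, best_ref = n, ref
        matches.append((i, best_ref, best_len))
        i += best_len  # skip past the digests covered by this match
    return matches
```

For example, calling `find_digest_matches(["a", "b", "c", "x", "d", "e"], ["z", "a", "b", "c", "d", "e", "f"])` yields two extended matches, one of three digests and one of two. The dictionary lookup finds only the first digest of each run; the remaining digests are matched by walking the sequential array, which is the division of labor the abstract describes.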
Public/Granted literature
- US20150019499A1 DIGEST BASED DATA MATCHING IN SIMILARITY BASED DEDUPLICATION Public/Granted day: 2015-01-15