Invention Grant
- Patent Title: Extensible pipeline for data deduplication
- Patent Title (Chinese): 用于重复数据删除的可扩展管道
- Application No.: US12970839
- Application Date: 2010-12-16
- Publication No.: US08380681B2
- Publication Date: 2013-02-19
- Inventor: Paul Adrian Oltean, Ran Kalach, Ahmed M. El-Shimi, James Robert Benton
- Applicant: Paul Adrian Oltean, Ran Kalach, Ahmed M. El-Shimi, James Robert Benton
- Applicant Address: US WA Redmond
- Assignee: Microsoft Corporation
- Current Assignee: Microsoft Corporation
- Current Assignee Address: US WA Redmond
- Agency: Gonzalez Saggio & Harlan LLP
- Main IPC: G06F17/00
- IPC: G06F17/00

Abstract:
The subject disclosure is directed towards data deduplication (optimization) performed by phases/modules of a modular data deduplication pipeline. At each phase, the pipeline allows modules to be replaced, selected or extended, e.g., different algorithms can be used for chunking or compression based upon the type of data being processed. The pipeline facilitates secure data processing, batch processing, and parallel processing. The pipeline is tunable based upon feedback, e.g., by selecting modules to increase deduplication quality, performance and/or throughput. Also described is selecting, filtering, ranking, sorting and/or grouping the files to deduplicate, e.g., based upon properties and/or statistical properties of the files and/or a file dataset and/or internal or external feedback.
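
As a rough illustration of the modular idea in the abstract, the sketch below shows a pipeline in which the chunking and compression phases are looked up per data type, so a module can be swapped without touching the rest of the pipeline. It is not the patented implementation: every name here (`CHUNKERS`, `COMPRESSORS`, `ChunkStore`, `deduplicate`), the choice of SHA-256 as the chunk identifier, and the two example chunkers are assumptions made for illustration only.

```python
import hashlib
import zlib
from typing import Callable, Dict, Iterator, List

# Hypothetical module signatures; the abstract does not define an API.
Chunker = Callable[[bytes], Iterator[bytes]]
Compressor = Callable[[bytes], bytes]


def fixed_size_chunker(data: bytes, size: int = 4) -> Iterator[bytes]:
    """Split data into fixed-size chunks (the last chunk may be shorter)."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def line_chunker(data: bytes) -> Iterator[bytes]:
    """Split data at line boundaries, keeping each line ending with its chunk."""
    yield from data.splitlines(keepends=True)


def no_compression(chunk: bytes) -> bytes:
    """Pass-through module for data that is assumed not to compress well."""
    return chunk


# One registry per phase: each maps a data type to a replaceable module.
CHUNKERS: Dict[str, Chunker] = {"text": line_chunker, "binary": fixed_size_chunker}
COMPRESSORS: Dict[str, Compressor] = {"text": zlib.compress, "binary": no_compression}


class ChunkStore:
    """Keeps one stored copy per unique chunk, keyed by the chunk's SHA-256 digest."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def put(self, chunk: bytes, compress: Compressor) -> str:
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in self.chunks:  # only new chunks are compressed and stored
            self.chunks[digest] = compress(chunk)
        return digest


def deduplicate(data: bytes, data_type: str, store: ChunkStore) -> List[str]:
    """Run the chunk -> hash -> compress/store phases; return the chunk digest list."""
    chunker = CHUNKERS[data_type]
    compress = COMPRESSORS[data_type]
    return [store.put(chunk, compress) for chunk in chunker(data)]
```

Calling `deduplicate(b"a\nb\na\n", "text", ChunkStore())` would return three digests, the first and third identical, while the store holds only two chunks. A fuller sketch would also record which compressor was applied to each stored chunk so the data could be reconstructed; that bookkeeping is omitted here.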
Public/Granted literature
- US20120158672A1 Extensible Pipeline for Data Deduplication, Public/Granted day: 2012-06-21