Scale out chunk store to multiple nodes to allow concurrent deduplication
Abstract:
The present disclosure provides techniques for scaling out deduplication of files among a plurality of nodes. The techniques include designating a master component for the coordination of deduplication. The master component divides files to be deduplicated among several slave nodes, and provides to each slave node a set of unique identifiers that are to be assigned to chunks during the deduplication process. The techniques herein preserve integrity of the deduplication process that has been scaled out among several nodes. The scaled out deduplication process deduplicates files faster by allowing several deduplication modules to work in parallel to deduplicate files.
Information query
Patent Agency Ranking
0/0