EFFICIENT RECOVERY OF DEDUPLICATION DATA FOR HIGH CAPACITY SYSTEMS

    Publication No.: US20180253254A1

    Publication Date: 2018-09-06

    Application No.: US15830345

    Filing Date: 2017-12-04

    Applicant: Tintri Inc.

    Abstract: Efficient recovery of deduplication data for high capacity systems is disclosed, including: reading from a data storage device a data structure that tracks a plurality of segments to which a plurality of persistent objects have been recently written, wherein segments are written to in a monotonically increasing numerical order; selecting a checkpoint segment from among the plurality of segments based at least in part on a plurality of segment numbers corresponding to respective ones of the plurality of segments; using the checkpoint segment and a segment associated with a latest available segment number to determine a set of candidate segments; reading at least a portion of the set of candidate segments to identify a data storage block for which a corresponding deduplication data entry is not already stored in persistently stored deduplication data entries; and storing a new deduplication data entry to insert a fingerprint associated with the data storage block in a current data structure stored in a memory.
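
    The recovery flow in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the in-memory segment model, the SHA-256 fingerprint, the function names, and the "lowest tracked segment number is the checkpoint" policy are all assumptions made for illustration. The abstract states only that the checkpoint is selected based at least in part on segment numbers.

```python
import hashlib
from typing import Dict, Iterable, List, Set

# Hypothetical model: segment number -> data blocks stored in that segment.
Segments = Dict[int, List[bytes]]


def fingerprint(block: bytes) -> str:
    """Illustrative fingerprint; the source does not name a hash function."""
    return hashlib.sha256(block).hexdigest()


def select_checkpoint(recent_segments: Iterable[int]) -> int:
    """Assumed policy: the lowest tracked segment number is the checkpoint."""
    return min(recent_segments)


def candidate_segments(checkpoint: int, latest: int) -> range:
    """Segments are written in increasing numerical order, so every segment
    from the checkpoint through the latest one is a candidate."""
    return range(checkpoint, latest + 1)


def recover(
    segments: Segments,
    recent_segments: Iterable[int],
    persisted: Set[str],
) -> Dict[str, int]:
    """Rebuild the in-memory (current) deduplication entries.

    Returns a mapping of fingerprint -> segment number for every block in a
    candidate segment that has no persistently stored deduplication entry.
    """
    checkpoint = select_checkpoint(recent_segments)
    latest = max(segments)  # latest available segment number
    current: Dict[str, int] = {}
    for seg_no in candidate_segments(checkpoint, latest):
        for block in segments.get(seg_no, []):
            fp = fingerprint(block)
            # Insert only entries missing from the persisted data; keep the
            # first segment seen for a fingerprint already in `current`.
            if fp not in persisted and fp not in current:
                current[fp] = seg_no
    return current


# Usage with made-up data: segment 1 precedes the checkpoint and is skipped.
segments = {1: [b"a"], 2: [b"b"], 3: [b"c", b"a"], 4: [b"d"]}
recovered = recover(
    segments,
    recent_segments=[2, 3],
    persisted={fingerprint(b"b")},
)
```

    In this sketch only segments 2 through 4 are read, which is the point of the checkpoint: the scan cost is bounded by the number of segments written since the tracked persistent objects, not by the total capacity of the system.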
