Speeding de-duplication using a temporal digest cache
Abstract:
Embodiments are directed to techniques for implementing a deduplication system that minimizes disk accesses to an on-disk digest log when deduplicating consecutively-stored data. These techniques use an in-memory temporal digest cache. When the on-disk digest log is accessed for a set of data and a match is found, the temporal digest cache is populated with digests not only for that set of data but also for other data stored in a temporal relationship with it. The temporal digest cache allows subsequent deduplication of temporally-related data to proceed faster, without repeated accesses to the digest log on disk.
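The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class names (`DigestLog`, `Deduplicator`), the use of SHA-256 as the digest, the LRU eviction policy, and the reading of "temporal relationship" as "the next `window` entries written to the log after the match" are all assumptions made for illustration. The on-disk log is simulated in memory, with a counter standing in for disk accesses.

```python
import hashlib
from collections import OrderedDict


class DigestLog:
    """Hypothetical stand-in for the on-disk digest log, kept in write order."""

    def __init__(self):
        self.entries = []    # (digest, address) tuples in the order they were stored
        self.position = {}   # digest -> index into entries (would live on disk in practice)
        self.disk_reads = 0  # counts simulated disk accesses

    def append(self, digest, address):
        self.position[digest] = len(self.entries)
        self.entries.append((digest, address))

    def lookup_with_neighbors(self, digest, window):
        """One simulated disk access: the match plus the entries stored right after it."""
        self.disk_reads += 1
        idx = self.position.get(digest)
        if idx is None:
            return []
        return self.entries[idx: idx + window]


class Deduplicator:
    """Sketch of deduplication backed by an in-memory temporal digest cache."""

    def __init__(self, window=8, cache_capacity=1024):
        self.log = DigestLog()
        self.cache = OrderedDict()  # temporal digest cache; LRU bound is an assumption
        self.window = window
        self.cache_capacity = cache_capacity
        self.next_address = 0

    def _cache_put(self, digest, address):
        self.cache[digest] = address
        self.cache.move_to_end(digest)
        while len(self.cache) > self.cache_capacity:
            self.cache.popitem(last=False)  # evict the least recently used digest

    def write(self, block: bytes):
        """Store a block; return (address, was_duplicate)."""
        digest = hashlib.sha256(block).digest()

        # Fast path: the digest is already in the temporal cache, no disk access.
        if digest in self.cache:
            self.cache.move_to_end(digest)
            return self.cache[digest], True

        # Slow path: consult the log; on a match, cache the temporal neighbors too.
        neighbors = self.log.lookup_with_neighbors(digest, self.window)
        if neighbors:
            for d, a in neighbors:
                self._cache_put(d, a)
            return neighbors[0][1], True

        # New data: assign an address and record its digest in the log.
        address = self.next_address
        self.next_address += 1
        self.log.append(digest, address)
        return address, False
```

In this sketch, writing a run of blocks a second time in the same order triggers a single log access for the first block; the remaining blocks in the window are resolved from the cache.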