Methods and systems for in-line deduplication in a distributed storage system
Abstract:
A method for deduplicating data, comprising: obtaining, from a metadata node and by a file system client executing on a client application node, a data layout; generating, by the client application node, a fingerprint for the data stored on the client application node; generating, by a memory hypervisor module executing on the client application node, at least one input/output (I/O) request specifying a location in a storage pool, wherein the location is determined using the data layout; issuing, by the memory hypervisor module, the at least one I/O request to the storage pool, wherein processing the at least one I/O request results in at least a portion of the data being stored at the location; and after issuing the at least one I/O request to the storage pool, transmitting the fingerprint to the metadata node, wherein the metadata node attempts to deduplicate the data using the fingerprint.
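The ordering of the steps in the abstract (obtain the layout, fingerprint the data, write it to the storage pool, and only then send the fingerprint to the metadata node) can be illustrated with a minimal in-memory sketch. Everything named here is hypothetical and not taken from the patent: the `StoragePool` and `MetadataNode` classes, the `client_write` function, the use of SHA-256 as the fingerprint, and the choice to free the newly written location when a duplicate is found. The abstract states only that the metadata node "attempts to deduplicate the data using the fingerprint".

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoragePool:
    """Hypothetical in-memory stand-in for the storage pool."""
    blocks: dict = field(default_factory=dict)  # location -> bytes

    def write(self, location: int, data: bytes) -> None:
        # Processing an I/O request stores the data at the given location.
        self.blocks[location] = data

    def free(self, location: int) -> None:
        self.blocks.pop(location, None)


@dataclass
class MetadataNode:
    """Hypothetical metadata node: hands out layouts and tracks fingerprints."""
    pool: StoragePool
    next_free: int = 0
    by_fingerprint: dict = field(default_factory=dict)  # fingerprint -> location

    def get_data_layout(self) -> int:
        # Assumption: a data layout is reduced to a single free location.
        location = self.next_free
        self.next_free += 1
        return location

    def commit(self, fingerprint: str, location: int) -> int:
        # Deduplication attempt (assumed policy): if the fingerprint is
        # already known, release the new copy and point at the existing one.
        existing = self.by_fingerprint.get(fingerprint)
        if existing is not None and existing != location:
            self.pool.free(location)
            return existing
        self.by_fingerprint[fingerprint] = location
        return location


def client_write(data: bytes, metadata_node: MetadataNode, pool: StoragePool) -> int:
    """Client-side write path, in the order given in the abstract."""
    location = metadata_node.get_data_layout()       # 1. obtain data layout
    fingerprint = hashlib.sha256(data).hexdigest()   # 2. fingerprint the data
    pool.write(location, data)                       # 3. issue the I/O request
    return metadata_node.commit(fingerprint, location)  # 4. send fingerprint


if __name__ == "__main__":
    pool = StoragePool()
    mdn = MetadataNode(pool)
    first = client_write(b"hello", mdn, pool)
    second = client_write(b"hello", mdn, pool)  # duplicate content
    print(first == second, sorted(pool.blocks))
```

In this sketch the data is written before the metadata node sees the fingerprint, so the write does not wait on a fingerprint lookup; the cost is that a duplicate briefly occupies a second location until the metadata node releases it.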