Publication No.: GB2518158A
Publication date: 2015-03-18
Application No.: GB201316177
Filing date: 2013-09-11
Applicant: IBM
Inventor: HAUSTEIN NILS, CHRIST ACHIM, WINARSKI DANIEL J, MUELLER-WICKE DOMINIC
Abstract: A storage infrastructure 100 comprises a host system 102 connected to at least two storage systems 115A, 115B, and a de-duplication module 101. The de-duplication module maintains a table 127 of entries, each holding, for an individual data chunk, a hash value, a data location, an identifier, and a usage count for each storage system. A write request to store a data chunk in one of the storage systems is funnelled through the module using a hash value of the chunk. If an entry for the hash value is present in the table, an entry for the location of the chunk is written to a reference table in the storage system; if no entry is present, one is created and the chunk is stored in the storage system. Where the usage count for the chunk in that storage system exceeds a specified value, the chunk is stored again. This reduces the risk of a bottleneck in read requests for a particular data chunk.
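
The write path described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the class and method names (Entry, StorageSystem, DedupModule, write), the use of SHA-256 as the hash, the location format, and the choice to reset the usage count and point the table entry at the new copy once a chunk has been stored again are all assumptions made for illustration, since the abstract does not specify them.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One row of the de-duplication table (hypothetical layout)."""
    hash_value: str
    location: str                 # where the most recent physical copy lives
    identifier: int
    usage: dict = field(default_factory=dict)  # storage system name -> usage count


class StorageSystem:
    """Toy storage system with a chunk store and a reference table."""

    def __init__(self, name):
        self.name = name
        self.chunks = {}       # location -> chunk bytes
        self.references = []   # reference table: locations of de-duplicated chunks

    def store(self, data):
        location = f"{self.name}/{len(self.chunks)}"
        self.chunks[location] = data
        return location


class DedupModule:
    """All writes pass through here, keyed by the chunk's hash value."""

    def __init__(self, max_usage):
        self.max_usage = max_usage   # the "specified value" from the abstract
        self.table = {}              # hash value -> Entry

    def write(self, system, data):
        h = hashlib.sha256(data).hexdigest()   # assumed hash function
        entry = self.table.get(h)
        if entry is None:
            # No entry yet: store the chunk and create the table entry.
            location = system.store(data)
            self.table[h] = Entry(h, location, len(self.table), {system.name: 1})
            return location
        entry.usage[system.name] = entry.usage.get(system.name, 0) + 1
        if entry.usage[system.name] > self.max_usage:
            # Usage count exceeded: store the chunk again.
            # Assumption: the entry then points at the new copy and the
            # count for this storage system restarts at 1.
            entry.location = system.store(data)
            entry.usage[system.name] = 1
            return entry.location
        # Entry present and below the limit: record only a reference.
        system.references.append(entry.location)
        return entry.location
```

With max_usage set to 2, the first write of a chunk to a storage system stores it, the second write adds only a reference, and the third write stores a second physical copy, so that later reads need not all hit the same copy.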