HIERARCHICAL IDENTIFICATION AND MAPPING OF DUPLICATE DATA IN A STORAGE SYSTEM

    Publication number: WO2012173858A3

    Publication date: 2012-12-20

    Application number: PCT/US2012/041297

    Filing date: 2012-06-07

    Abstract: The technique introduced here includes a system and method for identifying and mapping duplicate data objects referenced by data objects. The technique illustratively utilizes a hierarchical tree of fingerprints for each data object to compare the data objects and identify duplicate data blocks referenced by the data objects. A progressive comparison of the hierarchical trees starts from a top layer of the hierarchical trees and proceeds toward a base layer. Between the compared data objects (i.e., the compared hierarchical trees), the technique maps matching fingerprints only at the top-most layer of the hierarchical trees at which the fingerprints match. Lower layer matching fingerprints are neither compared nor mapped. Data blocks corresponding to the matching fingerprints are then deleted. Such an identification and mapping technique substantially reduces the amount of mapping metadata stored in data objects that have been subject to deduplication.
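The top-down comparison described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `Node`, `build_tree`, and `map_duplicates`, the use of SHA-256, the fixed fanout, and the assumption that both trees have the same shape are all hypothetical choices made for illustration.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    fingerprint: str
    children: list = field(default_factory=list)  # empty at the base layer


def fp(data: bytes) -> str:
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(data).hexdigest()


def build_tree(blocks, fanout=2):
    """Build a fingerprint tree bottom-up from a non-empty list of data blocks."""
    level = [Node(fp(b)) for b in blocks]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), fanout):
            group = level[i:i + fanout]
            combined = "".join(n.fingerprint for n in group).encode()
            parents.append(Node(fp(combined), group))
        level = parents
    return level[0]


def map_duplicates(a, b, path=(), mappings=None):
    """Compare two trees from the top layer down.

    A mapping is recorded only at the top-most node where fingerprints
    match; the subtree below a match is neither compared nor mapped.
    Each mapping is the path of child indices from the root.
    """
    if mappings is None:
        mappings = []
    if a.fingerprint == b.fingerprint:
        mappings.append(path)
        return mappings
    for i, (child_a, child_b) in enumerate(zip(a.children, b.children)):
        map_duplicates(child_a, child_b, path + (i,), mappings)
    return mappings


tree_a = build_tree([b"A", b"B", b"C", b"D"])
tree_b = build_tree([b"A", b"B", b"C", b"X"])
mappings = map_duplicates(tree_a, tree_b)
```

In this example the first two blocks are covered by a single mapping at the layer above the base, so only two mapping entries are stored instead of three block-level ones, which is the metadata saving the abstract refers to.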

    2. MIGRATING DEDUPLICATED DATA
    Type: Invention application
    Status: Under examination (published)

    Publication number: WO2014063073A1

    Publication date: 2014-04-24

    Application number: PCT/US2013/065715

    Filing date: 2013-10-18

    Applicant: NETAPP, INC.

    Abstract: Methods and apparatuses for efficiently migrating deduplicated data are provided. In one example, a data management system includes a data storage volume, a memory including machine executable instructions, and a computer processor. The data storage volume includes data objects and free storage space. The computer processor executes the instructions to perform deduplication of the data objects and determine migration efficiency metrics for groups of the data objects. Determining the migration efficiency metrics includes determining, for each group, a relationship between the free storage space that will result if the group is migrated from the volume and the resources required to migrate the group from the volume.
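One way to picture a migration efficiency metric of this kind is sketched below. The abstract only says the metric relates the free space gained to the resources required; the specific ratio used here, the use of transferred bytes as the resource cost, the block size, and the function name `migration_efficiency` are all assumptions for illustration.

```python
def migration_efficiency(group, objects, block_size=4096):
    """Ratio of space freed on the volume to bytes that must be moved.

    objects: dict mapping object name -> set of block ids it references
             (blocks shared between objects model deduplication).
    group:   set of object names considered for migration together.
    """
    group_blocks = set().union(*(objects[name] for name in group))
    other_blocks = set().union(
        *(blocks for name, blocks in objects.items() if name not in group)
    )
    # Blocks still referenced by objects that stay behind cannot be freed.
    freed_bytes = len(group_blocks - other_blocks) * block_size
    # Assumed cost model: every block the group references must be copied.
    moved_bytes = len(group_blocks) * block_size
    return freed_bytes / moved_bytes if moved_bytes else 0.0


objects = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5}}
scores = {
    "a": migration_efficiency({"a"}, objects),
    "a+b": migration_efficiency({"a", "b"}, objects),
    "c": migration_efficiency({"c"}, objects),
}
```

Under these assumptions, migrating objects that share blocks together scores higher than migrating either one alone, because fewer of the copied blocks remain pinned on the source volume.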

    3. OBJECT-LEVEL IDENTIFICATION OF DUPLICATE DATA IN A STORAGE SYSTEM
    Type: Invention application
    Status: Under examination (published)

    Publication number: WO2012173859A2

    Publication date: 2012-12-20

    Application number: PCT/US2012/041301

    Filing date: 2012-06-07

    CPC classification number: G06F17/30156

    Abstract: The technique introduced here includes a system and method for identification of duplicate data directly at a data-object level. The technique illustratively utilizes a hierarchical tree of fingerprints for each data object to compare data objects and identify duplicate data blocks referenced by the data objects. The hierarchical fingerprint trees are constructed in such a manner that a top-level fingerprint (or object-level fingerprint) of the hierarchical tree is representative of all data blocks referenced by a storage system. In embodiments, inline techniques are utilized to generate hierarchical fingerprints for new data objects as they are created, and an object-level fingerprint of the new data object is compared against preexisting object-level fingerprints in the storage system to identify exact or close matches. While exact matches result in complete deduplication of data blocks referenced by the data object, hierarchical comparison methods are used for identifying and mapping duplicate data blocks referenced by closely related data objects.
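The exact-match path of the inline, object-level check described in this abstract might look roughly like the sketch below. The class `ObjectStore`, its attributes, the two-layer fingerprint construction, and SHA-256 are hypothetical. The abstract does not say how close matches are detected, so that path is left out here.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    # SHA-256 is an assumption; the abstract does not name a hash function.
    return hashlib.sha256(data).hexdigest()


def object_fingerprint(blocks):
    """Return (object-level fingerprint, per-block fingerprints).

    Assumed construction: a two-layer tree whose top fingerprint is the
    hash of the concatenated block fingerprints.
    """
    block_fps = [fingerprint(b) for b in blocks]
    return fingerprint("".join(block_fps).encode()), block_fps


class ObjectStore:
    def __init__(self):
        self.by_object_fp = {}  # object-level fingerprint -> object name
        self.blocks = {}        # object name -> list of block fingerprints
        self.aliases = {}       # deduplicated object name -> existing object

    def create(self, name, blocks):
        """Fingerprint a new object inline and check for an exact match."""
        top, block_fps = object_fingerprint(blocks)
        existing = self.by_object_fp.get(top)
        if existing is not None:
            # Exact match: the new object stores no blocks of its own.
            self.aliases[name] = existing
            return "exact-duplicate"
        self.by_object_fp[top] = name
        self.blocks[name] = block_fps
        return "stored"


store = ObjectStore()
first = store.create("a", [b"x", b"y"])
second = store.create("b", [b"x", b"y"])
third = store.create("c", [b"x", b"z"])
```

A single dictionary lookup on the object-level fingerprint decides the exact-match case without touching any per-block fingerprints; objects that miss would then go to a hierarchical comparison against closely related objects.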

    4. HIERARCHICAL IDENTIFICATION AND MAPPING OF DUPLICATE DATA IN A STORAGE SYSTEM
    Type: Invention publication
    Status: Under examination (published)

    Publication number: EP2721496A2

    Publication date: 2014-04-23

    Application number: EP12801019.6

    Filing date: 2012-06-07

    Applicant: NetApp, Inc.

    CPC classification number: G06F17/30156

    Abstract: The technique introduced here includes a system and method for identifying and mapping duplicate data objects referenced by data objects. The technique illustratively utilizes a hierarchical tree of fingerprints for each data object to compare the data objects and identify duplicate data blocks referenced by the data objects. A progressive comparison of the hierarchical trees starts from a top layer of the hierarchical trees and proceeds toward a base layer. Between the compared data objects (i.e., the compared hierarchical trees), the technique maps matching fingerprints only at the top-most layer of the hierarchical trees at which the fingerprints match. Lower layer matching fingerprints are neither compared nor mapped. Data blocks corresponding to the matching fingerprints are then deleted. Such an identification and mapping technique substantially reduces the amount of mapping metadata stored in data objects that have been subject to deduplication.

    5. OBJECT-LEVEL IDENTIFICATION OF DUPLICATE DATA IN A STORAGE SYSTEM
    Type: Invention publication
    Status: Under examination (published)

    Publication number: EP2721495A2

    Publication date: 2014-04-23

    Application number: EP12800807.5

    Filing date: 2012-06-07

    Applicant: NetApp, Inc.

    CPC classification number: G06F17/30156

    Abstract: The technique introduced here includes a system and method for identification of duplicate data directly at a data-object level. The technique illustratively utilizes a hierarchical tree of fingerprints for each data object to compare data objects and identify duplicate data blocks referenced by the data objects. The hierarchical fingerprint trees are constructed in such a manner that a top-level fingerprint (or object-level fingerprint) of the hierarchical tree is representative of all data blocks referenced by a storage system. In embodiments, inline techniques are utilized to generate hierarchical fingerprints for new data objects as they are created, and an object-level fingerprint of the new data object is compared against preexisting object-level fingerprints in the storage system to identify exact or close matches. While exact matches result in complete deduplication of data blocks referenced by the data object, hierarchical comparison methods are used for identifying and mapping duplicate data blocks referenced by closely related data objects.
