CACHING CONTENT ADDRESSABLE DATA CHUNKS FOR STORAGE VIRTUALIZATION

    Publication number: WO2014159781A3

    Publication date: 2014-10-02

    Application number: PCT/US2014/025111

    Filing date: 2014-03-12

    Abstract: The subject disclosure is directed towards using primary data deduplication concepts for more efficient access of data via content addressable caches. Chunks of data, such as deduplicated data chunks, are maintained in a fast access client-side cache, such as containing chunks based upon access patterns. The chunked content is content addressable via a hash or other unique identifier of that content in the system. When a chunk is needed, the client-side cache (or caches) is checked for the chunk before going to a file server for the chunk. The file server may likewise maintain content addressable (chunk) caches. Also described are cache maintenance, management and organization, including pre-populating caches with chunks, as well as using RAM and/or solid-state storage device caches.
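
The lookup order described in this abstract (client-side cache first, file server second) can be sketched as follows. This is a minimal illustration, not the method claimed in the application: the `ChunkCache` class, the `read_chunk` helper, the LRU eviction policy, and the use of SHA-256 as the content identifier are all assumptions made for the example.

```python
import hashlib
from collections import OrderedDict


class ChunkCache:
    """Hypothetical client-side cache keyed by the SHA-256 digest of each chunk."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._chunks = OrderedDict()  # digest -> chunk bytes, oldest first

    def get(self, digest):
        chunk = self._chunks.get(digest)
        if chunk is not None:
            self._chunks.move_to_end(digest)  # mark as recently used
        return chunk

    def put(self, chunk):
        digest = hashlib.sha256(chunk).hexdigest()
        self._chunks[digest] = chunk
        self._chunks.move_to_end(digest)
        while len(self._chunks) > self.capacity:
            self._chunks.popitem(last=False)  # evict the least recently used chunk
        return digest


def read_chunk(digest, client_cache, fetch_from_server):
    """Check the client-side cache before asking the file server for the chunk."""
    chunk = client_cache.get(digest)
    if chunk is None:
        chunk = fetch_from_server(digest)  # assumed callable: digest -> bytes
        client_cache.put(chunk)
    return chunk
```

Because the key is derived from the content itself, identical chunks referenced by different files occupy a single cache entry.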

    USING INDEX PARTITIONING AND RECONCILIATION FOR DATA DEDUPLICATION
    2.
    Invention application (status: under examination, published)

    Publication number: WO2012092212A2

    Publication date: 2012-07-05

    Application number: PCT/US2011/067292

    Filing date: 2011-12-23

    Abstract: The subject disclosure is directed towards a data deduplication technology in which a hash index service's index is partitioned into subspace indexes, with less than the entire hash index service's index cached to save memory. The subspace index is accessed to determine whether a data chunk already exists or needs to be indexed and stored. The index may be divided into subspaces based on criteria associated with the data to index, such as file type, data type, time of last usage, and so on. Also described is subspace reconciliation in which duplicate entries in subspaces are detected so as to remove entries and chunks from the deduplication system. Subspace reconciliation may be performed at off-peak time when more system resources are available, and may be interrupted if resources are needed. Subspaces to reconcile may be based on similarity, including via similarity of signatures that each compactly represents the subspace's hashes.
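
A rough sketch of the two ideas in this abstract, subspace-partitioned indexing and subspace reconciliation, might look like the following. The `SubspaceIndex` class, its method names, the choice of SHA-256, and the use of a plain string as the subspace key are assumptions for illustration; the application does not specify them.

```python
import hashlib


def chunk_hash(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


class SubspaceIndex:
    """Hypothetical hash index split into per-subspace dictionaries."""

    def __init__(self):
        self.subspaces = {}  # subspace key -> {chunk hash: storage location}

    def add_chunk(self, subspace_key, chunk, location):
        """Index a chunk in one subspace; return True if it was new there."""
        index = self.subspaces.setdefault(subspace_key, {})
        h = chunk_hash(chunk)
        if h in index:
            return False  # already stored, so deduplicate
        index[h] = location
        return True

    def reconcile(self, keep_key, other_key):
        """Drop entries from other_key that also exist in keep_key.

        Returns the removed hashes so the caller can free the duplicate chunks.
        """
        keep = self.subspaces.get(keep_key, {})
        other = self.subspaces.get(other_key, {})
        duplicates = sorted(h for h in other if h in keep)
        for h in duplicates:
            del other[h]
        return duplicates
```

Only the subspace being written to is consulted in `add_chunk`, which is why duplicates can accumulate across subspaces until `reconcile` runs.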

    CREDIT-BASED PEER-TO-PEER STORAGE
    3.
    Invention application (status: under examination, published)

    Publication number: WO2009002835A2

    Publication date: 2008-12-31

    Application number: PCT/US2008/067647

    Filing date: 2008-06-20

    Abstract: Distributed computing devices comprising a system for sharing computing resources can provide shared computing resources to users having sufficient resource credits. A user can earn resource credits by reliably offering a computing resource for sharing for a predetermined amount of time. The conversion rate between the amount of credits awarded, and the computing resources provided by a user can be varied to maintain balance within the system, and to foster beneficial user behavior. Once earned, the credits can be used to fund the user's account, joint accounts which include the user and others, or others' accounts that do not provide any access to the user. Computing resources can be exchanged on a peer-to-peer basis, though a centralized mechanism can link relevant peers together. To verify integrity, and protect against maliciousness, offered resources can be periodically tested.

    ERASURE CODING ACROSS MULTIPLE ZONES
    4.
    Invention application (status: under examination, published)

    Publication number: WO2014209993A1

    Publication date: 2014-12-31

    Application number: PCT/US2014/043857

    Filing date: 2014-06-24

    Abstract: In various embodiments, methods and systems for erasure coding data across multiple storage zones are provided. This may be accomplished by dividing a data chunk into a plurality of sub-fragments. Each of the plurality of sub-fragments is associated with a zone. Zones comprise buildings, data centers, and geographic regions providing a storage service. A plurality of reconstruction parities is computed. Each of the plurality of reconstruction parities is computed using at least one sub-fragment from the plurality of sub-fragments. The plurality of reconstruction parities comprises at least one cross-zone parity. The at least one cross-zone parity is assigned to a parity zone. The cross-zone parity provides cross-zone reconstruction of a portion of the data chunk.
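
The role of a cross-zone parity can be illustrated with the simplest possible code, a bytewise XOR over one sub-fragment per data zone. This is a stand-in chosen for brevity, not the coding scheme of the application; the function names are hypothetical and the sub-fragments are assumed to have equal length.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("sub-fragments must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def cross_zone_parity(sub_fragments):
    """XOR one sub-fragment from each data zone; the result goes to the parity zone."""
    return reduce(xor_bytes, sub_fragments)


def reconstruct_missing(surviving_sub_fragments, parity):
    """Rebuild the sub-fragment of a single failed zone from the survivors and the parity."""
    return reduce(xor_bytes, surviving_sub_fragments, parity)
```

With XOR, the parity zone can recover the loss of exactly one data zone; tolerating more simultaneous failures requires a stronger code.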

    PREDICTING DATA COMPRESSIBILITY USING DATA ENTROPY ESTIMATION
    5.
    Invention application (status: under examination, published)

    Publication number: WO2014133982A1

    Publication date: 2014-09-04

    Application number: PCT/US2014/018129

    Filing date: 2014-02-25

    CPC classification number: H03M7/30 H03M7/3091

    Abstract: The subject disclosure is directed towards predicting compressibility of a data block, and using the predicted compressibility in determining whether a data block if compressed will be sufficiently compressible to justify compression. In one aspect, data of the data block is processed to obtain an entropy estimate of the data block, e.g., based upon distinct value estimation. The compressibility prediction may be used in conjunction with a chunking mechanism of a data deduplication system.
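
As a rough illustration of entropy-based compressibility prediction, the sketch below computes the Shannon entropy of a block's byte histogram and compares it with a threshold. Note the substitution: the abstract mentions an estimate based upon distinct value estimation, whereas this example counts every byte exactly. The function names and the 6.0 bits-per-byte threshold are assumptions, not values from the application.

```python
import math
from collections import Counter


def entropy_bits_per_byte(block: bytes) -> float:
    """Shannon entropy of the byte distribution, between 0.0 and 8.0."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def should_compress(block: bytes, max_entropy: float = 6.0) -> bool:
    """Predict that compression is worthwhile only for low-entropy blocks.

    max_entropy is a hypothetical tuning threshold.
    """
    return entropy_bits_per_byte(block) <= max_entropy
```

A block of repeated bytes scores near 0 bits per byte and is predicted compressible, while already-compressed or encrypted data scores near 8 and is skipped.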

    GLOBAL TRAFFIC MANAGEMENT USING MODIFIED HOSTNAME
    6.
    Invention application (status: under examination, published)

    Publication number: WO2012145181A2

    Publication date: 2012-10-26

    Application number: PCT/US2012/032636

    Filing date: 2012-04-06

    Abstract: A particular method includes receiving a request from a client at a server and sending a global traffic management identifier in response to the request from the client. The global traffic management identifier is determined based on an attribute of the client. In response to the client requesting access to a service based on a modified hostname of the service, a data center associated with the service is identified based on the modified hostname of the service. The modified hostname identifies the global traffic management identifier, and the identified data center is useable by the client to access the service.
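
The flow in this abstract can be sketched in three steps: pick an identifier from a client attribute, embed it in the service hostname, and later map the modified hostname to a data center. Everything concrete below is invented for illustration: the lookup tables, the identifier values, the `example.net` names, and the choice to encode the identifier as the leftmost DNS label.

```python
# Hypothetical tables; a real deployment would derive these from configuration.
GTM_ID_BY_REGION = {"eu": "g1", "us": "g2"}
DATA_CENTER_BY_GTM_ID = {
    "g1": "dc-dublin.example.net",
    "g2": "dc-virginia.example.net",
}


def assign_gtm_id(client_region: str) -> str:
    """Choose a global traffic management identifier from a client attribute."""
    return GTM_ID_BY_REGION[client_region]


def modify_hostname(service_hostname: str, gtm_id: str) -> str:
    """Embed the identifier as the leftmost label of the service hostname."""
    return f"{gtm_id}.{service_hostname}"


def resolve_data_center(modified_hostname: str) -> str:
    """Read the identifier back out of the hostname and look up its data center."""
    gtm_id, _, _service = modified_hostname.partition(".")
    return DATA_CENTER_BY_GTM_ID[gtm_id]
```

For example, a client in the hypothetical `eu` region would request `g1.mail.example.com` and be directed to the data center registered for `g1`.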

    ADAPTIVE INDEX FOR DATA DEDUPLICATION
    7.
    Invention application

    Publication number: WO2012092348A3

    Publication date: 2012-07-05

    Application number: PCT/US2011/067544

    Filing date: 2011-12-28

    Abstract: The subject disclosure is directed towards a data deduplication technology in which a hash index service's index and/or indexing operations are adaptable to balance deduplication performance savings, throughput and resource consumption. The indexing service may employ hierarchical chunking using different levels of granularity corresponding to chunk size, a sampled compact index table that contains compact signatures for less than all of the hash index's (or subspace's) hash values, and/or selective subspace indexing based on similarity of a subspace's data to another subspace's data and/or to incoming data chunks.

    QUALITY OF SERVICE (QOS) BASED SYSTEMS, NETWORKS, AND ADVISORS

    Publication number: WO2011056364A3

    Publication date: 2011-05-12

    Application number: PCT/US2010/052313

    Filing date: 2010-10-12

    Abstract: Techniques and technologies for routing communications based on Quality of Service (QOS) related information. More particularly, this document discloses techniques and technologies for selecting communications paths which partially overlap other communication paths for which QOS related information has been measured. The techniques and technologies include determining performance levels for path segments within the communication paths from the measured QOS information.

    DISTRIBUTED DATA STORAGE USING ERASURE RESILIENT CODING
    10.
    Invention application (status: under examination, published)

    Publication number: WO2008157081A2

    Publication date: 2008-12-24

    Application number: PCT/US2008/066084

    Filing date: 2008-06-06

    CPC classification number: G06F11/1076 G06F2211/1028

    Abstract: An erasure resilient coding (ERC) distributed data storage system and method for storing data in a reliable and survivable fashion while minimizing hardware and associated costs. The system and method include forming multiple protection groups both within and across storage nodes of the storage system. Data is segmented into original data blocks and ERC data blocks. Load balancing occurs by interleaving storage nodes with equal numbers of original data blocks and ERC data blocks while ensuring each node has an equal number of combined read and write operations. Unique read and write operations on a data block can be performed independently of other data blocks in a protection group. The write operation uses Galois field arithmetic and the ERC transform to either write or append a new data block to a storage node. The read operation recovers data in a variety of ways using ERC decoding.
