Distributed data storage system using erasure coding on storage nodes fewer than data plus parity fragments

    Publication No.: US11614883B2

    Publication date: 2023-03-28

    Application No.: US17336103

    Filing date: 2021-06-01

    Abstract: A distributed data storage system using erasure coding (EC) provides advantages of EC data storage while retaining high resiliency for EC data storage architectures having fewer data storage nodes than the number of EC data-plus-parity fragments. An illustrative embodiment is a three-node data storage system with EC 4+2. Incoming data is temporarily replicated to ameliorate the effects of certain storage node outages or fatal disk failures, so that read and write operations can continue from/to the storage system. The system is equipped to automatically heal failed EC write attempts in a manner transparent to users and/or applications: when all storage nodes are operational, the distributed data storage system automatically converts the temporarily replicated data to EC storage and reclaims storage space previously used by the temporarily replicated data. Individual hardware failures are healed through migration techniques that reconstruct and re-fragment data blocks according to the governing EC scheme.
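
The three-node EC 4+2 arrangement in the abstract can be illustrated with a little arithmetic. The sketch below is not the patented method: it assumes a simple round-robin placement of the six fragments, and the function names (`place_fragments`, `readable_after_failure`, `choose_write_mode`) and the rule "replicate temporarily whenever any node is down" are hypothetical simplifications introduced here.

```python
from collections import defaultdict


def place_fragments(nodes, data_fragments=4, parity_fragments=2):
    """Assign the k+m fragment indices to nodes round-robin (assumed policy)."""
    placement = defaultdict(list)
    for fragment in range(data_fragments + parity_fragments):
        placement[nodes[fragment % len(nodes)]].append(fragment)
    return dict(placement)


def readable_after_failure(placement, failed_nodes, data_fragments=4):
    """An EC block can be reconstructed if at least k fragments survive."""
    surviving = sum(
        len(fragments)
        for node, fragments in placement.items()
        if node not in failed_nodes
    )
    return surviving >= data_fragments


def choose_write_mode(nodes_up, nodes_total):
    """Hypothetical rule: write EC only when every node is operational."""
    if nodes_up == nodes_total:
        return "erasure_coded"
    return "temporary_replication"


placement = place_fragments(["node1", "node2", "node3"])
print(placement)
print(readable_after_failure(placement, {"node1"}))
print(choose_write_mode(nodes_up=2, nodes_total=3))
```

With two fragments per node, losing one node leaves four fragments, which is exactly the minimum needed to reconstruct a 4+2 block; losing two nodes leaves only two, so the block is unreadable from EC fragments alone.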

    Decommissioning, re-commissioning, and commissioning new metadata nodes in a working distributed data storage system

    Publication No.: US11570243B2

    Publication date: 2023-01-31

    Application No.: US17465691

    Filing date: 2021-09-02

    Abstract: In a running distributed data storage system that actively processes I/Os, metadata nodes are commissioned and decommissioned without taking down the storage system and without introducing interruptions to metadata or payload data I/O. The inflow of reads and writes continues without interruption even while new metadata nodes are in the process of being added and/or removed and the strong consistency of the system is guaranteed. Commissioning and decommissioning nodes within the running system enables streamlined replacement of permanently failed nodes and advantageously enables the system to adapt elastically to workload changes. An illustrative distributed barrier logic (the “view change barrier”) controls a multi-state process that controls a coordinated step-wise progression of the metadata nodes from an old view to a new normal. Rules for I/O handling govern each state until the state machine loop has been traversed and the system reaches its new normal.
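
The step-wise progression controlled by a distributed barrier can be sketched as a small state machine. This is a minimal single-process illustration under stated assumptions: the abstract does not enumerate the states, so the names in `STATES` are hypothetical, as are the `ViewChangeBarrier` class and its `ack` method. The only property it demonstrates is that no node advances to the next state until every node has acknowledged the current one.

```python
class ViewChangeBarrier:
    """Advance all metadata nodes through view-change states in lockstep."""

    # Hypothetical state names; the abstract does not list the actual states.
    STATES = ("NORMAL", "PREPARE", "SWITCH", "FINALIZE")

    def __init__(self, nodes):
        self.nodes = frozenset(nodes)
        self._index = 0
        self._acks = set()

    @property
    def state(self):
        return self.STATES[self._index]

    def ack(self, node):
        """Record that `node` is ready to leave the current state.

        Returns True when this acknowledgement completes the barrier and
        the whole group moves to the next state, otherwise False.
        """
        if node not in self.nodes:
            raise ValueError(f"unknown node: {node}")
        self._acks.add(node)  # repeated acks from one node are harmless
        if self._acks != self.nodes:
            return False
        # Wrap around so the last state leads back to the first one.
        self._index = (self._index + 1) % len(self.STATES)
        self._acks.clear()
        return True


barrier = ViewChangeBarrier(["meta1", "meta2", "meta3"])
barrier.ack("meta1")
barrier.ack("meta2")
print(barrier.state)   # still in the first state: meta3 has not acked
barrier.ack("meta3")
print(barrier.state)   # all nodes acked, so the group advanced together
```

In a real deployment the acknowledgements would travel over the network and the per-state I/O handling rules would be applied while each state is active; both are omitted here.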

    In-flight data encryption/decryption for a distributed storage platform

    Publication No.: US11470056B2

    Publication date: 2022-10-11

    Application No.: US17066316

    Filing date: 2020-10-08

    Abstract: Encryption of data occurs before it is written to the storage platform; decryption occurs after it is read from the storage platform on a computer separate from the storage platform. By encrypting data before it travels over a wide-area network to a storage platform (and by only decrypting that data once it has arrived at an enterprise from the storage platform), we address data security over the network. Application data is encrypted at the virtual disk level before it leaves a controller virtual machine, and is only decrypted at that controller virtual machine after being received from the storage platform. Encryption and decryption of data is compatible with other services of the storage system such as de-duplication. Any number of key management services can be used in a transparent manner.

    In-flight data encryption/decryption for a distributed storage platform

    Publication No.: US20210029095A1

    Publication date: 2021-01-28

    Application No.: US17066316

    Filing date: 2020-10-08

    Abstract: Encryption of data occurs before it is written to the storage platform; decryption occurs after it is read from the storage platform on a computer separate from the storage platform. By encrypting data before it travels over a wide-area network to a storage platform (and by only decrypting that data once it has arrived at an enterprise from the storage platform), we address data security over the network. Application data is encrypted at the virtual disk level before it leaves a controller virtual machine, and is only decrypted at that controller virtual machine after being received from the storage platform. Encryption and decryption of data is compatible with other services of the storage system such as de-duplication. Any number of key management services can be used in a transparent manner.

    In-flight data encryption/decryption for a distributed storage platform

    Publication No.: US11916886B2

    Publication date: 2024-02-27

    Application No.: US17941929

    Filing date: 2022-09-09

    CPC classification number: H04L63/0428 G06F9/45558 H04L63/062 H04L67/1097

    Abstract: Encryption of data occurs before it is written to the storage platform; decryption occurs after it is read from the storage platform on a computer separate from the storage platform. By encrypting data before it travels over a wide-area network to a storage platform (and by only decrypting that data once it has arrived at an enterprise from the storage platform), we address data security over the network. Application data is encrypted at the virtual disk level before it leaves a controller virtual machine, and is only decrypted at that controller virtual machine after being received from the storage platform. Encryption and decryption of data is compatible with other services of the storage system such as de-duplication. Any number of key management services can be used in a transparent manner.

    Anti-entropy-based metadata recovery in a strongly consistent distributed data storage system

    Publication No.: US11789830B2

    Publication date: 2023-10-17

    Application No.: US17465722

    Filing date: 2021-09-02

    Abstract: A strongly consistent distributed data storage system comprises an enhanced metadata service that is capable of fully recovering all metadata that goes missing when a metadata-carrying disk, disks, and/or partition fail. An illustrative recovery service runs automatically or on demand to bring the metadata node back into full service. Advantages of the recovery service include guaranteed full recovery of all missing metadata, including metadata still residing in commit logs, without impacting strong consistency guarantees of the metadata. The recovery service is network-traffic efficient. In preferred embodiments, the recovery service avoids metadata service downtime at the metadata node, thereby reducing the impact of metadata disk failure on the availability of the system. The disclosed metadata recovery techniques are said to be “self-healing” as they do not need manual intervention and instead automatically detect failures and automatically recover from the failures in a non-disruptive manner.
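
Anti-entropy repair in general works by comparing compact digests of data ranges between replicas and transferring only the ranges that differ, which is one way a recovery service can stay network-traffic efficient. The sketch below illustrates that general idea only; it is not taken from the patent. The bucket scheme, the in-memory dictionaries standing in for metadata stores, and all function names are assumptions made for illustration.

```python
import hashlib


def bucket_of(key, num_buckets):
    """Map a metadata key to a stable bucket number."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % num_buckets


def partition(store, num_buckets):
    """Split a key/value store into buckets by key hash."""
    buckets = [dict() for _ in range(num_buckets)]
    for key, value in store.items():
        buckets[bucket_of(key, num_buckets)][key] = value
    return buckets


def bucket_digest(entries):
    """Digest of one bucket, independent of insertion order."""
    digest = hashlib.sha256()
    for key in sorted(entries):
        digest.update(key.encode() + b"\0" + entries[key].encode() + b"\0")
    return digest.hexdigest()


def anti_entropy_repair(healthy, recovering, num_buckets=8):
    """Copy missing or differing entries from `healthy` into `recovering`.

    Buckets whose digests already match are skipped without examining
    individual entries. Returns the number of entries transferred.
    """
    transferred = 0
    healthy_buckets = partition(healthy, num_buckets)
    recovering_buckets = partition(recovering, num_buckets)
    for good, stale in zip(healthy_buckets, recovering_buckets):
        if bucket_digest(good) == bucket_digest(stale):
            continue
        for key, value in good.items():
            if recovering.get(key) != value:
                recovering[key] = value
                transferred += 1
    return transferred


healthy = {"vdisk1": "v7", "vdisk2": "v3", "vdisk3": "v9"}
recovering = {"vdisk1": "v7"}          # lost two entries with a failed disk
print(anti_entropy_repair(healthy, recovering))
print(recovering == healthy)
```

This simplified version only copies entries toward the recovering replica; it does not remove stale keys or replay commit logs, both of which a full recovery service would need to handle.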

    Optimized deduplication based on backup frequency in a distributed data storage system

    Publication No.: US11693572B2

    Publication date: 2023-07-04

    Application No.: US17710600

    Filing date: 2022-03-31

    Abstract: Disclosed deduplication techniques at a distributed data storage system guarantee that space reclamation will not affect deduplicated data integrity even without perfect synchronization between components. By understanding certain “behavioral” characteristics and schedule cadences of backup operations that generate backup copies received at the distributed data storage system, data blocks that are not re-written by subsequent backup copies are pro-actively aged, while promoting continued retention of data blocks that are re-written. An expiry scheme operates with block-level granularity. Each unique deduplicated data block is given an expiry timeframe based on the block's arrival time at the distributed data storage system (i.e., when a backup copy supplies the block) and further based on backup frequencies of the various virtual disks referencing a unique system-wide identifier of the block, which is based on the block's hash value. Communications between components are kept to an as-needed basis. Cloud-based and multi-cloud configurations are disclosed.
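
The block-level expiry idea can be sketched as follows. The abstract says only that a block's expiry depends on its arrival time and on the backup frequencies of the virtual disks that reference it; the specific formula used here (arrival time plus a grace factor times the longest backup interval among referencing virtual disks) is an assumption, as are the `ExpiryIndex` class, its methods, and the `grace_factor` parameter.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class BlockRecord:
    arrival_time: float
    referencing_vdisks: set = field(default_factory=set)


class ExpiryIndex:
    """Track deduplicated blocks and expire those no backup rewrites."""

    def __init__(self, backup_interval_by_vdisk, grace_factor=2.0):
        self.intervals = dict(backup_interval_by_vdisk)  # seconds per vdisk
        self.grace_factor = grace_factor                 # assumed safety margin
        self.blocks = {}

    def write(self, vdisk, data, now):
        """Record a block supplied by a backup copy; rewrites refresh it."""
        block_id = hashlib.sha256(data).hexdigest()  # hash-based identifier
        record = self.blocks.get(block_id)
        if record is None:
            record = BlockRecord(arrival_time=now)
            self.blocks[block_id] = record
        else:
            record.arrival_time = now
        record.referencing_vdisks.add(vdisk)
        return block_id

    def expiry(self, block_id):
        """Assumed formula: arrival + grace_factor * longest interval."""
        record = self.blocks[block_id]
        longest = max(self.intervals[v] for v in record.referencing_vdisks)
        return record.arrival_time + self.grace_factor * longest

    def reclaim(self, now):
        """Drop every block whose expiry has passed; return their ids."""
        expired = [b for b in self.blocks if self.expiry(b) <= now]
        for block_id in expired:
            del self.blocks[block_id]
        return expired


DAY = 86400.0
index = ExpiryIndex({"daily_vdisk": DAY})
kept = index.write("daily_vdisk", b"still in use", now=0.0)
dropped = index.write("daily_vdisk", b"no longer written", now=0.0)
index.write("daily_vdisk", b"still in use", now=DAY)  # next backup rewrites it
print(index.reclaim(now=2.5 * DAY) == [dropped])
```

Because expiry is computed locally from arrival time and known backup cadence, reclamation in this sketch needs no per-block coordination with the component that produced the backup, which mirrors the abstract's point about keeping communication to an as-needed basis.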
