Global de-duplication of virtual disks in a storage platform

    Publication No.: US11733930B2

    Publication date: 2023-08-22

    Application No.: US17707077

    Filing date: 2022-03-29

    CPC classification number: G06F3/0664 G06F3/0608 G06F3/0641 G06F3/0683

    Abstract: In order to avoid writing duplicates of blocks of data into a storage platform, any virtual disk within the storage platform may have a de-duplication feature enabled; alternatively, all virtual disks may have this feature enabled. For virtual disks with de-duplication enabled, a unique message digest is calculated for every block of data written to that virtual disk. Upon a write, these message digests are consulted to determine whether a particular block of data has already been written; if so, it is not written again, and if not, it is written. All de-duplication virtual disks are written to a single system virtual disk within the storage platform. De-duplication occurs over the entire storage platform and over all its virtual disks because all message digests are consulted before a write is performed for any virtual disk. A read for a de-duplication virtual disk reads from the system virtual disk.
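The write and read paths described in this abstract can be illustrated with a minimal in-memory sketch. It is not the patented implementation: the class name `DedupPlatform`, the dictionary standing in for the system virtual disk, the choice of SHA-256 as the message digest, and the 4096-byte block size are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


class DedupPlatform:
    """Toy in-memory model of platform-wide de-duplication (hypothetical)."""

    def __init__(self):
        # Stands in for the single system virtual disk: digest -> block bytes.
        self.system_vdisk = {}
        # Per-virtual-disk block map: (vdisk name, offset) -> digest.
        self.block_map = {}

    def write(self, vdisk, offset, block):
        """Store the block only if its digest is new; return True if stored."""
        digest = hashlib.sha256(block).hexdigest()  # SHA-256 is an assumption
        is_new = digest not in self.system_vdisk
        if is_new:
            self.system_vdisk[digest] = block
        # Every write records which digest the virtual disk now references.
        self.block_map[(vdisk, offset)] = digest
        return is_new

    def read(self, vdisk, offset):
        """Resolve the digest, then read the block from the system disk."""
        return self.system_vdisk[self.block_map[(vdisk, offset)]]
```

Because the digest index is shared by all virtual disks in this sketch, a block written to one virtual disk is not stored again when a second virtual disk writes identical data, which mirrors the platform-wide scope described in the abstract.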

    Optimized deduplication based on backup frequency in a distributed data storage system

    Publication No.: US11513708B2

    Publication date: 2022-11-29

    Application No.: US17153667

    Filing date: 2021-01-20

    Abstract: Disclosed deduplication techniques at a distributed data storage system guarantee that space reclamation will not affect deduplicated data integrity even without perfect synchronization between components. By understanding certain “behavioral” characteristics and schedule cadences of backup operations that generate backup copies received at the distributed data storage system, data blocks that are not re-written by subsequent backup copies are pro-actively aged, while promoting continued retention of data blocks that are re-written. An expiry scheme operates with block-level granularity. Each unique deduplicated data block is given an expiry timeframe based on the block's arrival time at the distributed data storage system (i.e., when a backup copy supplies the block) and further based on backup frequencies of the various virtual disks referencing a unique system-wide identifier of the block, which is based on the block's hash value. Communications between components are kept to an as-needed basis. Cloud-based and multi-cloud configurations are disclosed.
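A block-level expiry scheme of this kind might be sketched as follows. The abstract does not give the expiry formula, so the rule used here (expiry equals the latest arrival time plus a safety margin times the longest backup interval among referencing virtual disks) is purely an assumption, as are the names `BlockRecord`, `record_write`, and `reclaimable`.

```python
from dataclasses import dataclass, field

DAY = 86400  # seconds


@dataclass
class BlockRecord:
    arrival: float                                # last time a backup supplied the block
    intervals: set = field(default_factory=set)   # backup intervals of referencing vdisks

    def expiry(self, margin=2.0):
        # Assumed rule: wait `margin` cycles of the slowest referencing backup.
        return self.arrival + margin * max(self.intervals)


def record_write(index, block_id, now, backup_interval):
    """Register that a backup copy supplied `block_id` at time `now`."""
    rec = index.get(block_id)
    if rec is None:
        rec = index[block_id] = BlockRecord(arrival=now)
    rec.arrival = now                  # a re-written block gets a fresh arrival time
    rec.intervals.add(backup_interval)


def reclaimable(index, now):
    """Return IDs of blocks whose expiry has passed, sorted for determinism."""
    return sorted(b for b, r in index.items() if r.expiry() <= now)
```

In this sketch a block that the next daily backup re-writes has its arrival time refreshed and is retained, while a block that is not re-written ages out after the assumed two-cycle margin.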

    Synchronizing metadata in a data storage platform comprising multiple computer nodes

    Publication No.: US11500821B2

    Publication date: 2022-11-15

    Application No.: US16919630

    Filing date: 2020-07-02

    Abstract: A client machine writes to a virtual disk on a remote storage platform. Metadata is generated and stored in replicas on different nodes of the storage platform. A modified log-structured merge tree is used to store and compact string-sorted tables of metadata. During file storage and compaction, a consistent file identification scheme is used across all metadata nodes. A fingerprint file is calculated for each SST (metadata) file on disk that includes hash values corresponding to regions of the SST file. To synchronize, the fingerprint files of two SST files are compared, and if any hash values are missing from a fingerprint file then the key-value-timestamp triplets corresponding to these missing hash values are sent to the SST file that is missing them. The SST file is compacted with the missing triplets to create a new version of the SST file. The synchronization is bi-directional as between distinct computer nodes.
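The fingerprint comparison and bi-directional exchange can be sketched in a few lines. This is a simplification under stated assumptions: each "region" is a single key-value-timestamp triplet, SHA-256 is the region hash, and compaction keeps the newest timestamp per key. None of these details come from the abstract, and the function names are hypothetical.

```python
import hashlib


def region_hash(triplet):
    """Hash one region; here a region is a single (key, value, timestamp)."""
    key, value, ts = triplet
    return hashlib.sha256(f"{key}\x00{value}\x00{ts}".encode()).hexdigest()


def fingerprint(sst):
    """Build a fingerprint: region hash -> triplet, for every region."""
    return {region_hash(t): t for t in sst}


def missing_triplets(src, dst):
    """Triplets of `src` whose hashes are absent from `dst`'s fingerprint."""
    dst_fp = fingerprint(dst)
    return [t for h, t in fingerprint(src).items() if h not in dst_fp]


def compact(sst, incoming):
    """Merge triplets, keep the newest timestamp per key, sort by key."""
    newest = {}
    for key, value, ts in list(sst) + list(incoming):
        if key not in newest or ts > newest[key][1]:
            newest[key] = (value, ts)
    return [(k, v, ts) for k, (v, ts) in sorted(newest.items())]


def synchronize(a, b):
    """Bi-directional sync: each side receives what it lacks, then compacts."""
    return compact(a, missing_triplets(b, a)), compact(b, missing_triplets(a, b))
```

For example, synchronizing `[("k1", "x", 1), ("k2", "y", 1)]` with `[("k2", "z", 2), ("k3", "w", 1)]` leaves both sides holding the same three keys, with `k2` resolved to the newer value.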

    GLOBAL DE-DUPLICATION OF VIRTUAL DISKS IN A STORAGE PLATFORM

    Publication No.: US20220222017A1

    Publication date: 2022-07-14

    Application No.: US17707077

    Filing date: 2022-03-29

    Abstract: In order to avoid writing duplicates of blocks of data into a storage platform, any virtual disk within the storage platform may have a de-duplication feature enabled; alternatively, all virtual disks may have this feature enabled. For virtual disks with de-duplication enabled, a unique message digest is calculated for every block of data written to that virtual disk. Upon a write, these message digests are consulted to determine whether a particular block of data has already been written; if so, it is not written again, and if not, it is written. All de-duplication virtual disks are written to a single system virtual disk within the storage platform. De-duplication occurs over the entire storage platform and over all its virtual disks because all message digests are consulted before a write is performed for any virtual disk. A read for a de-duplication virtual disk reads from the system virtual disk.

    Global de-duplication of virtual disks in a storage platform

    Publication No.: US12093575B2

    Publication date: 2024-09-17

    Application No.: US18205448

    Filing date: 2023-06-02

    CPC classification number: G06F3/0664 G06F3/0608 G06F3/0641 G06F3/0683

    Abstract: In order to avoid writing duplicates of blocks of data into a storage platform, any virtual disk within the storage platform may have a de-duplication feature enabled; alternatively, all virtual disks may have this feature enabled. For virtual disks with de-duplication enabled, a unique message digest is calculated for every block of data written to that virtual disk. Upon a write, these message digests are consulted to determine whether a particular block of data has already been written; if so, it is not written again, and if not, it is written. All de-duplication virtual disks are written to a single system virtual disk within the storage platform. De-duplication occurs over the entire storage platform and over all its virtual disks because all message digests are consulted before a write is performed for any virtual disk. A read for a de-duplication virtual disk reads from the system virtual disk.

    IN-FLIGHT DATA ENCRYPTION/DECRYPTION FOR A DISTRIBUTED STORAGE PLATFORM

    Publication No.: US20230014437A1

    Publication date: 2023-01-19

    Application No.: US17941929

    Filing date: 2022-09-09

    Abstract: Encryption of data occurs before it is written to the storage platform; decryption occurs after it is read from the storage platform on a computer separate from the storage platform. By encrypting data before it travels over a wide-area network to a storage platform (and by only decrypting that data once it has arrived at an enterprise from the storage platform), we address data security over the network. Application data is encrypted at the virtual disk level before it leaves a controller virtual machine, and is only decrypted at that controller virtual machine after being received from the storage platform. Encryption and decryption of data is compatible with other services of the storage system such as de-duplication. Any number of key management services can be used in a transparent manner.

    GLOBAL DE-DUPLICATION OF VIRTUAL DISKS IN A STORAGE PLATFORM

    Publication No.: US20210004181A1

    Publication date: 2021-01-07

    Application No.: US17028164

    Filing date: 2020-09-22

    Abstract: In order to avoid writing duplicates of blocks of data into a storage platform, any virtual disk within the storage platform may have a de-duplication feature enabled; alternatively, all virtual disks may have this feature enabled. For virtual disks with de-duplication enabled, a unique message digest is calculated for every block of data written to that virtual disk. Upon a write, these message digests are consulted to determine whether a particular block of data has already been written; if so, it is not written again, and if not, it is written. All de-duplication virtual disks are written to a single system virtual disk within the storage platform. De-duplication occurs over the entire storage platform and over all its virtual disks because all message digests are consulted before a write is performed for any virtual disk. A read for a de-duplication virtual disk reads from the system virtual disk.
