Moving Window Data Deduplication in Distributed Storage

    公开(公告)号:US20230376470A1

    公开(公告)日:2023-11-23

    申请号:US18226314

    申请日:2023-07-26

    Applicant: Google LLC

    Abstract: The present disclosure describes a service which provides primary in-line deduplication. A streaming application program interface (API) may allow for streaming records into a storage system with high throughput and low latency. As part of this process, the API allows user to add identifiers as a field used for data deduplication. The deduplication service keeps a moving window of the identifiers in memory and does in-line deduplication by quickly determining whether data is a duplicate. Keeping only deduplication keys in memory reduces the cost of running the service. Moreover, the real-time nature of the moving window approach allows for storing deduplication information alongside the data and accessing it immediately on read. In this regard, read after write consistency is supported, and costs are reduced.

    Moving Window Data Deduplication in Distributed Storage

    公开(公告)号:US20220365914A1

    公开(公告)日:2022-11-17

    申请号:US17876660

    申请日:2022-07-29

    Applicant: Google LLC

    Abstract: The present disclosure describes a service which provides primary in-line deduplication. A streaming application program interface (API) may allow for streaming records into a storage system with high throughput and low latency. As part of this process, the API allows user to add identifiers as a field used for data deduplication. The deduplication service keeps a moving window of the identifiers in memory and does in-line deduplication by quickly determining whether data is a duplicate. Keeping only deduplication keys in memory reduces the cost of running the service. Moreover, the real-time nature of the moving window approach allows for storing deduplication information alongside the data and accessing it immediately on read. In this regard, read after write consistency is supported, and costs are reduced.

    Moving window data deduplication in distributed storage

    公开(公告)号:US11442911B2

    公开(公告)日:2022-09-13

    申请号:US17007495

    申请日:2020-08-31

    Applicant: Google LLC

    Abstract: The present disclosure describes a service which provides primary in-line deduplication. A streaming application program interface (API) may allow for streaming records into a storage system with high throughput and low latency. As part of this process, the API allows user to add identifiers as a field used for data deduplication. The deduplication service keeps a moving window of the identifiers in memory and does in-line deduplication by quickly determining whether data is a duplicate. Keeping only deduplication keys in memory reduces the cost of running the service. Moreover, the real-time nature of the moving window approach allows for storing deduplication information alongside the data and accessing it immediately on read. In this regard, read after write consistency is supported, and costs are reduced.

    Moving window data deduplication in distributed storage

    公开(公告)号:US11762821B2

    公开(公告)日:2023-09-19

    申请号:US17876660

    申请日:2022-07-29

    Applicant: Google LLC

    Abstract: The present disclosure describes a service which provides primary in-line deduplication. A streaming application program interface (API) may allow for streaming records into a storage system with high throughput and low latency. As part of this process, the API allows user to add identifiers as a field used for data deduplication. The deduplication service keeps a moving window of the identifiers in memory and does in-line deduplication by quickly determining whether data is a duplicate. Keeping only deduplication keys in memory reduces the cost of running the service. Moreover, the real-time nature of the moving window approach allows for storing deduplication information alongside the data and accessing it immediately on read. In this regard, read after write consistency is supported, and costs are reduced.

    Moving Window Data Deduplication in Distributed Storage

    公开(公告)号:US20220067006A1

    公开(公告)日:2022-03-03

    申请号:US17007495

    申请日:2020-08-31

    Applicant: Google LLC

    Abstract: The present disclosure describes a service which provides primary in-line deduplication. A streaming application program interface (API) may allow for streaming records into a storage system with high throughput and low latency. As part of this process, the API allows user to add identifiers as a field used for data deduplication. The deduplication service keeps a moving window of the identifiers in memory and does in-line deduplication by quickly determining whether data is a duplicate. Keeping only deduplication keys in memory reduces the cost of running the service. Moreover, the real-time nature of the moving window approach allows for storing deduplication information alongside the data and accessing it immediately on read. In this regard, read after write consistency is supported, and costs are reduced.

Patent Agency Ranking