Patent search ap:("Google LLC") AND inv:"Pavan Edara" Page 3

21.

Granted invention patent
Scalable exactly-once data processing using transactional streaming writes

Legal status: Active

Publication number: US11573876B2

Publication date: 2023-02-07

Application number: US17085576

Filing date: 2020-10-30

Applicant: Google LLC

Inventor: Pavan Edara, Reuven Lax, Yi Yang, Gurpreet Singh Nanda

IPC: G06F11/14, G06F11/30, G06F9/30, G06F9/46, G06F11/07, G06F12/02

Abstract: A method for processing data exactly once using transactional stream writes includes receiving, from a client, a batch of data blocks for storage on memory hardware in communication with the data processing hardware. The batch of data blocks is associated with a corresponding sequence number and represents a number of rows of a table stored on the memory hardware. The method also includes partitioning the batch of data blocks into a plurality of sub-batches of data blocks. For each sub-batch of data blocks, the method further includes assigning the sub-batch of data blocks to a buffered stream; writing, using the assigned buffered stream, the sub-batch of data blocks to the memory hardware; updating a storage log with an intent to commit the sub-batch of data blocks using the assigned buffered stream; and committing the sub-batch of data blocks to the memory hardware.
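The steps in this abstract can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: the `BufferedStream` and `Table` classes, the round-robin stream assignment, the fixed sub-batch size, and the rule that a repeated sequence number is ignored are assumptions made for illustration, not details taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class BufferedStream:
    """Hypothetical buffered stream: rows are staged, then made visible on commit."""
    staged: list = field(default_factory=list)
    committed: list = field(default_factory=list)

    def write(self, rows):
        self.staged.extend(rows)

    def commit(self):
        self.committed.extend(self.staged)
        self.staged.clear()


@dataclass
class Table:
    streams: list
    storage_log: list = field(default_factory=list)  # intent-to-commit records
    seen_sequence_numbers: set = field(default_factory=set)

    def write_batch(self, sequence_number, rows, sub_batch_size=2):
        # Assumption: a retried batch with a known sequence number is skipped.
        if sequence_number in self.seen_sequence_numbers:
            return False
        # Partition the batch into sub-batches.
        sub_batches = [rows[i:i + sub_batch_size]
                       for i in range(0, len(rows), sub_batch_size)]
        for index, sub_batch in enumerate(sub_batches):
            stream_id = index % len(self.streams)  # assumed round-robin assignment
            stream = self.streams[stream_id]
            stream.write(sub_batch)
            # Record the intent to commit before committing.
            self.storage_log.append((sequence_number, stream_id, len(sub_batch)))
            stream.commit()
        self.seen_sequence_numbers.add(sequence_number)
        return True

    def rows(self):
        return [row for stream in self.streams for row in stream.committed]
```

In this sketch, calling `write_batch` twice with the same sequence number leaves the table contents unchanged the second time, which is one simple way to model the exactly-once behavior the title refers to.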

22.

Invention application
Execution-Time Dynamic Range Partitioning Transformations

Legal status: Active

Publication number: US20220358142A1

Publication date: 2022-11-10

Application number: US17814694

Filing date: 2022-07-25

Applicant: Google LLC

Inventor: Seyed Omid Fatemieh, Mikhail Entin, Adrian Baras, Pavan Edara, Aleksandras Surna

IPC: G06F16/27, G06F16/23

Abstract: An example method includes receiving a data load request requesting loading and partitioning of an unknown quantity of user data for storage at a data storage system. The user data includes a partitioning key; a total data size of the user data; a plurality of rows, each row of the plurality of rows associated with a value defined by the partitioning key; and one or more columns. The method also includes identifying one or more storage constraints for the data storage system. The method further includes, after receiving the user data, determining a plurality of partitioning quantiles defining respective ranges of values of the partitioning key based on the user data and the one or more storage constraints for the data storage system; and range partitioning each row of the user data into files based on the value associated with the row defined by the partitioning key, and the respective ranges of the values of the partitioning key defined by the plurality of partitioning quantiles.
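As a rough illustration of quantile-based range partitioning, the sketch below derives boundaries from the observed key values and then routes each row to a file. The function names and the choice of "maximum rows per file" as the storage constraint are assumptions for this example only; the abstract does not say which constraints are used or how the quantiles are computed.

```python
import bisect


def partition_boundaries(keys, max_rows_per_file):
    """Pick quantile boundaries aiming for about max_rows_per_file rows per range."""
    ordered = sorted(keys)
    num_files = -(-len(ordered) // max_rows_per_file)  # ceiling division
    # boundaries[i] is the smallest key that belongs to file i + 1.
    return [ordered[(i * len(ordered)) // num_files] for i in range(1, num_files)]


def range_partition(rows, key, boundaries):
    """Assign each row to the file whose key range contains row[key]."""
    files = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        files[bisect.bisect_right(boundaries, row[key])].append(row)
    return files
```

With distinct keys the ranges come out evenly sized; heavily repeated key values can push more rows into one range than the target, which a real system would need to handle separately.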

23.

Invention application
Synchronous Replication Of High Throughput Streaming Data

Legal status: Active

Publication number: US20220155972A1

Publication date: 2022-05-19

Application number: US17098306

Filing date: 2020-11-13

Applicant: Google LLC

Inventor: Pavan Edara, Jonathan Forbes

IPC: G06F3/06, G06F11/30, G06F11/14

Abstract: A method for synchronous replication of stream data includes receiving a stream of data blocks for storage at a first storage location associated with a first geographical region and at a second storage location associated with a second geographical region. The method also includes synchronously writing the stream of data blocks to the first storage location and to the second storage location. While synchronously writing the stream of data blocks, the method includes determining an unrecoverable failure at the second storage location. The method also includes determining a failure point in the writing of the stream of data blocks that demarcates the data blocks that were successfully written to the second storage location from those that were not. The method also includes synchronously writing, starting at the failure point, the stream of data blocks to the first storage location and to a third storage location associated with a third geographical region.
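The failover described in this abstract can be sketched with three in-memory "regions". The `Region` class, its `fail_after` switch for simulating an unrecoverable failure, and the `replicate` function are all hypothetical names introduced for illustration; the sketch also handles only a single failure of the second location.

```python
class Region:
    """Hypothetical storage location; fails permanently after fail_after writes."""

    def __init__(self, name, fail_after=None):
        self.name = name
        self.blocks = []
        self.fail_after = fail_after

    def write(self, block):
        if self.fail_after is not None and len(self.blocks) >= self.fail_after:
            raise IOError(f"{self.name} unavailable")
        self.blocks.append(block)


def replicate(stream, primary, secondary, spare):
    """Write each block to two regions; if the secondary fails, continue on spare.

    Returns the failure point (offset of the first block that was not written
    to the failed region), or None if no failure occurred.
    """
    failure_point = None
    for offset, block in enumerate(stream):
        primary.write(block)
        try:
            secondary.write(block)
        except IOError:
            failure_point = offset
            secondary = spare  # later blocks go to the third location
            secondary.write(block)
    return failure_point
```

In this model the third location receives only the blocks from the failure point onward; whether earlier blocks are later backfilled is outside what the abstract states and is not modeled here.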

24.

Invention application
Moving Window Data Deduplication in Distributed Storage

Legal status: Active

Publication number: US20220067006A1

Publication date: 2022-03-03

Application number: US17007495

Filing date: 2020-08-31

Applicant: Google LLC

Inventor: Pavlo Padinker, Pavan Edara, Bigang Li

IPC: G06F16/215, G06F16/22, G06F16/23, G06F12/0804

Abstract: The present disclosure describes a service which provides primary in-line deduplication. A streaming application program interface (API) may allow for streaming records into a storage system with high throughput and low latency. As part of this process, the API allows users to add identifiers as a field used for data deduplication. The deduplication service keeps a moving window of the identifiers in memory and does in-line deduplication by quickly determining whether data is a duplicate. Keeping only deduplication keys in memory reduces the cost of running the service. Moreover, the real-time nature of the moving window approach allows for storing deduplication information alongside the data and accessing it immediately on read. In this regard, read-after-write consistency is supported, and costs are reduced.
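A moving window of identifiers can be modeled as an ordered map from identifier to arrival time, with expired entries evicted on each lookup. The class below is a hypothetical sketch: the time-based window, the `window_seconds` parameter, and the assumption that `now` never decreases between calls are illustrative choices, not details from the disclosure.

```python
from collections import OrderedDict


class MovingWindowDeduplicator:
    """Keeps only recent identifiers in memory (sketch; assumes non-decreasing time)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._seen = OrderedDict()  # identifier -> first arrival time, oldest first

    def is_duplicate(self, identifier, now):
        # Evict identifiers that have fallen out of the window.
        while self._seen:
            oldest_time = next(iter(self._seen.values()))
            if now - oldest_time < self.window_seconds:
                break
            self._seen.popitem(last=False)
        if identifier in self._seen:
            return True
        self._seen[identifier] = now
        return False
```

Only the identifiers are held in memory, not the records themselves, which mirrors the abstract's point that keeping just the deduplication keys reduces cost.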

25.

Invention application
Zero Copy Optimization for SELECT * Queries

Legal status: Active

Publication number: US20210357404A1

Publication date: 2021-11-18

Application number: US17315281

Filing date: 2021-05-08

Applicant: Google LLC

Inventor: Pavan Edara, Jordan Tigani

IPC: G06F16/2453, G06F16/22, G06F16/242

Abstract: A computer-implemented method includes receiving a query specifying an operation to perform on a first table of a plurality of data blocks stored. Each data block in the first table includes a respective reference count indicating a number of tables referencing the data block. The method also includes determining that the operation specified by the query includes copying the plurality of data blocks in the first table into a second table and, in response, for each data block of the plurality of data blocks in the first table copied into the second table, incrementing the respective reference count associated with the data block in the first table and appending, by the data processing hardware, into metadata of the second table, a reference to the corresponding data block copied into the second table.
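The reference-counting idea in this abstract can be shown with a small sketch in which a "copy" shares blocks instead of duplicating them. `DataBlock`, `TableMetadata`, and `zero_copy_select_all` are hypothetical names chosen for illustration; query parsing and the check that the query is a full-table copy are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class DataBlock:
    payload: bytes
    reference_count: int = 1  # number of tables referencing this block


@dataclass
class TableMetadata:
    block_refs: list = field(default_factory=list)


def zero_copy_select_all(source, destination):
    """Model a full-table copy by sharing blocks rather than copying payloads."""
    for block in source.block_refs:
        block.reference_count += 1            # one more table references the block
        destination.block_refs.append(block)  # metadata-only append, no data copy
```

After the call, both tables point at the same block objects, so no payload bytes are duplicated; a block would only become eligible for deletion once its reference count drops to zero.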

26.

Granted invention patent
Execution-time dynamic range partitioning transformations

Legal status: Active

Publication number: US12158898B2

Publication date: 2024-12-03

Application number: US17814694

Filing date: 2022-07-25

Applicant: Google LLC

Inventor: Seyed Omid Fatemieh, Mikhail Entin, Adrian Baras, Pavan Edara, Aleksandras Surna

IPC: G06F16/27, G06F16/23

Abstract: An example method includes receiving a data load request requesting loading and partitioning of an unknown quantity of user data for storage at a data storage system. The user data includes a partitioning key; a total data size of the user data; a plurality of rows, each row of the plurality of rows associated with a value defined by the partitioning key; and one or more columns. The method also includes identifying one or more storage constraints for the data storage system. The method further includes, after receiving the user data, determining a plurality of partitioning quantiles defining respective ranges of values of the partitioning key based on the user data and the one or more storage constraints for the data storage system; and range partitioning each row of the user data into files based on the value associated with the row defined by the partitioning key, and the respective ranges of the values of the partitioning key defined by the plurality of partitioning quantiles.

27.

Published invention application
SCALABLE EXACTLY-ONCE DATA PROCESSING USING TRANSACTIONAL STREAMING WRITES

Legal status: Under examination (published)

Publication number: US20240143469A1

Publication date: 2024-05-02

Application number: US18391229

Filing date: 2023-12-20

Applicant: Google LLC

Inventor: Pavan Edara, Reuven Lax, Ji Yang, Gurpreet Singh Nanda

IPC: G06F11/30, G06F9/30, G06F9/46, G06F11/07, G06F11/14, G06F12/02

CPC classification number: G06F11/3034, G06F9/30047, G06F9/467, G06F11/0757, G06F11/0772, G06F11/1402, G06F12/0246, G06F12/0253, G06F2201/84

Abstract: A method for processing data exactly once using transactional stream writes includes receiving, from a client, a batch of data blocks for storage on memory hardware in communication with the data processing hardware. The batch of data blocks is associated with a corresponding sequence number and represents a number of rows of a table stored on the memory hardware. The method also includes partitioning the batch of data blocks into a plurality of sub-batches of data blocks. For each sub-batch of data blocks, the method further includes assigning the sub-batch of data blocks to a buffered stream; writing, using the assigned buffered stream, the sub-batch of data blocks to the memory hardware; updating a storage log with an intent to commit the sub-batch of data blocks using the assigned buffered stream; and committing the sub-batch of data blocks to the memory hardware.

28.

Published invention application
Moving Window Data Deduplication in Distributed Storage

Legal status: Under examination (published)

Publication number: US20230376470A1

Publication date: 2023-11-23

Application number: US18226314

Filing date: 2023-07-26

Applicant: Google LLC

Inventor: Pavlo Padinker, Pavan Edara, Bigang Li

IPC: G06F16/215, G06F16/22, G06F16/23, G06F12/0804

CPC classification number: G06F16/215, G06F16/2282, G06F16/2322, G06F16/235, G06F12/0804

Abstract: The present disclosure describes a service which provides primary in-line deduplication. A streaming application program interface (API) may allow for streaming records into a storage system with high throughput and low latency. As part of this process, the API allows users to add identifiers as a field used for data deduplication. The deduplication service keeps a moving window of the identifiers in memory and does in-line deduplication by quickly determining whether data is a duplicate. Keeping only deduplication keys in memory reduces the cost of running the service. Moreover, the real-time nature of the moving window approach allows for storing deduplication information alongside the data and accessing it immediately on read. In this regard, read-after-write consistency is supported, and costs are reduced.

29.

Published invention application
Zero Copy Optimization for SELECT * Queries

Legal status: Under examination (published)

Publication number: US20230229657A1

Publication date: 2023-07-20

Application number: US18185925

Filing date: 2023-03-17

Applicant: Google LLC

Inventor: Pavan Edara, Jordan Tigani

IPC: G06F16/2453, G06F16/22, G06F16/242

CPC classification number: G06F16/24535, G06F16/2282, G06F16/2445

Abstract: A computer-implemented method includes receiving a query specifying an operation to perform on a first table of a plurality of data blocks stored. Each data block in the first table includes a respective reference count indicating a number of tables referencing the data block. The method also includes determining that the operation specified by the query includes copying the plurality of data blocks in the first table into a second table and, in response, for each data block of the plurality of data blocks in the first table copied into the second table, incrementing the respective reference count associated with the data block in the first table and appending, by the data processing hardware, into metadata of the second table, a reference to the corresponding data block copied into the second table.

30.

Published invention application
Synchronous Replication Of High Throughput Streaming Data

Legal status: Under examination (published)

Publication number: US20230195331A1

Publication date: 2023-06-22

Application number: US18166834

Filing date: 2023-02-09

Applicant: Google LLC

Inventor: Pavan Edara, Jonathan Forbes

IPC: G06F3/06, G06F11/14, G06F11/30

CPC classification number: G06F3/0619, G06F3/064, G06F3/065, G06F3/067, G06F3/0635, G06F3/0653, G06F3/0659, G06F11/1435, G06F11/1446, G06F11/1471, G06F11/3034

Abstract: A method for synchronous replication of stream data includes receiving a stream of data blocks for storage at a first storage location associated with a first geographical region and at a second storage location associated with a second geographical region. The method also includes synchronously writing the stream of data blocks to the first storage location and to the second storage location. While synchronously writing the stream of data blocks, the method includes determining an unrecoverable failure at the second storage location. The method also includes determining a failure point in the writing of the stream of data blocks that demarcates the data blocks that were successfully written to the second storage location from those that were not. The method also includes synchronously writing, starting at the failure point, the stream of data blocks to the first storage location and to a third storage location associated with a third geographical region.
