-
Publication No.: US20230070710A1
Publication date: 2023-03-09
Application No.: US18054632
Filing date: 2022-11-11
Applicant: Google LLC
Inventor: Pavan Edara , Jonathan Forbes , Yang Yi
IPC: G06F16/2457 , G06F16/22 , G06F16/2458 , G06F16/23 , G06F16/248
Abstract: A method for managing data processing includes receiving, from a user of a data query system, a data query for data stored in a data store in communication with the data query system. The method also includes receiving a staleness parameter indicating an upper time boundary for the data query. The upper time boundary limits a query response to data within the data store that is older than the upper time boundary. The method further includes determining whether the data stored within the data store satisfies the staleness parameter. When a portion of the data within the data store fails to satisfy the staleness parameter, the method includes generating the query response that excludes the portion of the data that fails to satisfy the staleness parameter.
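As a rough illustration of the staleness check this abstract describes, the following Python sketch filters rows by write time. The `Row` type, the `query_with_staleness` function, the use of a `timedelta` measured back from `now`, and the inclusive boundary are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List


@dataclass(frozen=True)
class Row:
    # Hypothetical record: a value plus the time it was written to the store.
    value: str
    written_at: datetime


def query_with_staleness(
    rows: List[Row],
    predicate: Callable[[Row], bool],
    now: datetime,
    staleness: timedelta,
) -> List[Row]:
    """Return matching rows that are at least `staleness` old.

    The upper time boundary is computed as `now - staleness`. Rows written
    after that boundary fail the staleness parameter and are excluded.
    Treating the boundary itself as satisfying the parameter is an assumption.
    """
    upper_bound = now - staleness
    return [r for r in rows if r.written_at <= upper_bound and predicate(r)]


now = datetime(2023, 1, 1, 12, 0, 0)
rows = [
    Row("a", now - timedelta(minutes=30)),
    Row("b", now - timedelta(minutes=5)),   # too recent, excluded
    Row("c", now - timedelta(minutes=10)),
]
result = query_with_staleness(rows, lambda r: True, now, timedelta(minutes=10))
```

In this sketch the query still succeeds when some rows are too recent; those rows are simply left out of the response, which mirrors the final sentence of the abstract.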
-
Publication No.: US20220374455A1
Publication date: 2022-11-24
Application No.: US17817147
Filing date: 2022-08-03
Applicant: Google LLC
Inventor: Hua Zhang , Pavan Edara , Nhan Nguyen
Abstract: A method for shuffle-less reclustering of clustered tables includes receiving a first and second group of clustered data blocks sorted by a clustering key value. A range of clustering key values of one or more of the data blocks in the second group overlaps with the range of clustering key values of a data block in the first group. The method also includes generating split points for partitioning the first and second groups of clustered data blocks into a third group. The method also includes partitioning, using the split points, the first and second groups into the third group. Each data block in the third group includes a range of clustering key values that does not overlap with that of any other data block in the third group. Each split point defines an upper limit or lower limit for the range of clustering key values of a data block in the third group.
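The partitioning step can be sketched in Python as follows, assuming each data block is a sorted list of clustering key values. The function names and the way split points are chosen (evenly spaced keys taken from a full sort of all keys) are assumptions for illustration only; the abstract does not say how split points are generated, and a real system would presumably derive them from block metadata or samples rather than a full sort.

```python
import bisect
import heapq
from typing import List


def choose_split_points(blocks: List[List[int]], num_output_blocks: int) -> List[int]:
    # Assumption: pick evenly spaced keys from the combined key set.
    keys = sorted(k for block in blocks for k in block)
    if not keys:
        return []
    step = len(keys) / num_output_blocks
    return [keys[int(i * step)] for i in range(1, num_output_blocks)]


def recluster(first_group: List[List[int]], second_group: List[List[int]],
              num_output_blocks: int) -> List[List[int]]:
    """Partition two groups of sorted blocks into non-overlapping sorted blocks."""
    blocks = first_group + second_group
    splits = choose_split_points(blocks, num_output_blocks)
    # Each output block covers the half-open key range [lo, hi).
    bounds = [None] + splits + [None]
    output = []
    for lo, hi in zip(bounds, bounds[1:]):
        slices = []
        for block in blocks:
            # Input blocks are already sorted, so binary search finds the slice.
            start = 0 if lo is None else bisect.bisect_left(block, lo)
            end = len(block) if hi is None else bisect.bisect_left(block, hi)
            if start < end:
                slices.append(block[start:end])
        merged = list(heapq.merge(*slices))  # merge sorted slices in key order
        if merged:
            output.append(merged)
    return output


first = [[1, 3, 5], [7, 9, 11]]
second = [[2, 4, 6], [8, 10, 12]]
third = recluster(first, second, 3)
```

Because every input block is sliced at the same split points, each output block only reads the slices that fall inside its own key range, and no two output blocks share a key.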
-
Publication No.: US11423049B2
Publication date: 2022-08-23
Application No.: US16872238
Filing date: 2020-05-11
Applicant: Google LLC
Inventor: Seyed Omid Fatemieh , Mikhail Entin , Adrian Baras , Pavan Edara , Aleksandras Surna
Abstract: A method for execution-time dynamic range partitioning includes receiving user data including a partitioning key and a clustering key. The user data includes a respective number of total rows defining a total data size for the user data. The method also includes identifying storage constraints for the data storage system. The storage constraints include a target file size and a target number of rows per file. The method further includes determining a plurality of split points for the user data based on the storage constraints. The method also includes generating partitioning quantiles from the plurality of split points that define a range between each split point of the plurality of split points. The method further includes range partitioning each row of the user data into files using the partitioning quantiles.
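A minimal Python sketch of the flow in this abstract, under stated assumptions: the file count is taken as the larger of the counts implied by the target file size and the target rows per file, split points are evenly spaced quantiles of the key values, and a single key stands in for the partitioning and clustering keys. All function and parameter names are hypothetical.

```python
import bisect
import math
from typing import Callable, List, Sequence, Tuple


def plan_num_files(total_rows: int, total_bytes: int,
                   target_file_bytes: int, target_rows_per_file: int) -> int:
    # Assumption: honor whichever storage constraint demands more files.
    by_size = math.ceil(total_bytes / target_file_bytes)
    by_rows = math.ceil(total_rows / target_rows_per_file)
    return max(1, by_size, by_rows)


def split_points(keys: Sequence[int], num_files: int) -> List[int]:
    # Evenly spaced quantiles of the sorted keys; num_files - 1 boundaries.
    ordered = sorted(keys)
    return [ordered[(i * len(ordered)) // num_files] for i in range(1, num_files)]


def range_partition(rows: Sequence[Tuple[int, str]],
                    key_of: Callable[[Tuple[int, str]], int],
                    points: List[int]) -> List[List[Tuple[int, str]]]:
    # A row goes to the file whose key range contains its key.
    files: List[List[Tuple[int, str]]] = [[] for _ in range(len(points) + 1)]
    for row in rows:
        files[bisect.bisect_right(points, key_of(row))].append(row)
    return files


rows = [(k, "payload") for k in [7, 2, 9, 4, 1, 8, 3, 6, 0, 5]]
num_files = plan_num_files(total_rows=10, total_bytes=1000,
                           target_file_bytes=400, target_rows_per_file=4)
points = split_points([k for k, _ in rows], num_files)
files = range_partition(rows, lambda r: r[0], points)
```

With these inputs both constraints imply three files, so two split points are produced and every row lands in exactly one of three key ranges.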
-
Publication No.: US20210382892A1
Publication date: 2021-12-09
Application No.: US17445422
Filing date: 2021-08-19
Applicant: Google LLC
Inventor: Pavan Edara , Yang Yi
IPC: G06F16/2458 , G06F16/2455
Abstract: A method for managing metadata for a transactional storage system includes receiving a query request at a snapshot timestamp. The query request requests return of at least one data block from a plurality of data blocks. Each data block includes a corresponding write epoch timestamp and a corresponding conversion indicator indicating whether the data block is active or has been converted at a respective conversion timestamp. The method also includes setting a read epoch timestamp equal to the earliest one of the write epoch timestamps and determining whether any of the respective conversion timestamps occurring at or before the snapshot timestamp occur after the read epoch timestamp. The method also includes determining the at least one data block requested by the query request by scanning each of the data blocks including corresponding write epoch timestamps occurring at or after the read epoch timestamp.
-
Publication No.: US20210319031A1
Publication date: 2021-10-14
Application No.: US16848833
Filing date: 2020-04-14
Applicant: Google LLC
Inventor: Pavan Edara , Jonathan Forbes , Yang Yi
IPC: G06F16/2457 , G06F16/22 , G06F16/2458 , G06F16/248 , G06F16/23
Abstract: A method for managing data processing includes receiving, from a user of a data query system, a data query for data stored in a data store in communication with the data query system. The method also includes receiving a staleness parameter indicating an upper time boundary for the data query. The upper time boundary limits a query response to data within the data store that is older than the upper time boundary. The method further includes determining whether the data stored within the data store satisfies the staleness parameter. When a portion of the data within the data store fails to satisfy the staleness parameter, the method includes generating the query response that excludes the portion of the data that fails to satisfy the staleness parameter.
-
Publication No.: US11113296B1
Publication date: 2021-09-07
Application No.: US16848780
Filing date: 2020-04-14
Applicant: Google LLC
Inventor: Pavan Edara , Yang Yi
IPC: G06F16/24 , G06F16/2458 , G06F16/2455
Abstract: A method for managing metadata for a transactional storage system includes receiving a query request at a snapshot timestamp. The query request requests return of at least one data block from a plurality of data blocks. Each data block includes a corresponding write epoch timestamp and a corresponding conversion indicator indicating whether the data block is active or has been converted at a respective conversion timestamp. The method also includes setting a read epoch timestamp equal to the earliest one of the write epoch timestamps and determining whether any of the respective conversion timestamps occurring at or before the snapshot timestamp occur after the read epoch timestamp. The method also includes determining the at least one data block requested by the query request by scanning each of the data blocks including corresponding write epoch timestamps occurring at or after the read epoch timestamp.