-
Publication No.: US20230409574A1
Publication Date: 2023-12-21
Application No.: US18362898
Filing Date: 2023-07-31
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Fabian Hueske, Tyler Jones, Daniel Mills, Leon Papke, Prasanna Rajaperumal, Daniel E. Sotolongo
IPC: G06F16/2453, G06F7/14
CPC classification number: G06F16/24539, G06F7/14, G06F16/24542
Abstract: A system for a materialized table (MT) refresh using multiple processing pipelines includes at least one hardware processor coupled to memory storing instructions. The instructions cause the at least one hardware processor to perform operations including determining dependencies among a plurality of intermediate MTs generated from a source MT. The source MT uses a table definition with a query on one or more base tables and a lag duration value. A graph snapshot of dependencies among the plurality of intermediate MTs is generated. Processing pipelines are configured. Each of the processing pipelines corresponds to a subset of the plurality of intermediate MTs indicated by the graph snapshot. Responsive to detecting an instruction for a refresh operation on the source MT, refreshes on corresponding intermediate MTs of the plurality of intermediate MTs in each processing pipeline of the processing pipelines are performed to complete the refresh operation on the source MT.
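Illustrative sketch (not part of the patent text): the abstract does not say how the intermediate MTs are divided among pipelines. The Python below assumes one plausible rule, grouping MTs by dependency level so that every MT in a group depends only on MTs in earlier groups. The names `snapshot_graph` and `plan_pipelines`, and the MT names in the usage note, are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def snapshot_graph(deps):
    """Freeze the dependency graph so later edits to `deps` cannot
    change a refresh that is already planned."""
    return {mt: frozenset(upstream) for mt, upstream in deps.items()}


def plan_pipelines(snapshot):
    """Split the MTs in a graph snapshot into ordered groups.

    `snapshot` maps each MT name to the MTs it reads from.
    Assumption: one pipeline per dependency level; the patent may
    partition the graph differently.
    """
    sorter = TopologicalSorter(snapshot)
    sorter.prepare()  # raises graphlib.CycleError on cyclic dependencies
    pipelines = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # MTs whose inputs are all refreshed
        pipelines.append(ready)
        sorter.done(*ready)
    return pipelines
```

For example, with `mt_b` and `mt_c` both reading from `mt_a`, and `mt_d` reading from both of them, `plan_pipelines` yields three groups: `mt_a` first, then `mt_b` and `mt_c` together, then `mt_d`.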
-
Publication No.: US11755568B1
Publication Date: 2023-09-12
Application No.: US17931705
Filing Date: 2022-09-13
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Tyler Jones, Daniel Mills, Leon Papke, Prasanna Rajaperumal, Daniel E. Sotolongo
CPC classification number: G06F16/2393, G06F11/3419
Abstract: Provided herein are systems and methods for a database object (e.g., materialized table) configuration including scheduling refreshes of the materialized table. For example, a method includes determining a dependency graph for a first MT. The dependency graph comprises a second MT on which the first MT depends. The first MT includes a query on one or more base tables and a lag duration value. The lag duration value indicates a maximum time period that a result of a prior refresh of the query can lag behind a current time instance. A tick period is selected for a set of ticks based on the lag duration value. The set of ticks corresponds to a set of aligned time instances. Refresh operations are scheduled for the first and second MTs at corresponding time instances from the set of aligned time instances. The corresponding time instances are separated by the tick period.
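Illustrative sketch (not part of the patent text): the abstract says a tick period is chosen from the lag duration and that ticks fall on aligned time instances, but gives no formula. The Python below assumes the period is the largest power-of-two multiple of a base period that does not exceed the lag, and that ticks are whole multiples of the period counted from time zero. The 60-second base period and the function names are hypothetical, and the sketch ignores how long a refresh itself takes.

```python
def select_tick_period(lag_seconds, base_period=60):
    """Largest base_period * 2**k that is <= lag_seconds.

    Assumption: restricting periods to power-of-two multiples makes
    every tick of a coarse period coincide with a tick of each finer
    period, which is one way to obtain aligned time instances.
    """
    if lag_seconds < base_period:
        raise ValueError("lag is shorter than the base tick period")
    period = base_period
    while period * 2 <= lag_seconds:
        period *= 2
    return period


def next_ticks(now, period, count):
    """The next `count` tick times strictly after `now` (in seconds)."""
    first = (now // period + 1) * period
    return [first + i * period for i in range(count)]
```

Under these assumptions, an MT with a 300-second lag gets a 240-second tick period, and an upstream MT refreshed every 120 seconds has a tick at each of those same instants, so both can be refreshed against the same point in time.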
-
Publication No.: US12242457B2
Publication Date: 2025-03-04
Application No.: US18459256
Filing Date: 2023-08-31
Applicant: Snowflake Inc.
Inventor: Istvan Cseri, Tyler Jones, Daniel Mills, Daniel E. Sotolongo
IPC: G06F16/23, G06F16/22, G06F16/2455, G06F16/27
Abstract: Provided herein are systems and methods for a stream object configuration, including query processing of stream objects using stream expansion. For example, a method includes decoding a query to obtain a first data processing operation and a first stream object. The first stream object is associated with a view on a base table. A first stream expansion on the first stream object is performed. The first stream expansion is based on generating a second stream object on the base table. A second stream expansion of the second stream object is performed. The second stream expansion is based on replacing the second stream object with at least a second data processing operation. The query is executed based on completing the first data processing operation and the at least a second data processing operation.
-
Publication No.: US12216654B2
Publication Date: 2025-02-04
Application No.: US18362898
Filing Date: 2023-07-31
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Fabian Hueske, Tyler Jones, Daniel Mills, Leon Papke, Prasanna Rajaperumal, Daniel E. Sotolongo
IPC: G06F16/2453, G06F7/14
Abstract: A system for a materialized table (MT) refresh using multiple processing pipelines includes at least one hardware processor coupled to memory storing instructions. The instructions cause the at least one hardware processor to perform operations including determining dependencies among a plurality of intermediate MTs generated from a source MT. The source MT uses a table definition with a query on one or more base tables and a lag duration value. A graph snapshot of dependencies among the plurality of intermediate MTs is generated. Processing pipelines are configured. Each of the processing pipelines corresponds to a subset of the plurality of intermediate MTs indicated by the graph snapshot. Responsive to detecting an instruction for a refresh operation on the source MT, refreshes on corresponding intermediate MTs of the plurality of intermediate MTs in each processing pipeline of the processing pipelines are performed to complete the refresh operation on the source MT.
-
Publication No.: US20240232224A1
Publication Date: 2024-07-11
Application No.: US18610863
Filing Date: 2024-03-20
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Istvan Cseri, Tyler Jones, Dinesh Chandrakant Kulkarni, Daniel Mills, Daniel E. Sotolongo, Di Fei Zhang
IPC: G06F16/27
CPC classification number: G06F16/273
Abstract: Techniques for triggering pipeline execution based on data change (transaction commit) are described. The pipelines can be used for data ingestion or other specified tasks. These tasks can be operational across account, organization, cloud region, and cloud provider boundaries. The tasks can be triggered by commit post-processing. Gates in the tasks can be set up to reference change data capture information. If the gate is satisfied, tasks can be executed to set up data pipelines.
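Illustrative sketch (not part of the patent text): a minimal Python model of gated tasks evaluated during commit post-processing. The `ChangeInfo` fields, the `GatedTask` type, and `on_commit` are hypothetical names, and the change data capture information is reduced to simple row counts.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class ChangeInfo:
    """Stand-in for change data capture information about one commit."""
    table: str
    rows_inserted: int
    rows_deleted: int


@dataclass
class GatedTask:
    name: str
    gate: Callable[[ChangeInfo], bool]  # predicate over the change info
    run: Callable[[ChangeInfo], None]   # work to do when the gate is satisfied


def on_commit(change: ChangeInfo, tasks: List[GatedTask]) -> List[str]:
    """Commit post-processing hook: run every task whose gate is
    satisfied and return the names of the tasks that ran."""
    executed = []
    for task in tasks:
        if task.gate(change):
            task.run(change)
            executed.append(task.name)
    return executed
```

A task such as "ingest new orders" would then use a gate like `lambda c: c.table == "orders" and c.rows_inserted > 0`, so it runs only after a commit that actually inserted rows into that table.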
-
Publication No.: US20230367757A1
Publication Date: 2023-11-16
Application No.: US18359322
Filing Date: 2023-07-26
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Istvan Cseri, Tyler Jones, Daniel E. Sotolongo
IPC: G06F16/22, G06F16/23, G06F16/2455
CPC classification number: G06F16/2282, G06F16/2358, G06F16/24568
Abstract: A system or persistent table may be generated storing changelog information of a primary base table. The system table may then be used to create streams of relevant information. In some examples, the streams may read from the system table for information past a retention period of the primary table while reading from the primary table information in the retention period.
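Illustrative sketch (not part of the patent text): a minimal Python model of a stream read that takes changes older than the retention period from the changelog (system) table and changes inside the retention period from the primary table. The `Change` record, the integer timestamps, and `read_stream` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    ts: int      # commit timestamp of the change
    row: str     # identifier of the affected row
    action: str  # e.g. "INSERT" or "DELETE"


def read_stream(primary_changes, changelog_changes, start_ts, now, retention):
    """Return the changes in [start_ts, now], routing by age.

    Changes older than the retention cutoff come from the changelog
    table; changes at or after the cutoff come from the primary table.
    Splitting at a single cutoff keeps a change recorded in both
    sources from being returned twice.
    """
    cutoff = now - retention
    old = [c for c in changelog_changes if start_ts <= c.ts < cutoff]
    recent = [c for c in primary_changes if max(start_ts, cutoff) <= c.ts <= now]
    return old + recent
```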
-
Publication No.: US20230342377A1
Publication Date: 2023-10-26
Application No.: US18345949
Filing Date: 2023-06-30
Applicant: Snowflake Inc.
Inventor: Istvan Cseri, Tyler Jones, Daniel E. Sotolongo, Boyuan Zhang
IPC: G06F16/27
CPC classification number: G06F16/27
Abstract: Techniques described herein can enable stream replication. A first deployment can store a table including one or more streams. The techniques described herein can be used to replicate the table at a second deployment while replicating the one or more streams associated with the table. Select prior table versions and partitions in the table are copied to the second deployment to enable stream replication.
-
Publication No.: US20230315755A1
Publication Date: 2023-10-05
Application No.: US18103977
Filing Date: 2023-01-31
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Istvan Cseri, Tyler Jones, Dinesh Chandrakant Kulkarni, Daniel Mills, Daniel E. Sotolongo, Di Fei Zhang
IPC: G06F16/27
CPC classification number: G06F16/273
Abstract: Techniques for triggering pipeline execution based on data change (transaction commit) are described. The pipelines can be used for data ingestion or other specified tasks. These tasks can be operational across account, organization, cloud region, and cloud provider boundaries. The tasks can be triggered by commit post-processing. Gates in the tasks can be set up to reference change data capture information. If the gate is satisfied, tasks can be executed to set up data pipelines.
-
Publication No.: US20230237043A1
Publication Date: 2023-07-27
Application No.: US18158627
Filing Date: 2023-01-24
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Istvan Cseri, Fabian Hueske, Tyler Jones, Yevgeniy Kogan, Dzmitry Pauliukevich, Daniel E. Sotolongo
IPC: G06F16/23, G06F16/2455, G06F16/22
CPC classification number: G06F16/2358, G06F16/2282, G06F16/2456
Abstract: Techniques described herein can accelerate change data capture determinations such as stream reads, which show changes made to a table between two points in time. Three distinct row bitsets that mark deleted, updated, and inserted rows in micro-partitions can be added as metadata for the table. These bitsets can be generated during DML operations and then stored as metadata of the new partition generated by the DML operations. The bitsets can then be used to generate streams showing the changes in the table between two points in time (changes interval).
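Illustrative sketch (not part of the patent text): a minimal Python model of per-partition row bitsets, using plain integers as bitsets. It assumes bit i of each bitset refers to row position i recorded with that partition's metadata, and that each partition is tagged with the table version at which the DML created it. `PartitionMeta`, `mark`, `rows_in`, and `changes_between` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PartitionMeta:
    version: int       # table version at which a DML created this partition
    row_count: int
    deleted: int = 0   # bit i set => row i is marked deleted
    updated: int = 0   # bit i set => row i is marked updated
    inserted: int = 0  # bit i set => row i is marked inserted


def mark(bitset, row):
    """Return a copy of `bitset` with the bit for `row` set."""
    return bitset | (1 << row)


def rows_in(bitset, row_count):
    """Row positions whose bit is set."""
    return [r for r in range(row_count) if (bitset >> r) & 1]


def changes_between(partitions, start_version, end_version):
    """Changes in the interval (start_version, end_version], read from
    metadata alone rather than by comparing row contents."""
    out = []
    for p in partitions:
        if start_version < p.version <= end_version:
            for kind in ("deleted", "updated", "inserted"):
                for r in rows_in(getattr(p, kind), p.row_count):
                    out.append((p.version, kind, r))
    return out
```

The point of the design is that a stream read over a changes interval only inspects the bitsets of partitions created inside that interval.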
-
Publication No.: US20230070152A1
Publication Date: 2023-03-09
Application No.: US18049325
Filing Date: 2022-10-25
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau, Istvan Cseri, Tyler Jones, Daniel E. Sotolongo, Zhuo Zhang
IPC: G06F16/2455, G06F16/2453, G06F16/25, G06F16/22
Abstract: A streaming ingest platform can improve latency and expense issues related to uploading data into a cloud data system. The streaming ingest platform can organize the data to be ingested into per-table chunks and per-account blobs. This data may be committed and may be made available for query processing before it is ingested into the target source tables. This significantly improves latency issues. The streaming ingest platform can also accommodate uploading data from various sources with different processing and communication capabilities, such as Internet of Things (IOT) devices.