-
公开(公告)号:US12235874B2
公开(公告)日:2025-02-25
申请号:US18610863
申请日:2024-03-20
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Istvan Cseri , Tyler Jones , Dinesh Chandrakant Kulkarni , Daniel Mills , Daniel E. Sotolongo , Di Fei Zhang
Abstract: Techniques for triggering pipeline execution based on data change (transaction commit) are described. The pipelines can be used for data ingestion or other specified tasks. These tasks can be operational across account, organization, cloud region, and cloud provider boundaries. The tasks can be triggered by commit post-processing. Gates in the tasks can be set up to reference change data capture information. If the gate is satisfied, tasks can be executed to set up data pipelines.
-
公开(公告)号:US11966416B2
公开(公告)日:2024-04-23
申请号:US18103977
申请日:2023-01-31
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Istvan Cseri , Tyler Jones , Dinesh Chandrakant Kulkarni , Daniel Mills , Daniel E. Sotolongo , Di Fei Zhang
CPC classification number: G06F16/273
Abstract: Techniques for triggering pipeline execution based on data change (transaction commit) are described. The pipelines can be used for data ingestion or other specified tasks. These tasks can be operational across account, organization, cloud region, and cloud provider boundaries. The tasks can be triggered by commit post-processing. Gates in the tasks can be set up to reference change data capture information. If the gate is satisfied, tasks can be executed to set up data pipelines.
-
公开(公告)号:US20230409574A1
公开(公告)日:2023-12-21
申请号:US18362898
申请日:2023-07-31
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Fabian Hueske , Tyler Jones , Daniel Mills , Leon Papke , Prasanna Rajaperumal , Daniel E. Sotolongo
IPC: G06F16/2453 , G06F7/14
CPC classification number: G06F16/24539 , G06F7/14 , G06F16/24542
Abstract: A system for a materialized table (MT) refresh using multiple processing pipelines includes at least one hardware processor coupled to memory storing instructions. The instructions cause the at least one hardware processor to perform operations including determining dependencies among a plurality of intermediate MTs generated from a source MT. The source MT uses a table definition with a query on one or more base tables and a lag duration value. A graph snapshot of dependencies among the plurality of intermediate MTs is generated. Processing pipelines are configured. Each of the processing pipelines corresponds to a subset of the plurality of intermediate MTs indicated by the graph snapshot. Responsive to detecting an instruction for a refresh operation on the source MT, refreshes on corresponding intermediate MTs of the plurality of intermediate MTs in each processing pipeline of the processing pipelines are performed to complete the refresh operation on the source MT.
-
公开(公告)号:US11755568B1
公开(公告)日:2023-09-12
申请号:US17931705
申请日:2022-09-13
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Tyler Jones , Daniel Mills , Leon Papke , Prasanna Rajaperumal , Daniel E. Sotolongo
CPC classification number: G06F16/2393 , G06F11/3419
Abstract: Provided herein are systems and methods for a database object (e.g., materialized table) configuration including scheduling refreshes of the materialized table. For example, a method includes determining a dependency graph for a first MT. The dependency graph comprises a second MT from which the first MT depends. The first MT includes a query on one or more base tables and a lag duration value. The lag duration value indicates a maximum time period that a result of a prior refresh of the query can lag behind a current time instance. A tick period is selected for a set of ticks based on the lag duration value. The set of ticks corresponds to a set of aligned time instances. Refresh operations are scheduled for the first and second MTs at corresponding time instances from the set of aligned time instances. The corresponding time instances are separated by the tick period.
-
公开(公告)号:US20250117382A1
公开(公告)日:2025-04-10
申请号:US18988025
申请日:2024-12-19
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Fabian Hueske , Tyler Jones , Daniel Mills , Leon Papke , Prasanna Rajaperumal , Daniel E. Sotolongo
IPC: G06F16/2453
Abstract: A system includes at least one hardware processor and at least one memory storing instructions that cause the at least one hardware processor to perform operations. The operations include generating a log of changes posted to a plurality of intermediate materialized tables (MTs) during execution of a query in a network-based database system. The query is associated with a source MT that the intermediate MTs depend on. The operations include rendering the log of changes into a dependency graph. The operations include configuring a plurality of processing pipelines based on the dependency graph. The operations include performing refreshes on one or more of the plurality of intermediate MTs in at least one of the plurality of processing pipelines to complete the refresh operation. The refreshes are performed responsive to detecting an instruction for a refresh operation on the source MT.
-
公开(公告)号:US12189616B2
公开(公告)日:2025-01-07
申请号:US18353317
申请日:2023-07-17
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Tyler Jones , Daniel Mills , Leon Papke , Prasanna Rajaperumal , Daniel E. Sotolongo
Abstract: A method includes retrieving a plurality of materialized tables (MTs). Each of the plurality of MTs includes a lag duration and refers to a corresponding base table of a plurality of base tables. The lag duration indicates a maximum time period that a result of a prior refresh of a query on the corresponding base table can lag behind a current time instance. A plurality of time instances for the MT is determined based on the lag duration and a number of prior refreshes of the corresponding base table. A plurality of aligned time instances for the plurality of MTs is determined based on the plurality of time instances for each of the plurality of MTs. Refresh operations are scheduled for the plurality of MTs at one or more of the plurality of aligned time instances that are within the maximum time period.
-
公开(公告)号:US20240143548A1
公开(公告)日:2024-05-02
申请号:US18050122
申请日:2022-10-27
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Thierry Cruanes , Benoit Dageville , Ganeshan Ramachandran Iyer , Subramanian Muralidhar
CPC classification number: G06F16/148 , G06F9/5022 , G06F16/116
Abstract: Techniques for continuous ingestion of files using custom file formats are described. A custom file format may include formats not natively supported by a data system. Unstructured files (e.g., images) may also be considered custom file formats. A custom file format may be set using a user defined table function and scanner options.
-
公开(公告)号:US20240126765A1
公开(公告)日:2024-04-18
申请号:US18392327
申请日:2023-12-21
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Istvan Cseri , Tyler Jones , Daniel E. Sotolongo , Zhuo Zhang
IPC: G06F16/2455 , G06F16/22 , G06F16/2453 , G06F16/25
CPC classification number: G06F16/24568 , G06F16/2219 , G06F16/24544 , G06F16/2456 , G06F16/258
Abstract: A streaming ingest platform can improve latency and expense issues related to uploading data into a cloud data system. The streaming ingest platform can organize the data to be ingested into per-table chunks and per-account blobs. This data may be committed and may be made available for query processing before it is ingested into the target source tables. This significantly improves latency issues. The streaming ingest platform can also accommodate uploading data from various sources with different processing and communication capabilities, such as Internet of Things (IOT) devices.
-
公开(公告)号:US11893029B2
公开(公告)日:2024-02-06
申请号:US18049325
申请日:2022-10-25
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Istvan Cseri , Tyler Jones , Daniel E. Sotolongo , Zhuo Zhang
IPC: G06F16/245 , G06F16/25 , G06F16/22 , G06F16/2455 , G06F16/2453
CPC classification number: G06F16/24568 , G06F16/2219 , G06F16/2456 , G06F16/24544 , G06F16/258
Abstract: A streaming ingest platform can improve latency and expense issues related to uploading data into a cloud data system. The streaming ingest platform can organize the data to be ingested into per-table chunks and per-account blobs. This data may be committed and may be made available for query processing before it is ingested into the target source tables. This significantly improves latency issues. The streaming ingest platform can also accommodate uploading data from various sources with different processing and communication capabilities, such as Internet of Things (IOT) devices.
-
公开(公告)号:US20230401199A1
公开(公告)日:2023-12-14
申请号:US18353317
申请日:2023-07-17
Applicant: Snowflake Inc.
Inventor: Tyler Arthur Akidau , Tyler Jones , Daniel Mills , Leon Papke , Prasanna Rajaperumal , Daniel E. Sotolongo
CPC classification number: G06F16/2393 , G06F11/3419
Abstract: A method includes retrieving a plurality of materialized tables (MTs). Each of the plurality of MTs includes a lag duration and refers to a corresponding base table of a plurality of base tables. The lag duration indicates a maximum time period that a result of a prior refresh of a query on the corresponding base table can lag behind a current time instance. A plurality of time instances for the MT is determined based on the lag duration and a number of prior refreshes of the corresponding base table. A plurality of aligned time instances for the plurality of MTs is determined based on the plurality of time instances for each of the plurality of MTs. Refresh operations are scheduled for the plurality of MTs at one or more of the plurality of aligned time instances that are within the maximum time period.
-
-
-
-
-
-
-
-
-