Internal resource provisioning in database systems

    Publication No.: US11138214B2

    Publication Date: 2021-10-05

    Application No.: US16778954

    Filing Date: 2020-01-31

    Applicant: Snowflake Inc.

    Abstract: Resource provisioning systems and methods are described. In an embodiment, a system includes a plurality of shared storage devices collectively storing database data, an execution platform, and a compute service manager. The compute service manager is configured to determine a task to be executed in response to a trigger event and determine a query plan for executing the task, wherein the query plan comprises a plurality of discrete subtasks. The compute service manager is further configured to assign the plurality of discrete subtasks to one or more nodes of a plurality of nodes of the execution platform, determine whether execution of the task is complete, and in response to determining the execution of the task is complete, store a record in the plurality of shared storage devices indicating the task was completed.
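
The flow in this abstract (determine a task, plan it as discrete subtasks, assign the subtasks to nodes, check completion, record the result) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `Node`, `plan_task`, `execute_task`, the round-robin assignment, and the list standing in for shared storage are all hypothetical.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Node:
    """Hypothetical stand-in for one node of the execution platform."""
    name: str
    completed: list = field(default_factory=list)

    def run(self, subtask):
        # A real node would execute the subtask; here we only record it.
        self.completed.append(subtask)


def plan_task(task, parts):
    """Hypothetical planner: split a task into discrete subtasks."""
    return [f"{task}#{i}" for i in range(parts)]


def execute_task(task, nodes, shared_storage, parts=4):
    """Plan the task, assign subtasks round-robin, and record completion."""
    subtasks = plan_task(task, parts)
    assignment = {}
    for subtask, node in zip(subtasks, cycle(nodes)):
        node.run(subtask)
        assignment[subtask] = node.name
    # The task is complete once every subtask has been run by some node.
    done = {s for node in nodes for s in node.completed}
    if set(subtasks) <= done:
        # Assumption: shared storage is modelled as a plain list of records.
        shared_storage.append({"task": task, "status": "completed"})
    return assignment
```

Round-robin assignment is chosen only for brevity; the abstract does not say how subtasks are distributed across nodes.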

    System and method for disjunctive joins using a lookup table

    Publication No.: US20210286817A1

    Publication Date: 2021-09-16

    Application No.: US17235826

    Filing Date: 2021-04-20

    Applicant: Snowflake Inc.

    Abstract: Techniques for joining data with a disjunctive operator by means of a lookup table are described. An example computer-implemented method can include receiving a query with a set of conjunctive predicates and a set of disjunctive predicates. The method may also include generating a lookup table for each predicate in the sets of conjunctive predicates and disjunctive predicates. For each row in a probe-side table, the method may further include looking up a value associated with that row in each of the lookup tables and adding the row to a results set when there is a match. The method may also include returning the results set.
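
A rough sketch of the lookup-table idea for the disjunctive part of such a join is shown below. It assumes equality predicates only and ignores the conjunctive predicates the abstract also mentions; the function name `disjunctive_join`, the `key_pairs` parameter, and the row-as-dict representation are hypothetical.

```python
from collections import defaultdict


def disjunctive_join(build_rows, probe_rows, key_pairs):
    """Join on (build_col == probe_col) OR ... for each pair in key_pairs."""
    # Build one lookup table per predicate: value -> indexes of build rows.
    lookups = []
    for build_col, _ in key_pairs:
        table = defaultdict(list)
        for idx, row in enumerate(build_rows):
            table[row[build_col]].append(idx)
        lookups.append(table)

    results = []
    for probe in probe_rows:
        # Probe every lookup table; a set keeps a build row that satisfies
        # several predicates from being emitted more than once.
        matched = set()
        for (_, probe_col), table in zip(key_pairs, lookups):
            matched.update(table.get(probe[probe_col], ()))
        for idx in sorted(matched):
            results.append((build_rows[idx], probe))
    return results


customers = [
    {"id": 1, "email": "a@x.com", "phone": "111"},
    {"id": 2, "email": "b@x.com", "phone": "222"},
]
orders = [
    {"order": 10, "email": "a@x.com", "phone": "999"},  # matches on email
    {"order": 11, "email": "zzz", "phone": "222"},      # matches on phone
    {"order": 12, "email": "a@x.com", "phone": "111"},  # matches on both
    {"order": 13, "email": "none", "phone": "000"},     # no match
]
pairs = disjunctive_join(customers, orders,
                         [("email", "email"), ("phone", "phone")])
```

Collecting matches in a set per probe row is one way to avoid duplicate output when more than one predicate matches the same pair of rows; the abstract does not specify how duplicates are handled.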

    Framework for providing intermediate aggregation operators in a query plan

    Publication No.: US20210263929A1

    Publication Date: 2021-08-26

    Application No.: US16939750

    Filing Date: 2020-07-27

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation. The subject technology analyzes the at least one aggregation to generate a modified query plan, the modified query plan including at least a top aggregation operator, an intermediate aggregation operator, and a bottom aggregation operator. The subject technology performs, with respect to the intermediate aggregation operator, at least one operation comprising: the subject technology receives an input intermediate data type; the subject technology performs an internalize operation on the input intermediate data type to generate an internal state; the subject technology performs an accumulate operation on the internal state to generate intermediate data; and the subject technology performs an externalize operation on the intermediate data to generate an output data type.
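
The internalize, accumulate, and externalize steps named in this abstract can be illustrated with an AVG aggregation split into bottom, intermediate, and top operators. The sketch below is an assumption-laden example, not the patented framework: the `(sum, count)` tuple used as the intermediate data type, the dictionary used as internal state, and all function and class names are hypothetical.

```python
class AvgIntermediate:
    """Hypothetical intermediate operator for AVG."""

    def internalize(self, wire):
        # Convert the input intermediate data type into an internal state.
        total, count = wire
        return {"sum": total, "count": count}

    def accumulate(self, state, other):
        # Merge two internal states.
        return {"sum": state["sum"] + other["sum"],
                "count": state["count"] + other["count"]}

    def externalize(self, state):
        # Convert the internal state back into the output data type.
        return (state["sum"], state["count"])


def bottom_avg(values):
    """Bottom operator: reduce raw values to a (sum, count) partial."""
    return (sum(values), len(values))


def intermediate_avg(partials):
    """Intermediate operator: combine partials into one partial."""
    op = AvgIntermediate()
    state = {"sum": 0, "count": 0}
    for wire in partials:
        state = op.accumulate(state, op.internalize(wire))
    return op.externalize(state)


def top_avg(partials):
    """Top operator: combine remaining partials and finalize the average."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count if count else None


merged = intermediate_avg([bottom_avg([1, 2, 3]), bottom_avg([4, 5])])
average = top_avg([merged, bottom_avg([6])])
```

Because the intermediate operator consumes and produces the same partial format, it can be placed anywhere between the bottom and top operators, or stacked, without changing the final result.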

    Maintaining states of partitions of a table for reclustering

    Publication No.: US10997215B2

    Publication Date: 2021-05-04

    Application No.: US17030565

    Filing Date: 2020-09-24

    Applicant: Snowflake Inc.

    Abstract: The subject technology creates partitions based on changes to a table, at least one of the one or more partitions overlapping with respect to values of one or more attributes with at least one of another partition and a previous partition. The subject technology maintains states for the partitions, each state of a plurality of states representing a particular degree of clustering of the table. The subject technology determines a number of overlapping partitions and a depth of the overlapping partitions, and determines a clustering ratio based at least in part on the number of overlapping partitions and the depth. The subject technology reclusters partitions of the table to increase the clustering ratio, the clustering ratio determined by at least a proportion of rows in a layout of the table that satisfy an ordering criterion based at least in part on a particular attribute of the one or more attributes.
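
The two inputs to the clustering ratio named above, the number of overlapping partitions and their depth, can be computed as in the sketch below. It assumes each partition is summarized by an inclusive `(min, max)` range of a single clustering attribute and takes depth to mean the largest number of partitions covering any one value; both are assumptions, as is the name `overlap_stats`. The abstract does not give the formula for the ratio itself, so none is shown.

```python
def overlap_stats(partitions):
    """Return (partitions overlapping at least one other, maximum depth).

    partitions: list of inclusive (min_value, max_value) ranges.
    """
    overlapping = 0
    for i, (lo, hi) in enumerate(partitions):
        # Two inclusive ranges overlap when each starts before the other ends.
        if any(j != i and lo <= other_hi and other_lo <= hi
               for j, (other_lo, other_hi) in enumerate(partitions)):
            overlapping += 1

    # Sweep line: +1 at each range start, -1 at each range end. Starts sort
    # before ends at the same value because the ranges are inclusive.
    events = []
    for lo, hi in partitions:
        events.append((lo, 0, 1))
        events.append((hi, 1, -1))
    depth = max_depth = 0
    for _, _, delta in sorted(events):
        depth += delta
        max_depth = max(max_depth, depth)
    return overlapping, max_depth


before = overlap_stats([(1, 10), (5, 15), (12, 20), (30, 40)])
after = overlap_stats([(1, 10), (11, 20), (21, 30)])
```

In this example reclustering removes every overlap: the first layout has three overlapping partitions with depth 2, while the second has none and depth 1.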

    Placement of adaptive aggregation operators and properties in a query plan

    Publication No.: US10997173B2

    Publication Date: 2021-05-04

    Application No.: US16857790

    Filing Date: 2020-04-24

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives a query plan, the query plan comprising a set of query operations, the set of query operations including at least one aggregation and at least one join operation. The subject technology analyzes the query plan to identify an aggregation that is redundant. The subject technology removes the aggregation based at least in part on the analyzing. The subject technology determines at least one aggregation property corresponding to at least one query operation of the query plan. The subject technology inserts at least one adaptive aggregation operator in the query plan based at least in part on the at least one aggregation property. The subject technology provides a modified query plan based at least in part on the inserted at least one adaptive aggregation operator in the query plan.

    Incremental feature development and workload capture in database systems

    Publication No.: US10762067B2

    Publication Date: 2020-09-01

    Application No.: US16692927

    Filing Date: 2019-11-22

    Applicant: Snowflake Inc.

    Abstract: Systems, methods, and devices for incremental feature development and workload capture in database systems are disclosed. A method includes determining a workload having one or more historical client queries to be rerun for testing a feature, wherein the feature comprises procedural logic. The method further includes executing a baseline run of the workload that does not implement the feature and executing a target run of the workload while implementing the feature. The method further includes comparing the baseline run and the target run to identify whether there is a performance regression in the target run. The method further includes, in response to identifying the performance regression, rerunning the target run to identify whether the performance regression still exists.
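
The baseline run, target run, and confirmation rerun described above can be sketched as a small loop. The example assumes that performance is measured as elapsed seconds per query and that a regression means the target is slower than the baseline by more than a fixed factor; `detect_regressions`, `slowdown`, and the canned timings are hypothetical.

```python
def detect_regressions(workload, run_baseline, run_target, slowdown=1.5):
    """Return queries whose slowdown persists when the target is rerun.

    run_baseline and run_target are callables that execute a query and
    return its elapsed time in seconds (an assumption of this sketch).
    """
    confirmed = []
    for query in workload:
        baseline = run_baseline(query)  # feature disabled
        target = run_target(query)      # feature enabled
        if target > baseline * slowdown:
            # Rerun the target to filter out transient slowdowns.
            if run_target(query) > baseline * slowdown:
                confirmed.append(query)
    return confirmed


# Canned timings standing in for real query executions.
baseline_times = {"q1": 1.0, "q2": 1.0, "q3": 1.0}
target_times = {
    "q1": iter([1.1]),       # no regression
    "q2": iter([3.0, 1.0]),  # slow once, fine on rerun
    "q3": iter([2.0, 2.1]),  # slow on both runs
}
confirmed = detect_regressions(
    list(baseline_times),
    baseline_times.__getitem__,
    lambda q: next(target_times[q]),
)
```

Only `q3` is reported here, since `q2` recovers on the rerun. The threshold factor is a design choice of this sketch, not something the abstract specifies.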

    Machine time estimation for continuous maintenance of clustered data

    Publication No.: US20250131017A1

    Publication Date: 2025-04-24

    Application No.: US18917533

    Filing Date: 2024-10-16

    Applicant: Snowflake Inc.

    Abstract: A method includes sampling, by at least one hardware processor, a table using a clustering key to obtain a set of batches. Each batch of the set of batches includes a set of partitions of the table. A clustering job is performed for at least one batch of the set of batches. A machine processing cost associated with the clustering job is determined on a per-row basis. A total clustering cost associated with clustering data in the table is determined based on the machine processing cost on the per-row basis.
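
The estimate described above amounts to measuring the machine cost of clustering a sample and scaling it by row count. The sketch below assumes partitions are already ordered by the clustering key, forms batches from contiguous partitions, and picks sample batches at random; those choices, along with `estimate_total_cost` and its parameters, are hypothetical and simplify how the patent uses the clustering key for sampling.

```python
import random


def estimate_total_cost(partition_rows, batch_size, sample_batches,
                        run_clustering_job, seed=0):
    """Estimate machine seconds needed to cluster the whole table.

    partition_rows: row count of each partition, assumed ordered by the
    clustering key. run_clustering_job(batch) clusters one batch and
    returns the machine seconds it consumed.
    """
    # Each batch is a set of contiguous partitions.
    batches = [partition_rows[i:i + batch_size]
               for i in range(0, len(partition_rows), batch_size)]
    rng = random.Random(seed)
    sampled = rng.sample(batches, min(sample_batches, len(batches)))

    machine_seconds = 0.0
    rows_clustered = 0
    for batch in sampled:
        machine_seconds += run_clustering_job(batch)
        rows_clustered += sum(batch)
    if rows_clustered == 0:
        return 0.0

    # Per-row cost from the sample, scaled to every row in the table.
    cost_per_row = machine_seconds / rows_clustered
    return cost_per_row * sum(partition_rows)


# Stand-in job that costs two microseconds of machine time per row.
total = estimate_total_cost([1000] * 10, batch_size=2, sample_batches=2,
                            run_clustering_job=lambda b: 2e-6 * sum(b))
```

With a uniform per-row cost the estimate equals the true total (0.02 machine seconds for 10,000 rows); with skewed data the result depends on which batches are sampled.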
