-
Publication No.: US20220207041A1
Publication date: 2022-06-30
Application No.: US17655124
Filing date: 2022-03-16
Applicant: Snowflake Inc.
Inventor: Mahmud Allahverdiyev , Thierry Cruanes , Ismail Oukid , Stefan Richter
IPC: G06F16/2455 , G06F16/9035 , G06F16/28 , G06F17/18 , G06F16/22
Abstract: A source table organized into a set of batch units is accessed. The source table comprises a column of data corresponding to a semi-structured data type. One or more indexing transformations for an object in the column are generated. The generating of the one or more indexing transformations includes converting the object to one or more stored data types. A pruning index is generated for the source table based in part on the one or more indexing transformations. The pruning index comprises a set of filters that index distinct values in each column of the source table, and each filter corresponds to a batch unit in the set of batch units. The pruning index is stored in a database with an association with the source table.
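Illustrative sketch (not taken from the patent): the abstract above describes one filter per batch unit, built from typed "indexing transformations" of semi-structured objects. The Python below is a minimal sketch of that idea under stated assumptions. The Bloom filter, the `path=type:value` key format, and the names `BloomFilter`, `indexing_transformations`, `build_pruning_index`, and `candidate_units` are all hypothetical choices made for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; sizes are arbitrary assumptions for the sketch."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit set stored in a Python int

    def _positions(self, value):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def indexing_transformations(obj):
    """Flatten a semi-structured object into typed 'path=type:value' keys.

    Each leaf is emitted under its native type and under its string form,
    standing in for "converting the object to one or more stored data types".
    """
    keys = set()

    def walk(node, path):
        if isinstance(node, dict):
            for name, child in node.items():
                walk(child, f"{path}.{name}" if path else name)
        elif isinstance(node, list):
            for child in node:
                walk(child, f"{path}[]")
        else:
            keys.add(f"{path}={type(node).__name__}:{node}")
            keys.add(f"{path}=str:{node}")

    walk(obj, "")
    return keys


def build_pruning_index(batch_units):
    """Return one filter per batch unit, in batch-unit order."""
    index = []
    for unit in batch_units:
        unit_filter = BloomFilter()
        for obj in unit:
            for key in indexing_transformations(obj):
                unit_filter.add(key)
        index.append(unit_filter)
    return index


def candidate_units(index, path, value):
    """Batch units that may hold `value` at `path`; the rest can be pruned."""
    key = f"{path}={type(value).__name__}:{value}"
    return [i for i, f in enumerate(index) if f.might_contain(key)]
```

A query with an equality predicate on `user.id` would call `candidate_units` and scan only the returned batch units. A Bloom filter can report false positives but never false negatives, so pruning in this sketch never drops a batch unit that actually contains the value.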
-
Publication No.: US11372888B2
Publication date: 2022-06-28
Application No.: US17524454
Filing date: 2021-11-11
Applicant: SNOWFLAKE INC.
Inventor: Benoit Dageville , Thierry Cruanes , Marcin Zukowski , Allison Waingold Lee , Philipp Thomas Unterbrunner
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , A61F5/56 , H04L67/568 , G06F9/48 , H04L67/1095 , H04L67/1097
Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components. An example method includes: receiving a relational join query for a join operation associated with a first relation and a second relation; generating at least one build operator and at least one probe operator to perform build operations and probe operations, respectively, of the join operation; and managing a state of one or more communication links between the at least one build operator and the at least one probe operator based on a size of the second relation as determined by the at least one build operator and an estimated size of the first relation.
-
Publication No.: US11347735B2
Publication date: 2022-05-31
Application No.: US16889033
Filing date: 2020-06-01
Applicant: Snowflake Inc.
Inventor: Thierry Cruanes , Igor Demura , Varun Ganesh , Prasanna Rajaperumal , Libo Wang , Jiaqi Yan
IPC: G06F16/2453
Abstract: Embodiments of the present disclosure may provide a dynamic query execution model. This query execution model may provide acceleration by scaling out parallel parts of a query (also referred to as a fragment) to additional computing resources, for example computing resources leased from a pool of computing resources. Execution of the parts of the query may be coordinated by a parent query coordinator, where the query originated, and a fragment query coordinator.
-
Publication No.: US20220156282A1
Publication date: 2022-05-19
Application No.: US17666209
Filing date: 2022-02-07
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097
Abstract: A method and apparatus for managing a set of processors for a set of queries are described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors is to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization. Furthermore, the device processes the set of queries using the updated set of processors.
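Illustrative sketch (not taken from the patent): the abstract above describes monitoring processor utilization and updating the number of provisioned processors accordingly. The Python below shows one simple way such a rule could look. The thresholds `low` and `high`, the one-step adjustment, and the function name `next_processor_count` are assumptions made for illustration; the abstract does not specify a policy.

```python
def next_processor_count(current, utilization, pool_size, low=0.3, high=0.8):
    """Return the processor count to use for the next monitoring interval.

    current:     number of processors currently provisioned (>= 1)
    utilization: observed average utilization in [0.0, 1.0]
    pool_size:   size of the pool processors are provisioned from
    low, high:   hypothetical scale-down / scale-up thresholds
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    if utilization > high and current < pool_size:
        return current + 1  # busy: provision one more processor
    if utilization < low and current > 1:
        return current - 1  # idle: release one processor
    return current          # within band, or already at a limit
```

Calling this once per monitoring interval grows the set while queries keep the processors busy and shrinks it when they go idle, without exceeding the pool or dropping below one processor.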
-
Publication No.: US20220129479A1
Publication date: 2022-04-28
Application No.: US17570638
Filing date: 2022-01-07
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes , Benoit Dageville , Allison Waingold Lee
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L67/1095 , H04L67/568 , H04L67/1097
Abstract: A system and method for managing data storage and data access by querying data in a distributed system without buffering the results of intermediate operations in disk storage.
-
Publication No.: US11308090B2
Publication date: 2022-04-19
Application No.: US17394149
Filing date: 2021-08-04
Applicant: Snowflake Inc.
Inventor: Mahmud Allahverdiyev , Thierry Cruanes , Ismail Oukid , Stefan Richter
IPC: G06F16/24 , G06F16/2455 , G06F16/9035 , G06F16/28 , G06F17/18 , G06F16/22
Abstract: A source table organized into a set of batch units is accessed. The source table comprises a column of data corresponding to a semi-structured data type. One or more indexing transformations for an object in the column are generated. The generating of the one or more indexing transformations includes converting the object to one or more stored data types. A pruning index is generated for the source table based in part on the one or more indexing transformations. The pruning index comprises a set of filters that index distinct values in each column of the source table, and each filter corresponds to a batch unit in the set of batch units. The pruning index is stored in a database with an association with the source table.
-
Publication No.: US11269869B2
Publication date: 2022-03-08
Application No.: US17498382
Filing date: 2021-10-11
Applicant: Snowflake Inc.
Inventor: Subramanian Muralidhar , Benoit Dageville , Thierry Cruanes , Nileema Shingte , Saurin Shah , Torsten Grabs , Istvan Cseri
IPC: G06F16/20 , G06F16/242 , G06F3/06 , G06F16/2453 , G06F16/25 , G06F16/23
Abstract: Disclosed herein are systems and methods for processing queries over external tables. In an embodiment, a database platform receives a query directed at least to data in an external table stored in a storage platform that is external to the database platform. The database platform uses metadata that summarizes the data in the external table to identify one or more partitions of the external table as potentially including data satisfying the query, and generates a query plan that includes a plurality of discrete subtasks that collectively include instructions to scan the identified one or more partitions of the external table for data satisfying the query. The database platform assigns, based on the metadata, the plurality of discrete subtasks to one or more nodes in an execution platform, and refreshes the metadata in response to a threshold number of modifications being made to the external table.
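Illustrative sketch (not taken from the patent): the abstract above describes using summary metadata to pick the partitions of an external table that might satisfy a query, then splitting the scan into subtasks assigned to execution nodes. The Python below sketches that flow for a single range predicate. The per-partition min/max summary, the round-robin assignment, and the names `PartitionMeta` and `plan_scan` are assumptions made for illustration; the abstract says assignment is based on the metadata but does not say how.

```python
from dataclasses import dataclass


@dataclass
class PartitionMeta:
    """Hypothetical per-partition summary: value range of one column."""
    path: str
    min_value: int
    max_value: int


def plan_scan(metadata, lo, hi, nodes):
    """Plan a scan for the predicate lo <= column <= hi.

    Keeps only partitions whose [min_value, max_value] range overlaps
    [lo, hi], then deals one scan subtask per partition to the nodes
    in round-robin order. Returns {node: [partition paths]}.
    """
    if not nodes:
        raise ValueError("at least one execution node is required")
    candidates = [
        m for m in metadata if m.max_value >= lo and m.min_value <= hi
    ]
    assignment = {node: [] for node in nodes}
    for i, part in enumerate(candidates):
        assignment[nodes[i % len(nodes)]].append(part.path)
    return assignment
```

Partitions whose range cannot overlap the predicate are never scanned, which is why the abstract also calls for refreshing the metadata after enough modifications: stale ranges could otherwise cause a partition to be skipped incorrectly.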
-
Publication No.: US20220067016A1
Publication date: 2022-03-03
Application No.: US17511064
Filing date: 2021-10-26
Applicant: Snowflake Inc.
Inventor: Jiaqi Yan , Thierry Cruanes , Jeffrey Rosen , William Waddington , Prasanna Rajaperumal , Abdul Munir
Abstract: The subject technology determines whether a table is sufficiently clustered. The subject technology, in response to determining the table is not sufficiently clustered, selects one or more micro-partitions of the table to be reclustered. The subject technology constructs a data structure for the table. The subject technology extracts minimum and maximum endpoints for each micro-partition in the data structure. The subject technology sorts each of one or more peaks in the data structure based on height. The subject technology sorts overlapping micro-partitions based on width. The subject technology selects micro-partitions based on which micro-partitions are within the tallest peaks of the one or more peaks and further based on which of the overlapping micro-partitions have the widest widths.
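Illustrative sketch (not taken from the patent): the abstract above selects micro-partitions by peak height and then by width. The Python below is one possible reading of those steps, not the claimed method. It assumes a "peak" is a local maximum of overlap depth found by sweeping over the min/max endpoints, that "width" is max minus min, and that a hypothetical `budget` caps how many micro-partitions are picked.

```python
def select_for_reclustering(partitions, budget=2):
    """Pick micro-partitions to recluster.

    partitions: dict mapping a micro-partition name to (min, max) endpoints.
    budget:     hypothetical cap on how many micro-partitions to select.
    """
    # Extract endpoints; kind 0 (start) sorts before kind 1 (end) at a tie.
    events = []
    for name, (lo, hi) in partitions.items():
        events.append((lo, 0, name))
        events.append((hi, 1, name))
    events.sort()

    # Sweep: depth rises on starts and falls on ends, so a local maximum
    # (a peak) is the active set just before an end that follows a start.
    active = set()
    peaks = []  # list of (height, members)
    prev_was_start = False
    for _point, kind, name in events:
        if kind == 0:
            active.add(name)
            prev_was_start = True
        else:
            if prev_was_start:
                peaks.append((len(active), frozenset(active)))
            active.discard(name)
            prev_was_start = False

    # Tallest peaks first; within a peak, widest micro-partitions first.
    peaks.sort(key=lambda peak: peak[0], reverse=True)
    selected = []
    for height, members in peaks:
        if height < 2:
            continue  # a lone micro-partition overlaps nothing
        by_width = sorted(
            members,
            key=lambda n: (-(partitions[n][1] - partitions[n][0]), n),
        )
        for name in by_width:
            if name not in selected:
                selected.append(name)
            if len(selected) == budget:
                return selected
    return selected
```

The intuition behind this reading is that wide micro-partitions inside deep overlaps contribute most to poor clustering, so rewriting them first should reduce overlap the most for a given amount of work.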
-
Publication No.: US11238062B2
Publication date: 2022-02-01
Application No.: US17385754
Filing date: 2021-07-26
Applicant: SNOWFLAKE INC.
Inventor: Thierry Cruanes , Benoit Dageville , Marcin Zukowski
IPC: G06F16/27 , G06F16/182 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L29/08
Abstract: A method and apparatus for managing a set of processors for a set of queries are described. In an exemplary embodiment, a device receives a set of queries for a data warehouse, the set of queries including one or more queries to be processed by the data warehouse. The device further provisions a set of processors from a first plurality of processors, where the set of processors is to process the set of queries, and a set of storage resources to store data for the set of queries. In addition, the device monitors a utilization of the set of processors as the set of processors processes the set of queries. The device additionally updates a number of the processors in the set of processors provisioned based on the utilization. Furthermore, the device processes the set of queries using the updated set of processors.
-
Publication No.: US11238061B2
Publication date: 2022-02-01
Application No.: US17327573
Filing date: 2021-05-21
Applicant: SNOWFLAKE INC.
Inventor: Benoit Dageville , Thierry Cruanes , Marcin Zukowski , Allison Waingold Lee , Philipp Thomas Unterbrunner
IPC: G06F16/27 , G06F9/50 , G06F16/14 , G06F16/21 , G06F16/22 , G06F16/951 , G06F16/182 , G06F16/23 , G06F16/2455 , G06F16/2458 , G06F16/9535 , G06F16/2453 , G06F9/48 , H04L29/08
Abstract: A method, apparatus, and system for join operations of a plurality of relations that are distributed over a plurality of storage locations over a network of computing components.