Metadata clustering
    2.
    Granted invention patent

    Publication number: US11822582B2

    Publication date: 2023-11-21

    Application number: US17896446

    Filing date: 2022-08-26

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F16/285

    Abstract: Embodiments of the present disclosure describe systems, methods, and computer program products for improving query processing of a database. An example method can include: storing table data for a table in a plurality of micro-partitions, each micro-partition comprising a portion of the table data for the table; for each micro-partition of the plurality of micro-partitions, storing metadata for the micro-partition in at least one of a plurality of expression properties; and selecting, by a processing device, a subset of the plurality of expression properties to be grouped into a grouping expression property based at least partially on the metadata of the subset of the plurality of the expression properties. The grouping expression property may include cumulative metadata associated with the metadata of the subset of the plurality of expression properties.
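The grouping step in this abstract can be illustrated with a minimal sketch. It assumes each expression property holds a min/max value range and a row count for a single column, and that a subset is chosen by overlap with a value window. The abstract specifies neither the metadata fields nor the selection criterion, so every name below (`ExpressionProperty`, `select_subset`, `build_grouping_ep`, `may_contain`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExpressionProperty:
    """Hypothetical per-micro-partition metadata for one column."""
    partition_id: str
    min_value: int
    max_value: int
    row_count: int


@dataclass(frozen=True)
class GroupingExpressionProperty:
    """Hypothetical cumulative metadata over a subset of expression properties."""
    member_ids: Tuple[str, ...]
    min_value: int
    max_value: int
    row_count: int


def select_subset(eps: List[ExpressionProperty], low: int, high: int) -> List[ExpressionProperty]:
    # Assumed criterion: keep EPs whose [min, max] range overlaps [low, high].
    return [ep for ep in eps if ep.max_value >= low and ep.min_value <= high]


def build_grouping_ep(subset: List[ExpressionProperty]) -> GroupingExpressionProperty:
    # Cumulative metadata: widest value range and total row count of the subset.
    if not subset:
        raise ValueError("cannot group an empty subset")
    return GroupingExpressionProperty(
        member_ids=tuple(ep.partition_id for ep in subset),
        min_value=min(ep.min_value for ep in subset),
        max_value=max(ep.max_value for ep in subset),
        row_count=sum(ep.row_count for ep in subset),
    )


def may_contain(group: GroupingExpressionProperty, value: int) -> bool:
    # A query can skip every member partition at once when this returns False.
    return group.min_value <= value <= group.max_value
```

Under these assumptions, one check against the grouping expression property can rule out all member micro-partitions without reading their individual metadata.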

    MICRO-PARTITION CLUSTERING BASED ON EXPRESSION PROPERTY METADATA

    Publication number: US20240354315A1

    Publication date: 2024-10-24

    Application number: US18302234

    Filing date: 2023-04-18

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F16/285 G06F16/24556

    Abstract: A method for selecting micro-partitions for a clustering operation includes: storing table data in a plurality of micro-partitions of a storage device, wherein each of the plurality of micro-partitions comprises a portion of the table data, wherein subsets of the plurality of micro-partitions are associated with a respective one of a plurality of expression property (EP) files, and wherein each of the plurality of EP files comprises an EP data region that represents the portions of the table data of the subset of the plurality of micro-partitions associated with the EP file; determining sub-ranges of the table data based on the EP data regions of the plurality of EP files; selecting a subset of the plurality of EP files for a clustering operation based on the sub-ranges of the table data; and performing the clustering operation on the micro-partitions associated with the subset of the EP files.
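The selection flow in this abstract can be sketched as follows, under stated assumptions. Each EP file's data region is modeled as a half-open interval `[low, high)`, sub-ranges are derived from the sorted region boundaries, and the EP files chosen for clustering are those overlapping the most heavily overlapped sub-range. That selection rule is an illustrative assumption, not something the abstract states, and the names `EPFile`, `sub_ranges`, and `select_for_clustering` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EPFile:
    """Hypothetical EP file whose data region is the half-open range [low, high)."""
    name: str
    low: int
    high: int


def sub_ranges(files: List[EPFile]) -> List[Tuple[int, int]]:
    # Split the covered value space at every region boundary.
    bounds = sorted({b for f in files for b in (f.low, f.high)})
    return list(zip(bounds, bounds[1:]))


def overlapping(files: List[EPFile], rng: Tuple[int, int]) -> List[EPFile]:
    # EP files whose data region intersects the sub-range.
    a, b = rng
    return [f for f in files if f.low < b and f.high > a]


def select_for_clustering(files: List[EPFile]) -> List[EPFile]:
    # Assumed rule: pick the EP files covering the sub-range with the
    # greatest overlap depth, since that is where data is least clustered.
    ranges = sub_ranges(files)
    if not ranges:
        return []
    worst = max(ranges, key=lambda r: len(overlapping(files, r)))
    return overlapping(files, worst)
```

The clustering operation itself (rewriting the micro-partitions associated with the selected EP files) is outside the scope of this sketch.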

    Merge small file consolidation
    5.
    Granted invention patent

    Publication number: US11537613B1

    Publication date: 2022-12-27

    Application number: US17514084

    Filing date: 2021-10-29

    Applicant: Snowflake Inc.

    Abstract: The subject technology receives a query plan corresponding to a query. The subject technology executes the query based at least in part on the query plan, the executing including: filtering a first set of files that are to be modified by a merge statement, performing a split operation to send information related to a second set of files to a scan set builder operation in a first portion of the query plan and scan back operation in a second portion of the query plan, performing the scan set builder operation to remove the second set of files from the first set of files, performing a table scan operation based on a third set of files, and performing a first union all operation to combine the first set of data with a second set of data as a first set of combined data.

    File defragmentation service
    10.
    Granted invention patent

    Publication number: US11593306B1

    Publication date: 2023-02-28

    Application number: US17587852

    Filing date: 2022-01-28

    Applicant: Snowflake Inc.

    Abstract: The subject technology selects a most recently created file from a set of files stored in a source table. The subject technology iterates, in the source table, starting from the most recently created file up to an age threshold to select a first set of files for performing a first defragmentation process. The subject technology sets an indication corresponding to a particular file that is a last file, from the first set of files, that meets the age threshold. The subject technology performs the first defragmentation process on the selected first set of files. The subject technology determines that the first defragmentation process was successful.
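The file-selection loop described in this abstract might look like the sketch below. It reads "meets the age threshold" as "was created no longer ago than the threshold", which is one plausible interpretation, not a confirmed one. The `TableFile` type, the `select_for_defrag` function, and the use of numeric timestamps are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TableFile:
    """Hypothetical file record; created_at is a timestamp in seconds."""
    name: str
    created_at: float


def select_for_defrag(
    files: List[TableFile], now: float, age_threshold: float
) -> Tuple[List[TableFile], Optional[TableFile]]:
    # Start from the most recently created file and walk toward older files.
    ordered = sorted(files, key=lambda f: f.created_at, reverse=True)
    selected: List[TableFile] = []
    for f in ordered:
        if now - f.created_at > age_threshold:
            break  # Assumed meaning: files older than the threshold are excluded.
        selected.append(f)
    # The marker stands in for the "indication" set on the last file that
    # still meets the threshold.
    marker = selected[-1] if selected else None
    return selected, marker
```

The defragmentation process itself and the success check are not modeled here; the sketch covers only how the first set of files and the marker could be derived.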
