MACHINE TIME ESTIMATION FOR CONTINUOUS MAINTENANCE OF CLUSTERED DATA

    Publication No.: US20250131017A1

    Publication Date: 2025-04-24

    Application No.: US18917533

    Filing Date: 2024-10-16

    Applicant: Snowflake Inc.

    Abstract: A method includes sampling, by at least one hardware processor, a table using a clustering key to obtain a set of batches. Each batch of the set of batches includes a set of partitions of the table. A clustering job is performed for at least one batch of the set of batches. A machine processing cost associated with the clustering job is determined on a per-row basis. A total clustering cost associated with clustering data in the table is determined based on the machine processing cost on the per-row basis.
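    The per-row extrapolation described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the `BatchResult` type, the function names, the use of machine-seconds as the cost unit, and the simple linear scaling to the full table are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BatchResult:
    """Outcome of one clustering job run on a sampled batch (hypothetical type)."""
    rows_clustered: int      # rows in the partitions of this batch
    machine_seconds: float   # machine time the clustering job consumed


def per_row_cost(results: Sequence[BatchResult]) -> float:
    """Average machine time per row across the sampled clustering jobs."""
    total_rows = sum(r.rows_clustered for r in results)
    if total_rows == 0:
        raise ValueError("sampled batches contain no rows")
    total_seconds = sum(r.machine_seconds for r in results)
    return total_seconds / total_rows


def estimate_total_cost(results: Sequence[BatchResult], table_row_count: int) -> float:
    """Scale the per-row cost to the whole table (assumes cost is linear in rows)."""
    return per_row_cost(results) * table_row_count


# Two sampled batches, then extrapolation to a 200-million-row table.
sampled = [BatchResult(1_000_000, 12.0), BatchResult(500_000, 6.0)]
estimate = estimate_total_cost(sampled, 200_000_000)
print(f"estimated total machine time: {estimate:.1f} s")
```

    With these made-up numbers the estimate comes out to about 2400 machine-seconds. A real estimator would likely also account for how well clustered the table already is, which this sketch ignores.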

METADATA CLUSTERING

    Status: Invention publication (under examination, published)

    Publication No.: US20230229676A1

    Publication Date: 2023-07-20

    Application No.: US17896446

    Filing Date: 2022-08-26

    Applicant: SNOWFLAKE INC.

    CPC classification number: G06F16/285

    Abstract: Embodiments of the present disclosure describe systems, methods, and computer program products for improving query processing of a database. An example method can include: storing table data for a table in a plurality of micro-partitions, each micro-partition comprising a portion of the table data for the table; for each micro-partition of the plurality of micro-partitions, storing metadata for the micro-partition in at least one of a plurality of expression properties; and selecting, by a processing device, a subset of the plurality of expression properties to be grouped into a grouping expression property based at least partially on the metadata of the subset of the plurality of the expression properties. The grouping expression property may include cumulative metadata associated with the metadata of the subset of the plurality of expression properties.
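    The idea of a grouping expression property holding cumulative metadata can be illustrated with a small sketch. The abstract does not say what the metadata contains, so the fields below (minimum value, maximum value, row count) and the pruning check are assumptions chosen for illustration, and every name is hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExpressionProperty:
    """Metadata for one micro-partition (fields are assumed, not from the source)."""
    min_value: int
    max_value: int
    row_count: int


def group_properties(props: Sequence[ExpressionProperty]) -> ExpressionProperty:
    """Fold several expression properties into one with cumulative metadata."""
    if not props:
        raise ValueError("need at least one expression property to group")
    return ExpressionProperty(
        min_value=min(p.min_value for p in props),
        max_value=max(p.max_value for p in props),
        row_count=sum(p.row_count for p in props),
    )


def can_prune(ep: ExpressionProperty, lo: int, hi: int) -> bool:
    """True if no value covered by `ep` can fall inside the range [lo, hi]."""
    return ep.max_value < lo or ep.min_value > hi


eps = [
    ExpressionProperty(1, 10, 100),
    ExpressionProperty(8, 20, 150),
    ExpressionProperty(15, 30, 120),
]
grouped = group_properties(eps)

# One check on the grouped property can rule out all three micro-partitions.
print(can_prune(grouped, 50, 60))
# A range overlapping the grouped bounds still needs per-partition checks.
print(can_prune(grouped, 5, 9))
```

    Under these assumptions, a query whose filter range lies entirely outside the grouped bounds can skip every micro-partition in the group with a single comparison, which is one plausible way such grouping could speed up query processing.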
