Load history calculation in internal stage replication

    Publication No.: US11983165B1

    Publication Date: 2024-05-14

    Application No.: US18128212

    Application Date: 2023-03-29

    Applicant: Snowflake Inc.

    CPC classification number: G06F16/2365 G06F16/1748 G06F16/27

    Abstract: Embodiments of the present disclosure provide techniques for deduplicating files during internal stage replication using a directory table of the replicated internal stage that is modified as a cache for storing and retrieving original file-level metadata for the replicated files. An initial list of candidate files for loading from the internal stage to a table of the target deployment is prepared based on the files listed in the internal stage, and refined using a directory table lookup. If there is any inconsistency between the files registered in the directory table and the files listed in the internal stage, the target deployment will inspect the user-defined file-level metadata to obtain original file-level metadata for each file that is present in the internal stage but not in the directory table. This information may be used during deduplication to ensure that no duplicate files are loaded.
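The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `OriginalMeta`, `resolve_original_meta`, `files_to_load`, and `read_user_metadata` are hypothetical, as are the choice of an (original path, content key) pair as the deduplication key and the step that writes recovered metadata back into the directory table on a miss.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class OriginalMeta:
    """Original file-level metadata from the source deployment (hypothetical fields)."""
    original_path: str
    content_key: str  # assumed to be something like a checksum


def resolve_original_meta(
    staged_files: Iterable[str],
    directory_table: Dict[str, OriginalMeta],
    read_user_metadata: Callable[[str], OriginalMeta],
) -> Dict[str, OriginalMeta]:
    """Look up each staged file in the directory table, falling back to
    the user-defined file-level metadata when the file is not registered."""
    resolved: Dict[str, OriginalMeta] = {}
    for path in staged_files:
        meta = directory_table.get(path)
        if meta is None:
            # File is in the stage but not in the directory table:
            # inspect its user-defined metadata instead.
            meta = read_user_metadata(path)
            # Assumption: cache the result so later lookups hit the table.
            directory_table[path] = meta
        resolved[path] = meta
    return resolved


def files_to_load(
    staged_files: Iterable[str],
    directory_table: Dict[str, OriginalMeta],
    read_user_metadata: Callable[[str], OriginalMeta],
    load_history: Set[Tuple[str, str]],
) -> List[str]:
    """Return the staged files whose original identity is not yet in the
    load history, so that no duplicate file is loaded."""
    resolved = resolve_original_meta(staged_files, directory_table, read_user_metadata)
    return [
        path
        for path, meta in resolved.items()
        if (meta.original_path, meta.content_key) not in load_history
    ]
```

In this sketch the stage listing supplies the initial candidate list, the directory table acts as the cache of original metadata, and the load history is checked against the original identity of each file rather than its path in the replicated stage.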
