Invention Grant
- Patent Title: K-D tree balanced splitting
- Application No.: US17738609
- Application Date: 2022-05-06
- Publication No.: US12061586B2
- Publication Date: 2024-08-13
- Inventor: Bart Samwel, Prakhar Jain
- Applicant: Databricks, Inc.
- Applicant Address: US CA San Francisco
- Assignee: Databricks, Inc.
- Current Assignee: Databricks, Inc.
- Current Assignee Address: US CA San Francisco
- Agency: Fenwick & West LLP
- Main IPC: G06F16/22
- IPC: G06F16/22; G06F16/28

Abstract:
A system for clustering data into corresponding files comprises one or more processors and a memory. The one or more processors is/are configured to: 1) determine to cluster a set of data into a set of files; 2) determine a set of split points in a corresponding set of dimensions of the set of data to determine the set of files, wherein each file of the set of files has an approximate target size; and 3) store one or more items of the set of data into a corresponding file of the set of files based at least in part on the set of split points. The memory is coupled to the one or more processors and configured to provide the processor with instructions.
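The three steps in the abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes in-memory rows of numeric tuples, measures file size as a row count, picks the median of one dimension as each split point, and cycles through the dimensions by depth. The names `kd_split`, `Row`, and `target_rows` are hypothetical and do not come from the patent.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, ...]  # hypothetical row type: one numeric value per dimension


def kd_split(rows: Sequence[Row], target_rows: int, depth: int = 0) -> List[List[Row]]:
    """Recursively split rows at the median of one dimension until every
    group holds at most target_rows rows. Each returned group stands in
    for one output file (assumption: size is measured in rows)."""
    if target_rows < 1:
        raise ValueError("target_rows must be at least 1")
    if len(rows) <= target_rows:
        return [list(rows)]

    dim = depth % len(rows[0])                 # cycle through the dimensions
    ordered = sorted(rows, key=lambda r: r[dim])
    mid = len(ordered) // 2                    # split point: median position

    # Both halves are non-empty and strictly smaller, so recursion terminates.
    return (kd_split(ordered[:mid], target_rows, depth + 1)
            + kd_split(ordered[mid:], target_rows, depth + 1))


# Usage: 16 two-dimensional points clustered into groups of at most 4 rows.
points = [(float(x), float(y)) for x in range(4) for y in range(4)]
files = kd_split(points, target_rows=4)
```

Splitting at the median keeps the two halves within one row of each other, which is what lets every resulting group land near the target size rather than producing a few large files and many small ones.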
Public/Granted literature
- US20230359602A1 K-D TREE BALANCED SPLITTING Public/Granted day: 2023-11-09