Patent search ap:"Databricks Inc." Page 12

111.

Granted invention patent
Structured cluster execution for data streams (in force)

Publication (announcement) number: US10558664B2

Publication (announcement) date: 2020-02-11

Application number: US15581647

Application date: 2017-04-28

Applicant: Databricks Inc.

Inventor: Michael Armbrust, Tathagata Das, Shi Xin, Matei Zaharia

IPC: G06F16/2453, G06F16/2455

Abstract: A system for executing a streaming query includes an interface and a processor. The interface is configured to receive a logical query plan. The processor is configured to determine a physical query plan based at least in part on the logical query plan. The physical query plan comprises an ordered set of operators. Each operator of the ordered set of operators comprises an operator input mode and an operator output mode. The processor is further configured to execute the physical query plan using the operator input mode and the operator output mode for each operator of the query.

112.

Granted invention patent
Multiple display views for a notebook (in force)

Publication (announcement) number: US10474736B1

Publication (announcement) date: 2019-11-12

Application number: US14979253

Application date: 2015-12-22

Applicant: Databricks Inc.

Inventor: Ion Stoica, Ali Ghodsi, Chaoyu Yang

IPC: G06F17/21, G06F17/24, G06F17/22, G06F3/0481

Abstract: A system for multiple views for a notebook includes an input interface and a processor. The input interface is to receive a notebook. The processor is to load the notebook into a shell, wherein the shell executes the notebook using a cluster, to receive an indication to view a dashboard associated with the notebook, and to provide dashboard display information. The dashboard includes a page layout display.

113.

Granted invention patent
Serverless execution of code using cluster resources (in force)

Publication (announcement) number: US10474501B2

Publication (announcement) date: 2019-11-12

Application number: US15581987

Application date: 2017-04-28

Applicant: Databricks Inc.

Inventor: Ali Ghodsi, Srinath Shankar, Sameer Paranjpye, Shi Xin, Matei Zaharia

IPC: G06F9/50

Abstract: A system for cluster resource allocation includes an interface and a processor. The interface is configured to receive a process and input data. The processor is configured to determine an estimate for resources required for the process to process the input data; determine existing available resources in a cluster for running the process; determine whether the existing available resources are sufficient for running the process; in the event it is determined that the existing available resources are not sufficient for running the process, indicate to add new resources; determine an allocated share of resources in the cluster for running the process; and cause execution of the process using the share of resources.
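
The sequence of steps in this abstract (estimate, check availability, request more if short, allocate a share) can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Cluster` class, the core-count unit, and the one-core-per-gigabyte estimator are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Cluster:
    """Hypothetical cluster model that tracks capacity in CPU cores."""
    total_cores: int
    used_cores: int

    @property
    def available_cores(self) -> int:
        return self.total_cores - self.used_cores


def estimate_cores(input_bytes: int, bytes_per_core: int = 1_000_000_000) -> int:
    """Hypothetical estimator: one core per gigabyte of input, at least one."""
    return max(1, -(-input_bytes // bytes_per_core))  # ceiling division


def allocate(cluster: Cluster, input_bytes: int) -> Tuple[int, int]:
    """Return (cores_added, allocated_share) for one process."""
    needed = estimate_cores(input_bytes)
    shortfall = max(0, needed - cluster.available_cores)
    if shortfall:
        # Stand-in for "indicate to add new resources": grow the cluster.
        cluster.total_cores += shortfall
    # Reserve the share that the process will run on.
    cluster.used_cores += needed
    return shortfall, needed


cluster = Cluster(total_cores=8, used_cores=6)
added, share = allocate(cluster, input_bytes=3_500_000_000)
```

In this sketch the process needs four cores while only two are free, so two cores are added before the share is reserved. A real system would presumably provision machines asynchronously and weigh several resource types.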

114.

Granted invention patent
Callable notebook for cluster execution (in force)

Publication (announcement) number: US10296329B2

Publication (announcement) date: 2019-05-21

Application number: US15803604

Application date: 2017-11-03

Applicant: Databricks Inc.

Inventor: Timothee Hunter, Ali Ghodsi, Ion Stoica

IPC: G06F8/54, G06F8/71, G06F9/50, G06F9/445, G06F9/455, G06F16/9535

Abstract: A system for processing a notebook includes an input interface and a processor. The input interface is to receive a first notebook. The notebook comprises code for interactively querying and viewing data. The processor is to load the first notebook into a shell. The shell receives one or more parameters associated with the first notebook. The shell executes the first notebook using a cluster.

115.

Invention application
EVALUATING EXPRESSIONS OVER DICTIONARY DATA (in force)

Publication (announcement) number: US20250156423A1

Publication (announcement) date: 2025-05-15

Application number: US19000466

Application date: 2024-12-23

Applicant: Databricks, Inc.

Inventor: Utkarsh Agarwal, Shoumik Palkar, Alexander Behm, Sriram Krishnamurthy

IPC: G06F16/2455, G06F11/34, G06F16/22

Abstract: Disclosed herein is a method, system, or non-transitory computer readable medium for evaluating a query on a columnar dataset comprising one or more dictionaries associated with columns in the dataset. The method includes receiving a request to perform a query comprising at least an operator for a columnar dataset on cloud storage. At least one column in the dataset is based on a dictionary, and the dictionary maps one or more values for a column to one or more respective identifiers. The method evaluates the operator on one or more values of the dictionary to generate an updated dictionary comprising updated values. The method may decode the updated dictionary into an updated column comprising updated data values.
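
The general idea of evaluating an operator on dictionary values instead of on every row can be shown with a small Python sketch. It is an illustration under stated assumptions, not the method claimed in the application: the function names, the `dict`-based dictionary, and the uppercase operator are hypothetical.

```python
from typing import Callable, Dict, List


def evaluate_on_dictionary(
    dictionary: Dict[int, str], op: Callable[[str], str]
) -> Dict[int, str]:
    """Apply op once per distinct dictionary value, not once per row."""
    return {code: op(value) for code, value in dictionary.items()}


def decode(codes: List[int], dictionary: Dict[int, str]) -> List[str]:
    """Expand a dictionary-encoded column back into plain values."""
    return [dictionary[code] for code in codes]


# A column of six rows stored as identifiers into a three-entry dictionary.
dictionary = {0: "us", 1: "de", 2: "jp"}
codes = [0, 0, 2, 1, 0, 2]

updated = evaluate_on_dictionary(dictionary, str.upper)  # three calls, not six
column = decode(codes, updated)
```

The operator runs three times for six rows here. The saving grows with the ratio of rows to distinct values, which is why this approach suits low-cardinality columns.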

116.

Granted invention patent
Multiple pass sort with subset splitting (in force)

Publication (announcement) number: US12298952B1

Publication (announcement) date: 2025-05-13

Application number: US17875180

Application date: 2022-07-27

Applicant: Databricks, Inc.

Inventor: Timothy Armstrong, Arvind Sai Krishnan, Khayyam Guliyev

IPC: G06F16/22, G06F16/2453

Abstract: A system for multipass sort with subsplitting includes a communication interface and a processor. The communication interface is configured to receive from a client device a request to sort a dataset that includes a plurality of rows, where the size of the dataset is greater than a threshold size. The processor is configured to: subdivide the dataset into a plurality of data subsets; sort each of the plurality of data subsets; merge the plurality of sorted data subsets utilizing a binary merge tree to generate a sorted dataset; and provide the sorted dataset to the client device.
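
The subdivide, sort, and merge steps named in this abstract can be sketched in Python as follows. This is a generic in-memory illustration of sorting subsets and combining them pairwise in a binary merge tree, not the patented system. The function names and the `subset_size` parameter are hypothetical, and the size threshold and client interface are omitted.

```python
from typing import List


def merge_two(a: List[int], b: List[int]) -> List[int]:
    """Standard two-way merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def binary_tree_merge(runs: List[List[int]]) -> List[int]:
    """Merge sorted runs pairwise, level by level, until one run remains."""
    if not runs:
        return []
    while len(runs) > 1:
        merged = [merge_two(runs[k], runs[k + 1]) for k in range(0, len(runs) - 1, 2)]
        if len(runs) % 2:
            merged.append(runs[-1])  # an odd run is carried to the next level
        runs = merged
    return runs[0]


def multipass_sort(rows: List[int], subset_size: int) -> List[int]:
    """Split rows into subsets, sort each, then merge them in a binary tree."""
    if subset_size < 1:
        raise ValueError("subset_size must be positive")
    subsets = [sorted(rows[k:k + subset_size]) for k in range(0, len(rows), subset_size)]
    return binary_tree_merge(subsets)


result = multipass_sort([5, 3, 9, 1, 7, 2, 8], subset_size=2)
```

With seven rows and a subset size of two, four sorted subsets are produced and merged in two levels. Each level halves the number of runs, so the tree depth is logarithmic in the number of subsets.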

117.

Invention application
DATA SHARING FOR NETWORK CONNECTED SYSTEMS (in force)

Publication (announcement) number: US20250131118A1

Publication (announcement) date: 2025-04-24

Application number: US18958728

Application date: 2024-11-25

Applicant: Databricks, Inc.

Inventor: Matei Zaharia, Shixiong Zhu, Xiaotong Sun, Ramesh Chandra, Michael Paul Armbrust, Ali Ghodsi

IPC: G06F21/62, G06F21/60

Abstract: The present application discloses a method, system, and computer system for providing access to data. The method includes receiving, by a data manager service from a data requesting service, a request using an identifier for a high-level data object to access a set of data associated with the high-level data object, determining, by the data manager service, low-level data object(s) corresponding to the set of data based on the identifier for the high-level data object, determining whether a user associated with the request has permission to access at least a subset of the low-level data object(s), and in response to determining that the user associated with the request has permission to access the at least the subset of the low-level data object(s), generating, by the data manager service, a uniform resource locator (URL) via which the at least the subset of the one or more low-level data objects is accessible by the user.

118.

Invention application
AUTO MAINTENANCE FOR DATA TABLES IN CLOUD STORAGE (in force)

Publication (announcement) number: US20250130981A1

Publication (announcement) date: 2025-04-24

Application number: US18986345

Application date: 2024-12-18

Applicant: Databricks, Inc.

Inventor: Vijayan Prabhakaran, Himanshu Raja, Rahul Potharaju, Naga Raju Bhanoori, Lin Ma, Rajesh Parangi Sharabhalingappa, Jintian Liang, Zachary Vaughn Schuermann, Kam Cheung Ting

IPC: G06F16/21, G06F11/34, G06F16/22

Abstract: Disclosed is a configuration for managing the organization of data tables in cloud-based storage. The configuration receives metrics for data processing operations on the data table. Metrics include at least one of a size of the data table, a size of each file in the data table, and metadata describing the data table. The configuration automatically executes a cost-benefit analysis based on the one or more metrics for each candidate maintenance operation in a plurality of candidate maintenance operations. The configuration automatically selects a maintenance operation from the candidate maintenance operations to automate based on the cost-benefit analysis of the one or more candidate maintenance operations. The selected maintenance operation is automated and scheduled on the data table.
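
The selection step in this abstract, choosing one maintenance operation from several candidates by cost-benefit analysis, can be sketched in Python. This is a simplified illustration, not the configuration the application discloses. The `Candidate` class, the operation names, and the net-benefit rule are hypothetical, and the sketch takes cost and benefit estimates as given rather than deriving them from table metrics.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical candidate maintenance operation with estimated payoff."""
    name: str
    estimated_benefit: float  # assumed unit: seconds of query time saved
    estimated_cost: float     # assumed unit: seconds of compute spent


def select_operation(candidates: List[Candidate]) -> Optional[Candidate]:
    """Pick the candidate with the largest positive net benefit, if any."""
    worthwhile = [c for c in candidates if c.estimated_benefit > c.estimated_cost]
    if not worthwhile:
        return None  # nothing is worth scheduling
    return max(worthwhile, key=lambda c: c.estimated_benefit - c.estimated_cost)


candidates = [
    Candidate("compact", estimated_benefit=120.0, estimated_cost=30.0),
    Candidate("recluster", estimated_benefit=200.0, estimated_cost=180.0),
    Candidate("vacuum", estimated_benefit=5.0, estimated_cost=10.0),
]
chosen = select_operation(candidates)
```

Here `compact` is selected because its net benefit (90) exceeds that of `recluster` (20), while `vacuum` is dropped because it costs more than it saves. Other rules, such as a benefit-to-cost ratio, would be equally plausible readings of "cost-benefit analysis".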

119.

Invention application
USING LLM FUNCTIONS TO EVALUATE AND COMPARE LARGE TEXT OUTPUTS OF LLMS (in force)

Publication (announcement) number: US20250124236A1

Publication (announcement) date: 2025-04-17

Application number: US18518155

Application date: 2023-11-22

Applicant: Databricks, Inc.

Inventor: Ridhima Gupta, Prithvi Kannan, Sunish Sohil Sheth, Kasey Uhlenhuth, Hubert Zub, Corey Zumar

IPC: G06F40/40, G06F40/103, G06F40/30

Abstract: A method for evaluating textual output of one or more machine-learned language models is presented. The method includes receiving, from a user of a client device, a first prompt for input to one or more machine-learned language models, providing the first prompt to the one or more models for execution, and receiving a set of generated responses to the first prompt from the one or more models. The method further includes generating a user interface (UI) on the client device displaying the first prompt and generated responses as a table user interface element. The method applies a selected evaluation function to the generated response to evaluate the response with respect to an evaluation objective and identifies words that influence the evaluation. The method generates one or more UI elements on the UI to display the results of the evaluation for the generated responses.

120.

Granted invention patent
Clean room generation for data collaboration and executing clean room task in data processing pipeline (in force)

Publication (announcement) number: US12260003B1

Publication (announcement) date: 2025-03-25

Application number: US18474708

Application date: 2023-09-26

Applicant: Databricks, Inc.

Inventor: William Chau, Abhijit Chakankar, Stephen Michael Mahoney, Daniel Seth Morris, Itai Shlomo Weiss

IPC: G06F21/00, G06F21/62

Abstract: A data processing service facilitates the creation and processing of data processing pipelines that process data processing jobs defined with respect to a set of tasks in a sequence and with data dependencies associated with each separate task such that the output from one task is used as input for a subsequent task. In various embodiments, the set of tasks include at least one cleanroom task that is executed in a cleanroom station and at least one non-cleanroom task executed in an execution environment of a user where each task is configured to read one or more input datasets and transform the one or more input datasets into one or more output datasets.
