-
Publication No.: US20220092043A1
Publication date: 2022-03-24
Application No.: US17324907
Filing date: 2021-05-19
Applicant: Databricks Inc.
Inventor: Aaron Daniel Davidson, Tomas Nykodym, Clemens Mewald
IPC: G06F16/21, G06F16/955
Abstract: A system includes an interface, a processor, and a memory. The interface is configured to receive a version of a model from a model registry. The processor is configured to store the version of the model, start a process running the version of the model, and update a proxy with version information associated with the version of the model, wherein the updated proxy indicates to redirect an indication to invoke the version of the model to the process. The memory is coupled to the processor and configured to provide the processor with instructions.
-
Publication No.: US20220083410A1
Publication date: 2022-03-17
Application No.: US17537124
Filing date: 2021-11-29
Applicant: Databricks Inc.
Inventor: Alicja Luszczak, Srinath Shankar, Shi Xin
Abstract: A system for monitoring job execution includes an interface and a processor. The interface is configured to receive an indication to start a cluster processing job. The processor is configured to determine whether processing a data instance associated with the cluster processing job satisfies a watchdog criterion; and in the event that processing the data instance satisfies the watchdog criterion, cause the processing of the data instance to be killed.
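As a rough illustration of the watchdog check described in this abstract, the sketch below flags tasks whose progress satisfies a criterion and returns them for killing. The abstract does not say what the watchdog criterion is; the output-to-input row ratio used here, along with the names `TaskProgress`, `satisfies_watchdog`, `check_tasks`, and `max_output_ratio`, are assumptions made for the example only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskProgress:
    """Hypothetical progress counters reported for one data instance."""
    rows_read: int
    rows_produced: int


def satisfies_watchdog(progress: TaskProgress, max_output_ratio: float) -> bool:
    # Assumed criterion: the task produces far more rows than it reads.
    if progress.rows_read == 0:
        return False
    return progress.rows_produced / progress.rows_read > max_output_ratio


def check_tasks(tasks: Dict[str, TaskProgress], max_output_ratio: float) -> List[str]:
    # Return the ids of tasks that should be killed.
    return [task_id for task_id, progress in tasks.items()
            if satisfies_watchdog(progress, max_output_ratio)]


tasks = {
    "t1": TaskProgress(rows_read=1000, rows_produced=1200),
    "t2": TaskProgress(rows_read=1000, rows_produced=5_000_000),
}
to_kill = check_tasks(tasks, max_output_ratio=1000.0)
```

In a real cluster the caller would cancel the listed tasks; that step depends on the scheduler and is left out here.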
-
Publication No.: US10691433B2
Publication date: 2020-06-23
Application No.: US16119802
Filing date: 2018-08-31
Applicant: Databricks Inc.
Inventor: Srinath Shankar, Eric Keng-hao Liang, Gregory George Owen
IPC: G06F9/44, G06F8/41, G06F8/54, G06F8/70, G06F11/36, G06F11/07, G06F21/62, G06F16/23, G06F16/907
Abstract: A system for code development and execution includes a client interface and a client processor. The client interface is configured to receive user code for execution and receive an indication of a server that will perform the execution. The client processor is configured to parse the user code to identify one or more data items referred to during the execution. The client processor is also configured to provide the server with an inquiry for metadata regarding the one or more data items, receive the metadata regarding the one or more data items, determine a logical plan based at least in part on the metadata regarding the one or more data items; and provide the logical plan to the server for execution.
-
Publication No.: US09959337B2
Publication date: 2018-05-01
Application No.: US15485952
Filing date: 2017-04-12
Applicant: Databricks Inc.
Inventor: Ali Ghodsi, Ion Stoica
CPC classification number: G06F17/30598, G06F9/5033, G06F9/5072, G06F2209/505
Abstract: A cluster system includes an interface and a processor. The interface is to receive a request from a user associated with one of a plurality of shells. The processor is to determine a plurality of tasks to respond to the request; determine a local set of data and a shared set of data for a task of the plurality of tasks, wherein the local set of data is associated with the one of the plurality of shells; and provide the task, a local set indication, and a shared set indication to a worker associated with the task, wherein the local set indication refers to the local set of data and the shared set indication refers to the shared set of data.
-
Publication No.: US09769032B1
Publication date: 2017-09-19
Application No.: US14663748
Filing date: 2015-03-20
Applicant: Databricks Inc.
Inventor: Ali Ghodsi, Ion Stoica, Matei Zaharia
CPC classification number: H04L41/5051, G06F11/30, H04L41/5096, H04L43/0817
Abstract: A system for cluster management comprises a status monitor and an instance replacement manager. The status monitor is for monitoring status of an instance of a set of instances on a cluster provider. The instance replacement manager is for determining a replacement strategy for the instance in the event the instance does not respond. The replacement strategy for the instance is based at least in part on a management criteria for on-demand instances and spot instances on the cluster provider.
-
Publication No.: US09760602B1
Publication date: 2017-09-12
Application No.: US14621950
Filing date: 2015-02-13
Applicant: Databricks Inc.
Inventor: Ali Ghodsi, Ion Stoica, Matei Zaharia
CPC classification number: G06F17/30424, G06F17/30389
Abstract: A system for exploring data in a database comprises a query parser, a parameter manager, a query submitter, and a result formatter. The query parser is to receive a base query and determine an input parameter from the base query. The parameter manager is to provide a first request for a value for the input parameter; receive the value for the input parameter; and provide a second request for the value for the input parameter. The query submitter is to determine a first query using the base query and the value for the input parameter; and provide an indication to execute the first query. The result formatter is to receive a result associated with the indication to execute the first query.
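The parse, request-value, and submit flow in this abstract can be sketched roughly as follows. This is not the patented system: the `:name` placeholder syntax, the function names `find_parameters` and `run_query`, and the `sales` table are all assumptions for the example, and the value is supplied through `sqlite3` named-parameter binding rather than by building a new query string.

```python
import re
import sqlite3
from typing import Dict, List, Tuple

# Assumed placeholder syntax: ":name". This simple pattern would also match
# colons inside string literals, which a real parser would have to exclude.
PARAM = re.compile(r":([A-Za-z_]\w*)")


def find_parameters(base_query: str) -> List[str]:
    """Return the distinct input parameter names, in order of appearance."""
    names: List[str] = []
    for name in PARAM.findall(base_query):
        if name not in names:
            names.append(name)
    return names


def run_query(conn: sqlite3.Connection, base_query: str,
              values: Dict[str, object]) -> List[Tuple]:
    """Check that every parameter has a value, then execute with binding."""
    missing = [n for n in find_parameters(base_query) if n not in values]
    if missing:
        raise KeyError(f"no value provided for: {missing}")
    return conn.execute(base_query, values).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("west", 10), ("east", 20), ("west", 30)])

base = "SELECT amount FROM sales WHERE region = :region ORDER BY amount"
params = find_parameters(base)
rows = run_query(conn, base, {"region": "west"})
```

Calling `run_query` again with a different value for `region` re-runs the same base query, which loosely mirrors the second request for a value in the abstract.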
-
Publication No.: US09659081B1
Publication date: 2017-05-23
Application No.: US14824989
Filing date: 2015-08-12
Applicant: Databricks Inc.
Inventor: Ali Ghodsi, Ion Stoica
CPC classification number: G06F17/30598, G06F9/5033, G06F9/5072, G06F2209/505
Abstract: A cluster system includes an interface and a processor. The interface is to receive a request from a user associated with one of a plurality of shells. The processor is to determine a plurality of tasks to respond to the request; determine a local set of data and a shared set of data for a task of the plurality of tasks, wherein the local set of data is associated with the one of the plurality of shells; and provide the task, a local set indication, and a shared set indication to a worker associated with the task, wherein the local set indication refers to the local set of data and the shared set indication refers to the shared set of data.
-
Publication No.: US20250156423A1
Publication date: 2025-05-15
Application No.: US19000466
Filing date: 2024-12-23
Applicant: Databricks, Inc.
Inventor: Utkarsh Agarwal, Shoumik Palkar, Alexander Behm, Sriram Krishnamurthy
IPC: G06F16/2455, G06F11/34, G06F16/22
Abstract: Disclosed herein is a method, system, or non-transitory computer readable medium for evaluating a query on a columnar dataset comprising one or more dictionaries associated with columns in the dataset. The method includes receiving a request to perform a query comprising at least an operator for a columnar dataset on cloud storage. At least one column in the dataset is based on a dictionary, and the dictionary maps one or more values for a column to one or more respective identifiers. The method evaluates the operator on one or more values of the dictionary to generate an updated dictionary comprising updated values. The method may decode the updated dictionary into an updated column comprising updated data values.
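To make the idea in this abstract concrete, the sketch below evaluates an operator once per dictionary entry instead of once per row, then decodes the column through the updated dictionary. It is a minimal in-memory illustration, not the claimed method: the identifier-to-value layout, the function names `apply_on_dictionary` and `decode`, and the sample data are assumptions.

```python
from typing import Callable, Dict, List

# Assumed layout: the dictionary is stored as identifier -> value, and the
# column itself holds only identifiers.
dictionary: Dict[int, str] = {0: "apple", 1: "banana"}
column_ids: List[int] = [0, 1, 0, 0, 1]


def apply_on_dictionary(d: Dict[int, str],
                        op: Callable[[str], str]) -> Dict[int, str]:
    """Evaluate op once per distinct value and return an updated dictionary."""
    return {ident: op(value) for ident, value in d.items()}


def decode(d: Dict[int, str], ids: List[int]) -> List[str]:
    """Expand the identifiers back into data values."""
    return [d[i] for i in ids]


updated = apply_on_dictionary(dictionary, str.upper)
decoded = decode(updated, column_ids)
```

Here `str.upper` runs twice (once per distinct value) rather than five times (once per row), which is where the saving would come from on columns with few distinct values.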
-
Publication No.: US12298952B1
Publication date: 2025-05-13
Application No.: US17875180
Filing date: 2022-07-27
Applicant: Databricks, Inc.
Inventor: Timothy Armstrong, Arvind Sai Krishnan, Khayyam Guliyev
IPC: G06F16/22, G06F16/2453
Abstract: A system for multipass sort with subsplitting includes a communication interface and a processor. The communication interface is configured to receive from a client device a request to sort a dataset that includes a plurality of rows, where the size of the dataset is greater than a threshold size. The processor is configured to: subdivide the dataset into a plurality of data subsets; sort each of the plurality of data subsets; merge the plurality of sorted data subsets utilizing a binary merge tree to generate a sorted dataset; and provide the sorted dataset to the client device.
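The subdivide, sort, and merge steps named in this abstract can be sketched in memory as below. This covers only the general shape (sorted subsets combined pairwise, level by level, as in a binary merge tree); it does not model the size threshold, the subsplitting details, or any spilling to storage. The names `merge_two`, `multipass_sort`, and `subset_size` are assumptions.

```python
from typing import List


def merge_two(a: List[int], b: List[int]) -> List[int]:
    """Merge two sorted lists into one sorted list."""
    out: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def multipass_sort(rows: List[int], subset_size: int) -> List[int]:
    if subset_size <= 0:
        raise ValueError("subset_size must be positive")
    # Pass 1: subdivide the dataset and sort each subset independently.
    runs = [sorted(rows[i:i + subset_size])
            for i in range(0, len(rows), subset_size)]
    if not runs:
        return []
    # Later passes: merge neighbouring runs pairwise until one run remains.
    while len(runs) > 1:
        next_runs: List[List[int]] = []
        for i in range(0, len(runs), 2):
            if i + 1 < len(runs):
                next_runs.append(merge_two(runs[i], runs[i + 1]))
            else:
                next_runs.append(runs[i])  # odd run carries to the next level
        runs = next_runs
    return runs[0]


result = multipass_sort([5, 3, 9, 1, 7, 2, 8], subset_size=2)
```

Each level of the loop halves the number of runs, so the number of merge passes grows with the logarithm of the number of subsets.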
-
Publication No.: US20250131118A1
Publication date: 2025-04-24
Application No.: US18958728
Filing date: 2024-11-25
Applicant: Databricks, Inc.
Inventor: Matei Zaharia, Shixiong Zhu, Xiaotong Sun, Ramesh Chandra, Michael Paul Armbrust, Ali Ghodsi
Abstract: The present application discloses a method, system, and computer system for providing access to data. The method includes receiving, by a data manager service from a data requesting service, a request using an identifier for a high-level data object to access a set of data associated with the high-level data object, determining, by the data manager service, low-level data object(s) corresponding to the set of data based on the identifier for the high-level data object, determining whether a user associated with the request has permission to access at least a subset of the low-level data object(s), and in response to determining that the user associated has permission to access the at least the subset of the low-level data object(s), generating, by the data manager service, a uniform resource locator (URL) via which the at least the subset of the one or more low-level data objects is accessible by the user.
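The resolve, check-permission, and generate-URL sequence in this abstract might look roughly like the sketch below. Everything in it is hypothetical: the in-memory `CATALOG` and `GRANTS` tables, the HMAC-based signing scheme, the `storage.example.invalid` host, and the function names. The abstract does not state how the URL is constructed or secured.

```python
import hashlib
import hmac
from typing import Dict, List, Set
from urllib.parse import urlencode

SECRET = b"demo-secret"  # hypothetical signing key

# Hypothetical mapping: high-level object identifier -> low-level objects.
CATALOG: Dict[str, List[str]] = {
    "sales_table": ["part-0.parquet", "part-1.parquet"],
}
# Hypothetical per-user grants on low-level objects.
GRANTS: Dict[str, Set[str]] = {
    "alice": {"part-0.parquet", "part-1.parquet"},
    "bob": {"part-0.parquet"},
}


def sign_url(path: str, user: str, expires: int) -> str:
    """Build a URL carrying an HMAC signature over path, user, and expiry."""
    message = f"{path}|{user}|{expires}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    query = urlencode({"user": user, "expires": expires, "sig": signature})
    return f"https://storage.example.invalid/{path}?{query}"


def urls_for(user: str, object_id: str, expires: int) -> List[str]:
    """Resolve the low-level objects, keep the permitted ones, sign each."""
    low_level = CATALOG[object_id]
    allowed = [p for p in low_level if p in GRANTS.get(user, set())]
    return [sign_url(p, user, expires) for p in allowed]


alice_urls = urls_for("alice", "sales_table", expires=1700000000)
bob_urls = urls_for("bob", "sales_table", expires=1700000000)
```

A user with access to only a subset of the low-level objects, such as `bob` here, receives URLs for that subset only.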