Patent search ap:"Cloudera Page Inc."

41.

Invention patent application
CENTRALIZED CONFIGURATION OF A DISTRIBUTED COMPUTING CLUSTER (status: in force)

Publication number: US20150039735A1

Publication date: 2015-02-05

Application number: US14509300

Application date: 2014-10-08

Applicant: Cloudera, Inc.

Inventor: Philip Zeyliger, Philip Lee Langdale, Patrick David Hunt

IPC: H04L12/24, H04L29/08

CPC classification number: H04L41/0816, G06F9/44505, G06F11/3051, G06F11/3055, G06F11/328, G06F11/3409, H04L41/5041, H04L43/0817, H04L67/16

Abstract: Systems and methods for centralized configuration of a distributed computing cluster are disclosed. One embodiment of the disclosed technology provides a user environment that facilitates a selection of a service to be run on hosts in the distributed computing cluster and configuration of the service or hosts in the distributed computer cluster. The disclosed technology can further configure each of the hosts in the distributed computing cluster to run the service based on a set of configuration settings.

42.

Granted invention patent
Ensuring properly ordered events in a distributed computing environment (status: in force)

Publication number: US12255978B2

Publication date: 2025-03-18

Application number: US18357021

Application date: 2023-07-21

Applicant: Cloudera, Inc.

Inventor: David Alves, Todd Lipcon

IPC: H04L69/28, G06F1/14, G06F11/07, G06Q10/00, G06Q50/26, H04L67/10

Abstract: A first event occurs at a first computer at a first time, as measured by a local clock. A second event is initiated at a second computer by sending a message that includes the first time. The second event occurs at a second time, as measured by a local clock. Because of clock error, the first time is later than the second time. Based on the first time being later than the second time, an alternate second time, that is based on the first time, is used as the time of the second event. When a third system determines the order of the two events, the first time is obtained from the first computer, and the alternate second time is obtained from the second computer, and the order of the events is determined based on a comparison of the two times.

43.

Granted invention patent
Hyperparameter tuning using visual analytics in a data science platform (status: in force)

Publication number: US12248888B2

Publication date: 2025-03-11

Application number: US16138684

Application date: 2018-09-21

Applicant: Cloudera, Inc.

Inventor: Gregorio Convertino, Tianyi Li, Haley Allen Most, Wenbo Wang, Yi-Hsun Tsai, Michael Tristan Zajonc, Michael John Lee Williams

IPC: G06N7/01, G06F11/34, G06F16/904, G06N20/00

Abstract: Techniques are disclosed for facilitating the tuning of hyperparameter values during the development of machine learning (ML) models using visual analytics in a data science platform. In an example embodiment, a computer-implemented data science platform is configured to generate, and display to a user, interactive visualizations that dynamically change in response to user interaction. Using the introduced technique, a user can, for example, 1) tune hyperparameters through an iterative process using visual analytics to gain and use insights into how certain hyperparameters affect model performance and convergence, 2) leverage automation and recommendations along this process to optimize the tuning given available resources, 3) collaborate with peers, and 4) view costs associated with executing experiments during the tuning process.

44.

Granted invention patent
Utilization-aware resource scheduling in a distributed computing cluster (status: in force)

Publication number: US12223349B2

Publication date: 2025-02-11

Application number: US17379742

Application date: 2021-07-19

Applicant: Cloudera, Inc.

Inventor: Karthik Kambatla

IPC: G06F9/44, G06F9/48, G06F9/50

Abstract: Embodiments are disclosed for a utilization-aware approach to cluster scheduling, to address resource fragmentation and to improve cluster utilization and job throughput. In some embodiments, a resource manager at a master node considers actual usage of running tasks and schedules opportunistic work on underutilized worker nodes. The resource manager monitors resource usage on these nodes and preempts opportunistic containers in the event this over-subscription becomes untenable. In doing so, the resource manager effectively utilizes wasted resources, while minimizing adverse effects on regularly scheduled tasks.

45.

Published invention application
SNAPSHOT COMPARISON WITH METADATA COMPACTION (status: under examination, published)

Publication number: US20230385157A1

Publication date: 2023-11-30

Application number: US18325853

Application date: 2023-05-30

Applicant: Cloudera, Inc.

Inventor: Prashant Pogde, Siddharth Wagle, Siyao Meng, Nandakumar Vadivelu, Sadanand Shenoy

IPC: G06F11/14, G06F16/11, G06F16/13, G06F16/182, G06F16/178

CPC classification number: G06F11/1458, G06F16/122, G06F16/134, G06F16/1844, G06F16/178, G06F2201/84

Abstract: Snapshot or point-in-time image functionality improves the use of object-based datastores. An example system includes an object-based datastore and a metadata datastore associated with the object-based datastore. Instances of the metadata datastore are created as snapshot images of the object-based datastore. Comparison of snapshot images is important for database analytics, disaster recovery, data protection, and more. Example techniques provide comparison of snapshot images (as metadata datastore instances) and remain robust and accurate in view of compactions performed by the metadata datastore. An example technique includes generating and updating a graph-based data structure that captures relationships between metadata files in the metadata datastore, particularly between pre-compaction files and post-compaction files. The example technique further includes referencing the graph-based data structure to accelerate snapshot image comparison based on determining whether files of a source snapshot image were compacted into files of a destination snapshot image, and/or vice versa.

46.

Granted invention patent
Manifest-based snapshots in distributed computing environments (status: in force)

Publication number: US11768739B2

Publication date: 2023-09-26

Application number: US16943674

Application date: 2020-07-30

Applicant: Cloudera, Inc.

Inventor: Jonathan Ming-Cyn Hsieh, Matteo Bertozzi

IPC: G06F16/27, G06F11/14

CPC classification number: G06F11/1464, G06F16/27, G06F11/1456, G06F2201/84

Abstract: Scalable architectures, systems, and services are provided herein for creating manifest-based snapshots in distributed computing environments. In some embodiments, responsive to receiving a request to create a snapshot of a data object, a master node identifies multiple slave nodes on which a data object is stored in the cloud-computing platform and creates a snapshot manifest representing the snapshot of the data object. The snapshot manifest comprises a file including a listing of multiple file names in the snapshot manifest and reference information for locating the multiple files in the distributed database system. The snapshot can be created without disrupting I/O operations, e.g., in an online mode by various region servers as directed by the master node. Additionally, a log roll approach to creating the snapshot is also disclosed in which log files are marked. The replaying of log entries can reduce the probability of causal inconsistency in the snapshot.

47.

Granted invention patent
Background format optimization for enhanced queries in a distributed computing cluster (status: in force)

Publication number: US11630830B2

Publication date: 2023-04-18

Application number: US16921558

Application date: 2020-07-06

Applicant: Cloudera, Inc.

Inventor: Marcel Kornacker, Justin Erickson, Nong Li, Lenni Kuff, Henry Noel Robinson, Alan Choi, Alex Behm

IPC: G06F16/2458, G06F16/25, G06F16/27, G06F16/2453

Abstract: A format conversion engine for Apache Hadoop that converts data from its original format to a database-like format at certain time points for use by a low latency (LL) query engine. The format conversion engine comprises a daemon that is installed on each data node in a Hadoop cluster. The daemon comprises a scheduler and a converter. The scheduler determines when to perform the format conversion and notifies the converter when the time comes. The converter converts data on the data node from its original format to a database-like format for use by the low latency (LL) query engine.

48.

Invention patent application
ENSURING PROPERLY ORDERED EVENTS IN A DISTRIBUTED COMPUTING ENVIRONMENT (status: in force)

Publication number: US20220382323A1

Publication date: 2022-12-01

Application number: US17836909

Application date: 2022-06-09

Applicant: Cloudera, Inc.

Inventor: David Alves, Todd Lipcon

IPC: G06F1/14, G06F11/07

Abstract: A first event occurs at a first computer at a first time, as measured by a local clock. A second event is initiated at a second computer by sending a message that includes the first time. The second event occurs at a second time, as measured by a local clock. Because of clock error, the first time is later than the second time. Based on the first time being later than the second time, an alternate second time, that is based on the first time, is used as the time of the second event. When a third system determines the order of the two events, the first time is obtained from the first computer, and the alternate second time is obtained from the second computer, and the order of the events is determined based on a comparison of the two times.

49.

Invention patent application
MUTATIONS IN A COLUMN STORE (status: in force)

Publication number: US20210271653A1

Publication date: 2021-09-02

Application number: US17314813

Application date: 2021-05-07

Applicant: Cloudera, Inc.

Inventor: Todd Lipcon

IPC: G06F16/22, G06F16/23

Abstract: Columnar storage provides many performance and space saving benefits for analytic workloads, but previous mechanisms for handling single row update transactions in column stores suffer from poor performance. A columnar data layout facilitates both low-latency random access capabilities together with high-throughput analytical access capabilities, simplifying Hadoop architectures for use cases involving real-time data. In disclosed embodiments, mutations within a single row are executed atomically across columns and do not necessarily include the entirety of a row. This allows for faster updates without the overhead of reading or rewriting larger columns.

50.

Granted invention patent
Utilization-aware resource scheduling in a distributed computing cluster (status: in force)

Publication number: US11099892B2

Publication date: 2021-08-24

Application number: US16797996

Application date: 2020-02-21

Applicant: Cloudera, Inc.

Inventor: Karthik Kambatla

IPC: G06F9/44, G06F9/48, G06F9/50

Abstract: Embodiments are disclosed for a utilization-aware approach to cluster scheduling, to address resource fragmentation and to improve cluster utilization and job throughput. In some embodiments, a resource manager at a master node considers actual usage of running tasks and schedules opportunistic work on underutilized worker nodes. The resource manager monitors resource usage on these nodes and preempts opportunistic containers in the event this over-subscription becomes untenable. In doing so, the resource manager effectively utilizes wasted resources, while minimizing adverse effects on regularly scheduled tasks.