Abstract:
Systems and methods for centralized configuration of a distributed computing cluster are disclosed. One embodiment of the disclosed technology provides a user environment that facilitates selection of a service to be run on hosts in the distributed computing cluster and configuration of the service or hosts in the distributed computing cluster. The disclosed technology can further configure each of the hosts in the distributed computing cluster to run the service based on a set of configuration settings.
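As a rough illustration, a centralized configurator might push one set of settings to every host selected to run a service. The sketch below assumes a simple push model; the class and method names (ClusterConfigurator, Host, applySettings) are hypothetical and not taken from the disclosure.

```java
import java.util.List;
import java.util.Map;

public class ClusterConfigurator {

    /** A host in the cluster; a real system would wrap an RPC client here. */
    static class Host {
        final String name;
        Host(String name) { this.name = name; }

        // Stand-in for sending settings to the host's agent process.
        void applySettings(String service, Map<String, String> settings) {
            System.out.printf("host=%s service=%s settings=%s%n", name, service, settings);
        }
    }

    /** Configure every host to run the selected service from one central settings set. */
    static void configure(List<Host> hosts, String service, Map<String, String> settings) {
        for (Host h : hosts) {
            h.applySettings(service, settings);
        }
    }

    public static void main(String[] args) {
        List<Host> hosts = List.of(new Host("node-1"), new Host("node-2"));
        configure(hosts, "hdfs-datanode", Map.of("dfs.datanode.data.dir", "/data/dfs"));
    }
}
```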
Abstract:
Scalable architectures, systems, and services are provided herein for creating manifest-based snapshots in distributed computing environments. In some embodiments, responsive to receiving a request to create a snapshot of a data object, a master node identifies multiple slave nodes on which the data object is stored in the cloud-computing platform and creates a snapshot manifest representing the snapshot of the data object. The snapshot manifest comprises a file that lists the names of the multiple files making up the snapshot, along with reference information for locating those files in the distributed database system. The snapshot can be created without disrupting I/O operations, e.g., in an online mode by various region servers as directed by the master node. Additionally, a log roll approach to creating the snapshot is also disclosed in which log files are marked. Replaying the marked log entries can reduce the probability of causal inconsistency in the snapshot.
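A minimal sketch of the manifest idea follows: a single file listing the names of the files that make up the snapshot, plus reference information for locating them, with no data copied. The record layout and class names (SnapshotManifest, ManifestEntry) are illustrative assumptions, not the disclosure's actual on-disk format.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SnapshotManifest {

    /** One file in the snapshot: its name and where to find it. */
    record ManifestEntry(String fileName, String regionServer, String path) {}

    private final List<ManifestEntry> entries = new ArrayList<>();

    void add(String fileName, String regionServer, String path) {
        entries.add(new ManifestEntry(fileName, regionServer, path));
    }

    /** Write the manifest; only references are recorded, no data files are copied. */
    void write(Path out) throws IOException {
        try (PrintWriter w = new PrintWriter(Files.newBufferedWriter(out))) {
            for (ManifestEntry e : entries) {
                w.printf("%s\t%s\t%s%n", e.fileName(), e.regionServer(), e.path());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        SnapshotManifest m = new SnapshotManifest();
        m.add("hfile-0001", "rs-17", "/hbase/data/t1/region-a/hfile-0001");
        m.write(Path.of("snapshot.manifest"));
    }
}
```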
Abstract:
Systems and methods of a memory allocation buffer to reduce heap fragmentation are disclosed. In one embodiment, the memory allocation buffer structures a memory arena dedicated to a target region that is one of a plurality of regions in a server in a database cluster such as an HBase cluster. The memory arena has a chunk size (e.g., 2 MB) and an offset pointer. Data objects in write requests targeted to the region are received and inserted into the memory arena at the location specified by the offset pointer. When a memory arena is filled, a new one is allocated. When the MemStore of the target region is flushed, all memory arenas for the target region are freed at once. This reduces the heap fragmentation that is responsible for long and/or frequent garbage collection pauses.
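The sketch below shows the arena mechanics described above: a fixed-size chunk with a bump-the-pointer offset, a fresh chunk when one fills, and a bulk free on flush. It mirrors HBase's MemStore-Local Allocation Buffer concept, but the class is a simplified illustration, not HBase's actual MemStoreLAB code.

```java
import java.util.ArrayList;
import java.util.List;

public class AllocationBuffer {
    static final int CHUNK_SIZE = 2 * 1024 * 1024; // 2 MB per arena chunk

    private byte[] chunk = new byte[CHUNK_SIZE];
    private int offset = 0;                         // next free byte in the chunk
    private final List<byte[]> retired = new ArrayList<>();

    /** Copy data into the arena; assumes data fits in one chunk. Returns its offset. */
    int insert(byte[] data) {
        if (offset + data.length > CHUNK_SIZE) {    // chunk full: start a new one
            retired.add(chunk);
            chunk = new byte[CHUNK_SIZE];
            offset = 0;
        }
        System.arraycopy(data, 0, chunk, offset, data.length);
        int at = offset;
        offset += data.length;
        return at;
    }

    /** On MemStore flush, drop every chunk at once: no per-object fragmentation. */
    void freeAll() {
        retired.clear();
        chunk = new byte[CHUNK_SIZE];
        offset = 0;
    }

    public static void main(String[] args) {
        AllocationBuffer arena = new AllocationBuffer();
        int at = arena.insert("row-key:value".getBytes());
        System.out.println("inserted at offset " + at);
        arena.freeAll(); // e.g., after the region's MemStore flushes
    }
}
```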
Abstract:
Systems and methods of collecting and aggregating log data with fault tolerance are disclosed. One embodiment includes one or more machines that generate log data, each machine associated with an agent node that collects the log data, wherein the agent node generates a batch comprising multiple messages from the log data and assigns a tag to the batch. In one embodiment, the agent node further computes a checksum for the batch of multiple messages. The system may further include a collector device, the collector device being associated with a collector tier having a collector node to which the agent sends the log data, wherein the collector determines the checksum for the batch of multiple messages received from the agent node.
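A hedged sketch of the agent-side batching: group messages into a batch, assign a tag, and compute a checksum the collector can recompute to verify the batch arrived intact. The names (LogBatch, makeBatch) and the choice of CRC32 are illustrative assumptions, not the actual API of the disclosed system.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

public class AgentBatching {

    record LogBatch(String tag, List<String> messages, long checksum) {}

    /** Checksum over every message in the batch, in order. */
    static long checksumOf(List<String> messages) {
        CRC32 crc = new CRC32();
        for (String m : messages) {
            crc.update(m.getBytes(StandardCharsets.UTF_8));
        }
        return crc.getValue();
    }

    /** Agent side: tag the batch and attach its checksum before sending. */
    static LogBatch makeBatch(String tag, List<String> messages) {
        return new LogBatch(tag, messages, checksumOf(messages));
    }

    public static void main(String[] args) {
        LogBatch b = makeBatch("agent-7:0001", List.of("line a", "line b"));
        // Collector side: recompute the checksum and compare before acknowledging.
        boolean intact = checksumOf(b.messages()) == b.checksum();
        System.out.println("batch " + b.tag() + " intact=" + intact);
    }
}
```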
Abstract:
A compaction policy imposing soft limits to optimize system efficiency is used to select various rowsets on which to perform compaction, each rowset storing keys within an interval called a keyspace. For example, the disclosed compaction policy results in a decrease in the height of the tablet, removes overlapping rowsets, and creates smaller rowsets. The compaction policy is based on the linear relationship between the keyspace height and the cost of performing an operation (e.g., an insert operation) in that keyspace. Accordingly, the compaction policy considers which rowsets are to be compacted, how large the compacted rowsets are to be made, and when to perform the compaction. Furthermore, a system and method for performing compaction on the selected rowsets in a log-structured database is also provided.
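To make the "height" notion concrete: the height at a key is the number of rowsets whose key interval contains it, and compacting overlapping rowsets reduces that height. The greedy probe below is an illustrative stand-in for the disclosed cost-based policy, not the actual selection algorithm.

```java
import java.util.ArrayList;
import java.util.List;

public class CompactionSelection {

    record RowSet(String name, String minKey, String maxKey) {
        boolean contains(String key) {
            return minKey.compareTo(key) <= 0 && key.compareTo(maxKey) <= 0;
        }
    }

    /** Height of the keyspace at a probe key: the rowsets overlapping that key. */
    static List<RowSet> overlappingAt(List<RowSet> rowsets, String key) {
        List<RowSet> hits = new ArrayList<>();
        for (RowSet r : rowsets) {
            if (r.contains(key)) hits.add(r);
        }
        return hits;
    }

    public static void main(String[] args) {
        List<RowSet> rowsets = List.of(
                new RowSet("rs1", "a", "m"),
                new RowSet("rs2", "f", "t"),
                new RowSet("rs3", "u", "z"));
        // Height 2 at "g": compacting rs1 and rs2 into one rowset lowers the
        // cost of operations landing in that interval.
        System.out.println(overlappingAt(rowsets, "g"));
    }
}
```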
Abstract:
Systems and methods of dynamically processing an event using an extensible data model are disclosed. One embodiment includes specifying attributes of the event in a data model, the data model being extensible to add properties to the event as the data is streamed from the source to the sink.
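A minimal sketch of such an extensible event: fixed attributes plus an open-ended header map, so any stage between source and sink can attach new properties in flight without a schema change. Modeled loosely on Flume's event-with-headers idea; the class itself is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

public class ExtensibleEvent {
    private final byte[] body;                              // fixed attribute: payload
    private final Map<String, String> headers = new HashMap<>(); // extensible properties

    ExtensibleEvent(byte[] body) { this.body = body; }

    /** Any hop in the pipeline may add a property without changing the model. */
    void setHeader(String key, String value) { headers.put(key, value); }

    Map<String, String> headers() { return headers; }
    byte[] body() { return body; }

    public static void main(String[] args) {
        ExtensibleEvent e = new ExtensibleEvent("payload".getBytes());
        e.setHeader("timestamp", Long.toString(System.currentTimeMillis())); // added at source
        e.setHeader("hostname", "agent-3");                                  // added in flight
        System.out.println(e.headers());                                     // sink sees both
    }
}
```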
Abstract:
A format conversion engine for Apache Hadoop that converts data from its original format to a database-like format at certain time points for use by a low latency (LL) query engine. The format conversion engine comprises a daemon that is installed on each data node in a Hadoop cluster. The daemon comprises a scheduler and a converter. The scheduler determines when to perform the format conversion and notifies the converter when a scheduled conversion is due. The converter then converts the data on its data node from the original format to the database-like format for use by the LL query engine.
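The daemon's two roles can be sketched as a scheduler that fires at set time points and a converter it notifies. The fixed six-hour interval and the empty convert body are placeholders; the abstract does not specify the real scheduling policy or target format.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class FormatConversionDaemon {

    /** Stand-in for rewriting this node's files from the original format. */
    static void convertLocalData() {
        System.out.println("converting local files to database-like format...");
    }

    public static void main(String[] args) {
        // Scheduler: fire at fixed time points and notify the converter.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(
                FormatConversionDaemon::convertLocalData,
                0, 6, TimeUnit.HOURS);  // assumed interval; policy is not specified
    }
}
```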
Abstract:
Systems and methods for data node fencing in a distributed file system to prevent data inconsistencies and corruption are disclosed. An embodiment includes a protocol whereby data nodes detect a failover and determine the active name node based on transaction identifiers associated with transaction requests. The data nodes also provide block location information and an acknowledgment to the active name node. The embodiment further includes a protocol whereby a name node refrains from issuing invalidation requests to the data nodes until the name node receives acknowledgments from all functional data nodes.
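One way to read the fencing rule: a data node remembers the highest transaction identifier it has seen and accepts a name node as active only if it presents an equal or higher identifier, rejecting requests from a stale former active. The sketch below is a hedged illustration of that rule; the class and method names are hypothetical.

```java
public class DataNodeFencing {
    private long highestTxId = -1;
    private String activeNameNode = null;

    /** Called when a name node sends a request carrying its latest transaction id. */
    synchronized boolean acceptRequest(String nameNode, long txId) {
        if (txId < highestTxId) {
            return false;                 // stale name node: fence it off
        }
        if (txId > highestTxId || activeNameNode == null) {
            highestTxId = txId;
            activeNameNode = nameNode;    // failover detected: switch active
            sendBlockReportAndAck(nameNode);
        }
        return nameNode.equals(activeNameNode);
    }

    private void sendBlockReportAndAck(String nameNode) {
        // Stand-in for providing block location information and an acknowledgment.
        System.out.println("block report + ack -> " + nameNode);
    }

    public static void main(String[] args) {
        DataNodeFencing dn = new DataNodeFencing();
        System.out.println(dn.acceptRequest("nn-1", 100)); // true: nn-1 becomes active
        System.out.println(dn.acceptRequest("nn-2", 150)); // true: failover to nn-2
        System.out.println(dn.acceptRequest("nn-1", 120)); // false: nn-1 is fenced
    }
}
```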
Abstract:
Introduced here is a resource management platform (also called a "resource manager") that is able to dynamically allocate compute resources to workloads to accommodate the resource requirements of a tenant in a more efficient and cost-effective manner, especially in scenarios where compute resource availability is elastic in nature. The resource manager can include a scheduling engine and a recommending engine that together optimize the scaling up and down of compute resources in different scenarios. Typically, the resource manager communicates with a resource-aware external entity that may be responsible for implementing appropriate changes on a cloud infrastructure. For example, the external entity may be responsible for adding or removing nodes assigned to a given tenant, as well as obtaining relevant attributes of those compute resources from a provider of the cloud infrastructure.
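The division of labor might look like the sketch below: a recommending engine turns workload demand into a scale-up or scale-down recommendation, which the resource-aware external entity would then apply on the cloud infrastructure. The sizing rule (one node per fixed number of pending containers) is an assumed placeholder, not the platform's actual logic.

```java
public class RecommendingEngine {
    static final int CONTAINERS_PER_NODE = 8;   // assumed per-node capacity

    /** Positive: nodes to add for the tenant; negative: nodes that can be released. */
    static int recommendNodeDelta(int pendingContainers, int idleNodes) {
        if (pendingContainers > 0) {
            // Round up so every pending container gets capacity.
            return (pendingContainers + CONTAINERS_PER_NODE - 1) / CONTAINERS_PER_NODE;
        }
        return -idleNodes;                      // elastic: hand back unused capacity
    }

    public static void main(String[] args) {
        System.out.println(recommendNodeDelta(20, 0)); // scale up by 3 nodes
        System.out.println(recommendNodeDelta(0, 2));  // scale down by 2 nodes
    }
}
```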