Patent search ap:"Cloudera Page Inc."

11.

Granted patent
Utilization-aware resource scheduling in a distributed computing cluster (in force)

Publication number: US10572306B2

Publication date: 2020-02-25

Application number: US15595713

Application date: 2017-05-15

Applicant: Cloudera, Inc.

Inventor: Karthik Kambatla

IPC: G06F9/50, G06F9/48

Abstract: Embodiments are disclosed for a utilization-aware approach to cluster scheduling, to address this resource fragmentation and to improve cluster utilization and job throughput. In some embodiments a resource manager at a master node considers actual usage of running tasks and schedules opportunistic work on underutilized worker nodes. The resource manager monitors resource usage on these nodes and preempts opportunistic containers in the event this over-subscription becomes untenable. In doing so, the resource manager effectively utilizes wasted resources, while minimizing adverse effects on regularly scheduled tasks.
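The scheduling idea in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Node` fields, the two thresholds, and the assumption that an opportunistic task immediately consumes its full requested size are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    capacity: float          # total resource units on the worker
    allocated: float = 0.0   # units reserved by regularly scheduled containers
    used: float = 0.0        # units actually in use, as reported by monitoring
    opportunistic: list = field(default_factory=list)  # (task_id, size) pairs


# Hypothetical thresholds, expressed as fractions of node capacity.
SCHEDULE_BELOW = 0.7
PREEMPT_ABOVE = 0.9


def try_schedule_opportunistic(node, task_id, size):
    """Place opportunistic work if projected actual usage stays under the threshold.

    The decision looks at actual usage, not at the reserved allocation.
    """
    if node.used + size <= SCHEDULE_BELOW * node.capacity:
        node.opportunistic.append((task_id, size))
        node.used += size  # simplification: the task uses its full size at once
        return True
    return False


def preempt_if_overloaded(node):
    """Evict the newest opportunistic containers until usage is tenable again."""
    preempted = []
    while node.used > PREEMPT_ABOVE * node.capacity and node.opportunistic:
        task_id, size = node.opportunistic.pop()
        node.used -= size
        preempted.append(task_id)
    return preempted
```

In this sketch a node whose reservations nearly fill it can still accept opportunistic work while its measured usage is low, and that work is the first to be evicted once regular tasks ramp up.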

12.

Patent application
DESIGN-TIME INFORMATION BASED ON RUN-TIME ARTIFACTS IN TRANSIENT CLOUD-BASED DISTRIBUTED COMPUTING CLUSTERS (under examination, published)

Publication number: US20190138654A1

Publication date: 2019-05-09

Application number: US15943603

Application date: 2018-04-02

Applicant: Cloudera, Inc.

Inventor: Sudhanshu Arora, Mark Donsky, Guang Yao Leng, Naren Koneru, Chang She, Vikas Singh, Himabindu Vuppula

IPC: G06F17/30, G06N5/04, G06F9/455

Abstract: Transient computing clusters can be temporarily provisioned in cloud-based infrastructure to run data processing tasks. Such tasks may be run by services operating in the clusters that consume and produce data including operational metadata. Techniques are introduced for tracking data lineage across multiple clusters, including transient computing clusters, based on the operational metadata. In some embodiments, operational metadata is extracted from the transient computing clusters and aggregated at a metadata system for analysis. Based on the analysis of the metadata, operations can be summarized at a cluster level even if the transient computing cluster no longer exists. Further relationships between workflows, such as dependencies or redundancies, can be identified and utilized to optimize the provisioning of computing clusters and tasks performed by the computing clusters.

13.

Patent application
ENSURING PROPERLY ORDERED EVENTS IN A DISTRIBUTED COMPUTING ENVIRONMENT (under examination, published)

Publication number: US20190109930A1

Publication date: 2019-04-11

Application number: US16198677

Application date: 2018-11-21

Applicant: Cloudera, Inc.

Inventor: David Alves, Todd Lipcon

IPC: H04L29/06, G06Q50/26, G06Q10/00, H04L29/08, G06F1/14

CPC classification number: H04L69/28, G06F1/14, G06Q10/00, G06Q50/26, H04L67/10

Abstract: A first event occurs at a first computer at a first time, as measured by a local clock. A second event is initiated at a second computer by sending a message that includes the first time. The second event occurs at a second time, as measured by a local clock. Because of clock error, the first time is later than the second time. Based on the first time being later than the second time, an alternate second time, that is based on the first time, is used as the time of the second event. When a third system determines the order of the two events, the first time is obtained from the first computer, and the alternate second time is obtained from the second computer, and the order of the events is determined based on a comparison of the two times.
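The timestamp adjustment described in this abstract can be sketched roughly as follows. The function name, the integer timestamps, and the choice of "first time plus one tick" as the alternate second time are assumptions; the abstract only says that the alternate time is based on the first time.

```python
def event_time(local_clock_time, causing_event_time=None, tick=1):
    """Return the timestamp to record for an event on this machine.

    causing_event_time is the timestamp carried by the message that initiated
    the event, or None if the event has no remote cause. If clock error makes
    the local reading no later than the causing event, an alternate time
    derived from the causing event is used instead (ties are also adjusted,
    which is an assumption of this sketch).
    """
    if causing_event_time is not None and causing_event_time >= local_clock_time:
        return causing_event_time + tick
    return local_clock_time


# The first computer records its event at local time 105 and sends that time.
first_time = event_time(105)
# The second computer's clock lags and reads 100 when the message arrives.
second_time = event_time(100, causing_event_time=first_time)
# A third system comparing the two recorded times sees the correct order.
ordered_correctly = first_time < second_time
```

When the second computer's clock is already ahead of the received time, the local reading is kept unchanged.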

14.

Granted patent
Configuring a system to collect and aggregate datasets (in force)

Publication number: US10187461B2

Publication date: 2019-01-22

Application number: US15098198

Application date: 2016-04-13

Applicant: Cloudera, Inc.

Inventor: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson

IPC: H04L29/08, G06F17/30, G06F11/20, H04L12/24, G06F11/34

Abstract: Methods for configuring a system to collect and aggregate datasets are disclosed. One embodiment includes, identifying a data source in the system from where dataset is to be collected, configuring a machine in the system that generates the dataset to be collected, to send the dataset to the data source, identifying an arrival location where the dataset that is collected is to be aggregated or written, and/or configuring an agent node by specifying a source for the agent node as the data source in the system and specifying a sink for the agent node as the arrival location.

15.

Granted patent
Data node fencing in a distributed file system (in force)

Publication number: US09753954B2

Publication date: 2017-09-05

Application number: US14024585

Application date: 2013-09-11

Applicant: Cloudera, Inc.

Inventor: Todd Lipcon, Aaron T. Myers, Eli Collins

IPC: G06F17/30, G06F15/16, G06F11/20, H04L12/24

CPC classification number: G06F17/30303, G06F11/2028, G06F11/2038, G06F11/2046, G06F17/30197, H04L41/0836

Abstract: Systems and methods for data node fencing in a distributed file system to prevent data inconsistencies and corruptions are disclosed. An embodiment includes implementing a protocol whereby data nodes detect a failover and determine an active name node based on transaction identifiers associated with transaction requests. The data nodes also provide to the active name node block location information and an acknowledgment. The embodiment further includes a protocol whereby a name node refrains from issuing invalidation requests to the data nodes until the name node receives acknowledgments from all data nodes that are functional.
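A rough sketch of the two protocols in this abstract is given below. The class and method names are hypothetical, and the acceptance rule (a strictly higher transaction identifier marks the sender as active) is an assumption about how transaction identifiers might be compared, not a statement of the patented method.

```python
class DataNode:
    """Tracks which name node is active, based on transaction identifiers."""

    def __init__(self, name):
        self.name = name
        self.highest_txid = -1
        self.active_name_node = None

    def handle_request(self, name_node_id, txid):
        """Return True if the request is accepted, False if it is fenced off."""
        if txid > self.highest_txid:
            # A newer transaction id marks the sender as the active name node.
            self.highest_txid = txid
            self.active_name_node = name_node_id
            return True
        # Equal or older ids are honored only from the current active name node.
        return name_node_id == self.active_name_node


class NameNode:
    """Holds back invalidation requests until all functional data nodes ack."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.acks = set()

    def record_ack(self, data_node_name):
        self.acks.add(data_node_name)

    def may_issue_invalidations(self, functional_data_nodes):
        # True only when every functional data node has acknowledged.
        return set(functional_data_nodes) <= self.acks
```

After a failover, a stale name node that still sends requests with an old transaction identifier is rejected by the data nodes, and the new active name node waits for acknowledgments before invalidating any blocks.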

16.

Granted patent
Data processing performance enhancement in a distributed file system (in force)

Publication number: US09600492B2

Publication date: 2017-03-21

Application number: US15225533

Application date: 2016-08-01

Applicant: Cloudera, Inc.

Inventor: Todd Lipcon

IPC: G06F7/00, G06F17/30, G06F12/08, G06F13/20

CPC classification number: G06F17/30194, G06F12/0804, G06F12/0866, G06F12/0871, G06F13/20, G06F17/30203, G06F2212/214, G06F2212/603

Abstract: Systems and methods of data processing performance enhancement are disclosed. One embodiment includes, invoking operating system calls to optimize cache management by an I/O component; wherein, the operating system calls are invoked to perform one or more of; proactive triggering of readaheads for sequential read requests of a disk; purging data out of buffer cache after writing to the disk or performing sequential reads from the disk; and/or eliminating a delay between when a write is performed and when written data from the write is flushed to the disk from the buffer cache.
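The kinds of operating system calls this abstract refers to can be sketched with Python's `os` module. This is an illustration only: the abstract does not name specific calls, so the use of `posix_fadvise` and `fsync` here is an assumption, and the helper names are hypothetical. `os.posix_fadvise` exists only on some Unix platforms, so the sketch skips the hints where it is unavailable.

```python
import os

HAVE_FADVISE = hasattr(os, "posix_fadvise")


def write_flush_and_drop(path, data):
    """Write data, flush it to disk right away, then drop it from the page cache."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            written = os.write(fd, view)  # os.write may write fewer bytes
            view = view[written:]
        os.fsync(fd)  # do not wait for the kernel's delayed writeback
        if HAVE_FADVISE:
            # Length 0 means "to the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)


def read_sequential(path, chunk_size=1 << 20):
    """Read a whole file, hinting the kernel to read ahead aggressively."""
    fd = os.open(path, os.O_RDONLY)
    try:
        if HAVE_FADVISE:
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        if HAVE_FADVISE:
            # The data was read once and is not expected to be reused.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

The hints are advisory: the kernel may ignore them, so the functions behave identically in terms of data written and read whether or not the hints take effect.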

17.

Granted patent
Memory allocation buffer for reduction of heap fragmentation (in force)

Publication number: US09552165B2

Publication date: 2017-01-24

Application number: US14846413

Application date: 2015-09-04

Applicant: Cloudera, Inc.

Inventor: Todd Lipcon

IPC: G06F17/30, G06F3/06, H04L29/08

CPC classification number: G06F3/0611, G06F3/0631, G06F3/0644, G06F3/0652, G06F3/067, G06F3/0683, G06F12/0808, G06F12/12, G06F12/128, G06F17/30138, G06F2212/1044, H04L67/1097

Abstract: Systems and methods of a memory allocation buffer to reduce heap fragmentation. In one embodiment, the memory allocation buffer structures a memory arena dedicated to a target region that is one of a plurality of regions in a server in a database cluster such as an HBase cluster. The memory arena has a chunk size (e.g., 2 MB) and an offset pointer. Data objects in write requests targeted to the region are received and inserted to the memory arena at a location specified by the offset pointer. When the memory arena is filled, a new one is allocated. When a MemStore of the target region is flushed, the entire memory arenas for the target region are freed up. This reduces heap fragmentation that is responsible for long and/or frequent garbage collection pauses.
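The arena scheme in this abstract can be sketched as a bump allocator. The class names and the `insert`/`flush` methods are hypothetical, and Python is used only to show the bookkeeping; it does not reproduce the garbage collection behavior of a JVM heap.

```python
CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB, the example chunk size from the abstract


class Arena:
    """One fixed-size chunk with an offset pointer marking the next free byte."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.buf = bytearray(chunk_size)
        self.offset = 0

    def try_alloc(self, data):
        """Copy data in at the offset pointer; return its start, or None if full."""
        end = self.offset + len(data)
        if end > len(self.buf):
            return None
        start = self.offset
        self.buf[start:end] = data
        self.offset = end
        return start


class RegionAllocator:
    """Arenas dedicated to a single target region."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.arenas = [Arena(chunk_size)]

    def insert(self, data):
        """Store data and return (arena index, offset within that arena)."""
        if len(data) > self.chunk_size:
            raise ValueError("object larger than one chunk")
        start = self.arenas[-1].try_alloc(data)
        if start is None:
            # The current arena is filled, so a new one is allocated.
            self.arenas.append(Arena(self.chunk_size))
            start = self.arenas[-1].try_alloc(data)
        return len(self.arenas) - 1, start

    def flush(self):
        """Release every arena of the region at once, as on a MemStore flush."""
        self.arenas = [Arena(self.chunk_size)]
```

Because objects for one region sit together in large chunks that are released as a unit, freed memory comes back in chunk-sized pieces rather than as many small holes.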

18.

Granted patent
Background format optimization for enhanced SQL-like queries in Hadoop (in force)

Publication number: US09477731B2

Publication date: 2016-10-25

Application number: US14043753

Application date: 2013-10-01

Applicant: Cloudera, Inc.

Inventor: Marcel Kornacker, Justin Erickson, Nong Li, Lenni Kuff, Henry Noel Robinson, Alan Choi, Alex Behm

IPC: G06F17/30

CPC classification number: G06F17/30569, G06F17/30283, G06F17/30448, G06F17/30463, G06F17/30545, G06F17/30595

Abstract: A format conversion engine for Apache Hadoop that converts data from its original format to a database-like format at certain time points for use by a low latency (LL) query engine. The format conversion engine comprises a daemon that is installed on each data node in a Hadoop cluster. The daemon comprises a scheduler and a converter. The scheduler determines when to perform the format conversion and notifies the converter when the time comes. The converter converts data on the data node from its original format to a database-like format for use by the low latency (LL) query engine.

19.

Patent application
MEMORY ALLOCATION BUFFER FOR REDUCTION OF HEAP FRAGMENTATION (in force)

Publication number: US20150378618A1

Publication date: 2015-12-31

Application number: US14846413

Application date: 2015-09-04

Applicant: Cloudera, Inc.

Inventor: Todd Lipcon

IPC: G06F3/06, H04L29/08

CPC classification number: G06F3/0611, G06F3/0631, G06F3/0644, G06F3/0652, G06F3/067, G06F3/0683, G06F12/0808, G06F12/12, G06F12/128, G06F17/30138, G06F2212/1044, H04L67/1097

Abstract: Systems and methods of a memory allocation buffer to reduce heap fragmentation. In one embodiment, the memory allocation buffer structures a memory arena dedicated to a target region that is one of a plurality of regions in a server in a database cluster such as an HBase cluster. The memory arena has a chunk size (e.g., 2 MB) and an offset pointer. Data objects in write requests targeted to the region are received and inserted to the memory arena at a location specified by the offset pointer. When the memory arena is filled, a new one is allocated. When a MemStore of the target region is flushed, the entire memory arenas for the target region are freed up. This reduces heap fragmentation that is responsible for long and/or frequent garbage collection pauses.

20.

Patent application
METHODS AND APPARATUS TO ORGANIZE AN OBJECT STORE NAMESPACE (in force)

Publication number: US20250117397A1

Publication date: 2025-04-10

Application number: US18535818

Application date: 2023-12-11

Applicant: Cloudera, Inc.

Inventor: Prashant Pogde, Siddharth Jivan Wagle, Uma Maheswara Rao Gangumalla, Arpit Ashok Agarwal

IPC: G06F16/27

Abstract: Disclosed examples create at least first and second database shards in a leader node, the leader node located in a consensus ring; and cause replication of first namespace metadata in the at least the first and second database shards of the leader node and in at least first and second database shards in a follower node, the follower node located in the consensus ring.
