-
Publication number: US09990399B2
Publication date: 2018-06-05
Application number: US15154727
Filing date: 2016-05-13
Applicant: Cloudera, Inc.
Inventor: Marcel Kornacker, Justin Erickson, Nong Li, Lenni Kuff, Henry Noel Robinson, Alan Choi, Alex Behm
IPC: G06F17/30
CPC classification number: G06F17/30466, G06F17/30442, G06F17/30451, G06F17/30463, G06F17/30545, G06F17/30569
Abstract: A low latency query engine for APACHE HADOOP™ that provides real-time or near real-time, ad hoc query capability, while completing batch-processing of MapReduce. In one embodiment, the low latency query engine comprises a daemon that is installed on data nodes in a HADOOP™ cluster for handling query requests and all internal requests related to query execution. In a further embodiment, the low latency query engine comprises a daemon for providing name service and metadata distribution. The low latency query engine receives a query request via client, turns the request into collections of plan fragments and coordinates parallel and optimized execution of the plan fragments on remote daemons to generate results at a much faster speed than existing batch-oriented processing frameworks.
-
Publication number: US09977826B2
Publication date: 2018-05-22
Application number: US14918605
Filing date: 2015-10-21
Applicant: Cloudera, Inc.
Inventor: Micha Gorelick, Hilary Mason, Grant Custer
CPC classification number: G06F17/30684, G06F17/248, G06F17/2755, G06F17/278, G06F17/2881, G06F17/30241
Abstract: A computerized method for generating and evaluating natural language-generated text involves receiving, in a computer, data input by a user, generating, using a natural language generation technique, multiple instances of text stories based upon both contents of a corpus and the received data; analyzing the multiple instances of text stories as a weighted combination of computed geographic scores, distance scores, information content scores, replacement scores and extra aspect scores, providing a ranked set of the generated text stories to a user, receiving a selection of one of the text stories in the ranked set, and storing the selected story.
-
Publication number: US20180107899A1
Publication date: 2018-04-19
Application number: US15293679
Filing date: 2016-10-14
Applicant: Cloudera, Inc.
Inventor: Micha Gorelick, Hilary Mason, Grant Custer
CPC classification number: G06K9/6219, G06F17/277, G06F17/2818, G06F17/289, G06F17/3002, G06F17/30244, G06F17/30268, G06F17/3028, G06F17/30598, G06F17/30684, G06F17/30946, G06K9/6201, G06K9/6215, G06K9/6277, G06K9/6296, G06K9/72
Abstract: An image processing system involves a camera, at least one processor associated with the camera, non-transitory storage, a lexical database of terms and image classification software. The image processing system uses the image classification software to assign hyponyms and associated probabilities to an image and then builds a subset hierarchical tree of hypernyms from the lexical database of terms. The processor then scores the hypernyms and identifies at least one hypernym for the image that has a score that is calculated to have a value that is greater than one of: a pre-specified threshold score, or all other calculated level scores within the subset hierarchical tree. The associated methods are also disclosed.
-
Publication number: US09934382B2
Publication date: 2018-04-03
Application number: US14526372
Filing date: 2014-10-28
Applicant: Cloudera, Inc.
Inventor: Eduardo Garcia
CPC classification number: G06F21/572, G06F9/4406, G06F9/45558, G06F21/575, G06F2009/45575, G06F2221/033, G06F2221/2107, G06F2221/2115
Abstract: Embodiments of the present disclosure include systems and methods for encrypting a virtual machine image and accessing an encrypted virtual machine image. According to some embodiments, an encryption module can encrypt a virtual machine image and place an encryption boot loader. The encryption boot loader may be extracted from the encrypted virtual machine image, be transmitted to, and stored at a key storage system. Upon a request to boot an operating system associated with the encrypted virtual machine image, a pre-boot execution environment may communicate with an image service to retrieve the encryption boot loader from the remote key storage system. The virtual machine image may therefore be decrypted using the encryption boot loader, which may allow booting of the operating system.
-
Publication number: US20170132296A1
Publication date: 2017-05-11
Application number: US15345375
Filing date: 2016-11-07
Applicant: Cloudera, Inc.
Inventor: Yihua Ding
IPC: G06F17/30
CPC classification number: G06F17/30554
Abstract: Techniques are described for analyzing usage of data stored in a data storage system without accessing the stored data. In some embodiments, workload data indicative of queries executed at the data storage system on stored data is received. This workload data can include query logs generated during execution of the queries. The workload data is processed to identify data elements such as tables, columns, and views associated with the stored data as well as information regarding usage of the identified data elements. Usage can include operations performed on the data elements during execution of the queries. Based on this processing, relationships between the identified data elements can be inferred and visualizations generated that convey information regarding usage of the data stored at the data storage system. Visualizations can include, among others, usage heatmap diagrams, join diagrams, column family diagrams, filter diagrams, view lineage diagrams, data flow diagrams, denormalization diagrams, and workload distribution diagrams.
-
Publication number: US20170132283A1
Publication date: 2017-05-11
Application number: US15154727
Filing date: 2016-05-13
Applicant: Cloudera, Inc.
Inventor: Marcel Kornacker, Justin Erickson, Nong Li, Lenni Kuff, Henry Noel Robinson, Alan Choi, Alex Behm
IPC: G06F17/30
CPC classification number: G06F17/30466, G06F17/30442, G06F17/30451, G06F17/30463, G06F17/30545, G06F17/30569
Abstract: A low latency query engine for APACHE HADOOP™ that provides real-time or near real-time, ad hoc query capability, while completing batch-processing of MapReduce. In one embodiment, the low latency query engine comprises a daemon that is installed on data nodes in a HADOOP™ cluster for handling query requests and all internal requests related to query execution. In a further embodiment, the low latency query engine comprises a daemon for providing name service and metadata distribution. The low latency query engine receives a query request via client, turns the request into collections of plan fragments and coordinates parallel and optimized execution of the plan fragments on remote daemons to generate results at a much faster speed than existing batch-oriented processing frameworks.
-
Publication number: US20160350535A1
Publication date: 2016-12-01
Application number: US14526372
Filing date: 2014-10-28
Applicant: Cloudera, Inc.
Inventor: Eduardo Garcia
CPC classification number: G06F21/572, G06F9/4406, G06F9/45558, G06F21/575, G06F2009/45575, G06F2221/033, G06F2221/2107, G06F2221/2115
Abstract: Embodiments of the present disclosure include systems and methods for encrypting a virtual machine image and accessing an encrypted virtual machine image. According to some embodiments, an encryption module can encrypt a virtual machine image and place an encryption boot loader. The encryption boot loader may be extracted from the encrypted virtual machine image, be transmitted to, and stored at a key storage system. Upon a request to boot an operating system associated with the encrypted virtual machine image, a pre-boot execution environment may communicate with an image service to retrieve the encryption boot loader from the remote key storage system. The virtual machine image may therefore be decrypted using the encryption boot loader, which may allow booting of the operating system.
-
Publication number: US20160226968A1
Publication date: 2016-08-04
Application number: US15098198
Filing date: 2016-04-13
Applicant: Cloudera, Inc.
Inventor: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
CPC classification number: G06F17/30563, G06F11/2023, G06F11/3476, H04L41/12, H04L67/1042, H04L67/1087
Abstract: Methods for configuring a system to collect and aggregate datasets are disclosed. One embodiment includes, identifying a data source in the system from where dataset is to be collected, configuring a machine in the system that generates the dataset to be collected, to send the dataset to the data source, identifying an arrival location where the dataset that is collected is to be aggregated or written, and/or configuring an agent node by specifying a source for the agent node as the data source in the system and specifying a sink for the agent node as the arrival location.
-
Publication number: US20150317231A1
Publication date: 2015-11-05
Application number: US14796812
Filing date: 2015-07-10
Applicant: Cloudera, Inc.
Inventor: Jonathan Ming-Cyn Hsieh, Henry Noel Robinson
CPC classification number: G06F17/30371, G06F11/3476, G06F11/3495, G06F17/30575, G06F2201/875, H04L41/046, H04L41/069, H04L67/125
Abstract: Systems and methods of collecting and aggregating log data with fault tolerance are disclosed. One embodiment includes, one or more devices that generate log data, the one or more machines each associated with an agent node to collect the log data, wherein, the agent node generates a batch comprising multiple messages from the log data and assigns a tag to the batch. In one embodiment, the agent node further computes a checksum for the batch of multiple messages. The system may further include a collector device, the collector device being associated with a collector tier having a collector node to which the agent sends the log data; wherein, the collector determines the checksum for the batch of multiple messages received from the agent node.
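The batching step in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not name a checksum algorithm or a tag format, so CRC-32 (via `zlib.crc32`), the random hex tag, and the names `make_batch` and `verify_batch` are all assumptions chosen for illustration.

```python
import uuid
import zlib


def batch_checksum(messages):
    """CRC-32 over all messages (assumed algorithm; the abstract names none).

    Each message is length-prefixed so that moving bytes between adjacent
    messages changes the checksum.
    """
    crc = 0
    for msg in messages:
        crc = zlib.crc32(len(msg).to_bytes(4, "big") + msg, crc)
    return crc


def make_batch(messages, tag=None):
    """Agent side: group messages into a batch, assign a tag, compute a checksum."""
    msgs = list(messages)
    return {
        "tag": tag if tag is not None else uuid.uuid4().hex,  # hypothetical tag format
        "messages": msgs,
        "checksum": batch_checksum(msgs),
    }


def verify_batch(batch):
    """Collector side: recompute the checksum and compare with the agent's value."""
    return batch_checksum(batch["messages"]) == batch["checksum"]
```

In this sketch a collector would call `verify_batch` on each received batch and could use the tag to report which batches arrived intact; that acknowledgement flow is likewise an assumption rather than something the abstract specifies.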
-
Publication number: US09128949B2
Publication date: 2015-09-08
Application number: US13745461
Filing date: 2013-01-18
Applicant: Cloudera, Inc.
Inventor: Todd Lipcon
IPC: G06F17/30
CPC classification number: G06F3/0611, G06F3/0631, G06F3/0644, G06F3/0652, G06F3/067, G06F3/0683, G06F12/0808, G06F12/12, G06F12/128, G06F17/30138, G06F2212/1044, H04L67/1097
Abstract: Systems and methods of a memory allocation buffer to reduce heap fragmentation. In one embodiment, the memory allocation buffer structures a memory arena dedicated to a target region that is one of a plurality of regions in a server in a database cluster such as an HBase cluster. The memory arena has a chunk size (e.g., 2 MB) and an offset pointer. Data objects in write requests targeted to the region are received and inserted to the memory arena at a location specified by the offset pointer. When the memory arena is filled, a new one is allocated. When a MemStore of the target region is flushed, the entire memory arenas for the target region are freed up. This reduces heap fragmentation that is responsible for long and/or frequent garbage collection pauses.
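The allocation scheme in this abstract can be sketched roughly as follows. This is an illustrative Python model, not HBase code: the class names `Arena` and `RegionAllocator` and their methods are hypothetical, and Python's `bytearray` stands in for the JVM byte arrays an actual implementation would manage.

```python
class Arena:
    """One fixed-size chunk with an offset ("bump") pointer."""

    def __init__(self, chunk_size):
        self.buf = bytearray(chunk_size)
        self.offset = 0  # next free position in the chunk

    def try_alloc(self, data):
        """Copy data at the offset pointer; return its start, or None if it does not fit."""
        end = self.offset + len(data)
        if end > len(self.buf):
            return None
        start = self.offset
        self.buf[start:end] = data
        self.offset = end
        return start


class RegionAllocator:
    """Arenas dedicated to a single target region (hypothetical name)."""

    def __init__(self, chunk_size=2 * 1024 * 1024):  # 2 MB, the abstract's example size
        self.chunk_size = chunk_size
        self.arenas = [Arena(chunk_size)]

    def insert(self, data):
        """Store data in the current arena, allocating a new arena when it is full."""
        if len(data) > self.chunk_size:
            raise ValueError("object larger than one chunk")
        start = self.arenas[-1].try_alloc(data)
        if start is None:
            self.arenas.append(Arena(self.chunk_size))
            start = self.arenas[-1].try_alloc(data)
        return len(self.arenas) - 1, start  # (arena index, offset within arena)

    def flush(self):
        """On a MemStore flush, drop every arena of the region at once."""
        self.arenas = [Arena(self.chunk_size)]
```

Because each region's objects live in a few large contiguous chunks that are released together, the heap is left with large free blocks instead of many small holes, which is the fragmentation effect the abstract targets.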
-