METHODS AND SYSTEMS TO OPERATE ON GROUP-BY SETS WITH HIGH CARDINALITY
    Invention application (status: granted)

    Publication number: US20140330827A1

    Publication date: 2014-11-06

    Application number: US14270297

    Filing date: 2014-05-05

    CPC classification number: G06F17/30598 G06F17/30536 G06F17/30584

    Abstract: This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets, which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set.
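
The grouping step described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `GroupSummary`, the function `summarize`, the stored sums, and the sample rows are all assumptions made for illustration. It shows one object per group-by subset, keyed on two group-by variables, that stores the cardinality and from which several statistics can later be derived without rereading the data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class GroupSummary:
    """Hypothetical per-group summary object (names are illustrative)."""
    count: int = 0          # cardinality: number of entries in the group
    total: float = 0.0      # running sum of the analysis variable
    total_sq: float = 0.0   # running sum of squares

    def add(self, x: float) -> None:
        self.count += 1
        self.total += x
        self.total_sq += x * x

    def mean(self) -> float:
        return self.total / self.count

    def variance(self) -> float:
        # Population variance derived from the stored sums only
        m = self.mean()
        return self.total_sq / self.count - m * m


def summarize(rows, group_keys, value_key):
    """Single pass over the rows: build one summary object per group."""
    groups = defaultdict(GroupSummary)
    for row in rows:
        key = tuple(row[k] for k in group_keys)
        groups[key].add(row[value_key])
    return dict(groups)


# Made-up sample data with two group-by variables
rows = [
    {"region": "east", "product": "a", "sales": 10.0},
    {"region": "east", "product": "a", "sales": 14.0},
    {"region": "east", "product": "b", "sales": 7.0},
    {"region": "west", "product": "a", "sales": 3.0},
]
summaries = summarize(rows, ("region", "product"), "sales")
```

Here `summaries[("east", "a")]` holds the count, sum, and sum of squares for that subset, so the mean and variance can both be read from the same object.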

    Analytic system for fast quantile computation with improved memory consumption strategy

    Publication number: US10311128B2

    Publication date: 2019-06-04

    Application number: US16140931

    Filing date: 2018-09-25

    Abstract: A computing device computes a quantile value. A maximum value and a minimum value are computed for unsorted variable values to compute an upper bin value and a lower bin value for each bin of a plurality of bins. A frequency counter is computed for each bin by reading the unsorted variable values a second time. A bin number and a cumulative rank value are computed for a quantile. When an estimated memory usage value exceeds a predefined memory size constraint value, a subset of the plurality of bins are split into a plurality of bins, the frequency counter is recomputed for each bin, and the bin number and the cumulative rank value are recomputed. Frequency data is computed using the frequency counters. The quantile value is computed using the frequency data and the cumulative rank value for the quantile and output.

    Analytic system for fast quantile computation

    Publication number: US10127192B1

    Publication date: 2018-11-13

    Application number: US15961373

    Filing date: 2018-04-24

    Abstract: A computing device computes a quantile value. A maximum value and a minimum value are computed for unsorted variable values. An upper bin value and a lower bin value are computed for each bin of a plurality of bins using the maximum and minimum values. A frequency counter is computed for each bin by reading the unsorted variable values a second time. Each frequency counter is a count of the variable values within a respective bin. A bin number and a cumulative rank value are computed for a quantile. The bin number identifies a specific bin within which a quantile value associated with the quantile is located. The cumulative rank value identifies a cumulative rank for the quantile value associated with the quantile. Frequency data is computed using the frequency counters. The quantile value is computed using the frequency data and the cumulative rank value for the quantile and output.
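
The steps in this abstract can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the patented method: the function name `binned_quantile`, the equal-width bins, the rank definition `ceil(q * n)`, and the final step that sorts only the values in the selected bin are all choices made here for clarity.

```python
import math


def binned_quantile(values, q, num_bins=16):
    """Quantile of an unsorted list without sorting the whole list."""
    # Pass 1: minimum and maximum of the unsorted values
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / num_bins  # equal-width bins (an assumption)

    def bin_of(x):
        # Clamp so that the maximum value falls in the last bin
        return min(int((x - lo) / width), num_bins - 1)

    # Pass 2: one frequency counter per bin
    counts = [0] * num_bins
    for x in values:
        counts[bin_of(x)] += 1

    # 1-based rank of the quantile (assumed definition: ceil(q * n))
    rank = max(1, math.ceil(q * len(values)))

    # Find the bin holding that rank and the cumulative count before it
    cumulative = 0
    for b, c in enumerate(counts):
        if cumulative + c >= rank:
            break
        cumulative += c

    # Pass 3: sort only the values in the selected bin, then pick by rank
    in_bin = sorted(x for x in values if bin_of(x) == b)
    return in_bin[rank - cumulative - 1]


# Made-up sample data
data = [7, 1, 5, 3, 9, 2, 8, 4, 6, 10]
median = binned_quantile(data, 0.5, num_bins=4)
```

Only the values in one bin are ever sorted, which is the point of locating the bin number and cumulative rank first.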

    Methods and systems to operate on group-by sets with high cardinality

    Publication number: US09633104B2

    Publication date: 2017-04-25

    Application number: US14270297

    Filing date: 2014-05-05

    CPC classification number: G06F17/30598 G06F17/30536 G06F17/30584

    Abstract: This disclosure describes methods, systems, computer-readable media, and apparatuses for efficiently calculating group-by statistics. A data set that includes multiple entries is accessed. The multiple entries are grouped into group-by subsets, which are formed on two or more group-by variables and which are subsets of the data set. Cardinality data is determined for each of the group-by subsets, wherein cardinality data represents a number of entries in a group-by subset. At least one summary of data in each of the group-by subsets is generated, wherein each of the summaries includes the cardinality data determined for the group-by subset. Objects for the group-by subsets are initialized such that the objects store the summaries. The objects may then be used to generate multiple statistical summaries of the data set.

    OBJECT AND DATA POINT TRACKING TO CONTROL SYSTEM IN OPERATION

    Publication number: US20210004953A1

    Publication date: 2021-01-07

    Application number: US16863093

    Filing date: 2020-04-30

    Abstract: A computing system obtains image data capturing first and second objects. The system determines, based on user-identified data points, boundaries of the objects and generates a component of a dataset by computing a first data value related to an attribute of a key point in the first image; and computing a second data value related to an attribute of a key point in the first image. The system generates a second component of the dataset, the second component representing updated relative information between the first and second object by generating predicted changes in the first data value and second data value for the second image. The system computes a third data value and a fourth data value related to respective data points in a first and second polygon in the second image. Generation of the updated relative information is based on the predicted changes and computed values.

    ANALYTIC SYSTEM FOR FAST QUANTILE COMPUTATION WITH IMPROVED MEMORY CONSUMPTION STRATEGY

    Publication number: US20190129919A1

    Publication date: 2019-05-02

    Application number: US16140931

    Filing date: 2018-09-25

    Abstract: A computing device computes a quantile value. A maximum value and a minimum value are computed for unsorted variable values to compute an upper bin value and a lower bin value for each bin of a plurality of bins. A frequency counter is computed for each bin by reading the unsorted variable values a second time. A bin number and a cumulative rank value are computed for a quantile. When an estimated memory usage value exceeds a predefined memory size constraint value, a subset of the plurality of bins are split into a plurality of bins, the frequency counter is recomputed for each bin, and the bin number and the cumulative rank value are recomputed. Frequency data is computed using the frequency counters. The quantile value is computed using the frequency data and the cumulative rank value for the quantile and output.
