Data structure supporting contingency table generation

    Publication number: US09940343B2

    Publication date: 2018-04-10

    Application number: US14571682

    Filing date: 2014-12-16

    Abstract: A method of converting data to tree data is provided. A first node memory structure that includes a first value indicator, a first counter value, and a first observation indicator is initialized for a first variable. The first value indicator is initialized with a first value of the first variable selected from first observation data, and the first observation indicator is initialized with a first indicator that indicates the first observation data. The first value of the first variable is compared to a second value of the first variable. The first counter value included in the first node memory structure is incremented when the first value of the first variable matches the second value of the first variable. Corresponding values of second observation data are compared to the identified values from first observation data when the first value of the first variable matches the second value of the first variable. A next observation is read from the data when the identified values match the corresponding values. The tree data is output after a last observation of the data is processed.
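
The node structure described in this abstract (value indicator, counter, observation indicator) can be illustrated with a small prefix tree. This is a minimal sketch under stated assumptions, not the patented method: the names `Node`, `build_tree`, and `contingency` are hypothetical, and a dictionary lookup stands in for the value-by-value comparison against earlier observations that the abstract describes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    value: object            # value indicator
    count: int = 0           # counter value
    first_obs: int = -1      # observation indicator: index of the observation that created the node
    children: dict = field(default_factory=dict)


def build_tree(rows):
    """Read observations one at a time and fold them into a tree of value counts."""
    root = Node(value=None)
    for i, row in enumerate(rows):
        node = root
        node.count += 1
        for v in row:
            child = node.children.get(v)
            if child is None:
                # First time this value is seen on this path: initialize a new node.
                child = Node(value=v, first_obs=i)
                node.children[v] = child
            # The value matches an existing node (or was just created): increment its counter.
            child.count += 1
            node = child
    return root


def contingency(root):
    """Flatten the tree into {(value_1, value_2, ...): count} cells."""
    table = {}

    def walk(node, path):
        if not node.children:
            if path:
                table[path] = node.count
            return
        for v, child in node.children.items():
            walk(child, path + (v,))

    walk(root, ())
    return table


rows = [("a", "x"), ("a", "y"), ("a", "x"), ("b", "x")]
tree = build_tree(rows)
cells = contingency(tree)
```

Each leaf path in `cells` corresponds to one cell of a contingency table over the variables, in column order.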

    2. Probabilistic cluster assignment
    Granted invention patent; status: in force

    Publication number: US09367602B2

    Publication date: 2016-06-14

    Application number: US14924893

    Filing date: 2015-10-28

    Abstract: A computing device to assign observations to clusters based on a statistical probability is provided. A first cluster assignment is defined by assigning the plurality of observations to a first set of clusters. A second cluster assignment is defined by assigning the plurality of observations to a second set of clusters. A set of composite clusters is defined based on the defined first set of clusters and the defined second set of clusters. For each observation, a statistical probability value for assigning an observation to each composite cluster of the defined set of composite clusters is computed based on the first and second cluster assignments and a composite cluster assignment is defined by assigning the observation to a cluster of the set of composite clusters based on the computed statistical probability value. The defined composite cluster assignment is stored.
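
The abstract does not say how the probability is computed, so the following is only an illustrative sketch. It assumes that each base cluster from either run has already been mapped to a composite cluster (the hypothetical `to_composite` dictionary) and takes the probability of a composite cluster to be the fraction of an observation's two base assignments that map to it. The function name `assign_composite` is also hypothetical.

```python
from collections import Counter


def assign_composite(first, second, to_composite):
    """Assign each observation to a composite cluster by vote fraction.

    first, second: base cluster labels per observation from two clustering runs.
    to_composite:  maps ("first", label) or ("second", label) to a composite cluster id
                   (an assumed input; the abstract does not define this mapping).
    """
    result = []
    for a, b in zip(first, second):
        votes = Counter([to_composite[("first", a)], to_composite[("second", b)]])
        total = sum(votes.values())
        probs = {c: n / total for c, n in votes.items()}
        # Pick the most probable composite cluster; ties go to the smallest id.
        best = max(sorted(probs), key=probs.get)
        result.append((best, probs))
    return result


mapping = {("first", 0): "A", ("first", 1): "B", ("second", 0): "A"}
assignments = assign_composite([0, 1], [0, 0], mapping)
```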

    3. DATA STRUCTURE SUPPORTING CONTINGENCY TABLE GENERATION
    Invention application; status: in force

    Publication number: US20150324403A1

    Publication date: 2015-11-12

    Application number: US14571682

    Filing date: 2014-12-16

    Abstract: A method of converting data to tree data is provided. A first node memory structure that includes a first value indicator, a first counter value, and a first observation indicator is initialized for a first variable. The first value indicator is initialized with a first value of the first variable selected from first observation data, and the first observation indicator is initialized with a first indicator that indicates the first observation data. The first value of the first variable is compared to a second value of the first variable. The first counter value included in the first node memory structure is incremented when the first value of the first variable matches the second value of the first variable. Corresponding values of second observation data are compared to the identified values from first observation data when the first value of the first variable matches the second value of the first variable. A next observation is read from the data when the identified values match the corresponding values. The tree data is output after a last observation of the data is processed.

    5. DETERMINATION OF COMPOSITE CLUSTERS
    Invention application; status: in force

    Publication number: US20160048578A1

    Publication date: 2016-02-18

    Application number: US14924848

    Filing date: 2015-10-28

    Abstract: A computing device to compute composite clusters is provided. A first and a second plurality of centroid locations are computed by executing a clustering algorithm with a first portion of data and a first input parameter and a second portion of the data and a second input parameter, respectively. The first portion is different from the second portion or the first input parameter is different from the second input parameter. A plurality of composite centroid locations is computed using the computed first and second plurality of centroid locations to define a composite set of clusters. An observation is selected. A cluster of the composite set of clusters to which to assign the observation is determined using the plurality of composite centroid locations. The selecting and the determining is repeated with each observation of the plurality of observations as the observation to define cluster assignments for the plurality of observations.
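
As a rough sketch of the last two steps in this abstract, the code below combines two sets of centroids into composite centroids and assigns each observation to the nearest one. The merge rule is an assumption, since the abstract does not specify one: centroids within a hypothetical `radius` of a group's running mean are averaged together. The two centroid lists are taken as given rather than produced by a clustering run, and `merge_centroids` and `assign` are hypothetical names.

```python
import math


def _mean(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(x) / len(points) for x in zip(*points))


def merge_centroids(first, second, radius):
    """Greedily group centroids from both runs and average each group (assumed rule)."""
    groups = []
    for c in list(first) + list(second):
        for g in groups:
            if math.dist(_mean(g), c) <= radius:
                g.append(c)
                break
        else:
            groups.append([c])
    return [_mean(g) for g in groups]


def assign(observations, centroids):
    """Index of the nearest composite centroid for each observation."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(obs, centroids[k]))
        for obs in observations
    ]


first_run = [(0.0, 0.0), (10.0, 10.0)]
second_run = [(1.0, 0.0), (10.0, 12.0)]
composite = merge_centroids(first_run, second_run, radius=3.0)
labels = assign([(0.0, 1.0), (9.0, 9.0)], composite)
```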

    6. TIME SERIES CLUSTERING
    Invention application; status: under examination (published)

    Publication number: US20150269241A1

    Publication date: 2015-09-24

    Application number: US14482726

    Filing date: 2014-09-10

    CPC classification number: G06F16/285

    Abstract: A method of transforming time series data to cluster data is provided. Time series data including a plurality of time series is received. A distance between a first time series of the plurality of time series and each of a remaining set of time series of the plurality of time series is computed pairwise between each of the remaining set of time series of the plurality of time series and the first time series. The computed values of the distance are sorted in increasing value. Gap width values are computed as a difference between successive pairs of the sorted, computed values. Whether a cluster including the received time series data is uniform is determined based on the computed gap width values. Cluster data including the first time series and the remaining set of time series assigned to the cluster is output when the cluster is determined to be uniform.
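
The gap-width computation in this abstract can be sketched in a few lines. Two details are assumptions: Euclidean distance between equal-length series is used as the distance measure, and the uniformity test (every gap at or below a hypothetical `max_gap`) stands in for whatever criterion the patent applies. The names `gap_widths` and `is_uniform` are hypothetical.

```python
import math


def gap_widths(series):
    """Distances from the first series to the rest, sorted, then successive differences."""
    ref, rest = series[0], series[1:]
    dists = sorted(math.dist(ref, s) for s in rest)
    return [b - a for a, b in zip(dists, dists[1:])]


def is_uniform(series, max_gap):
    """Treat the cluster as uniform when no gap exceeds max_gap (assumed criterion)."""
    return all(g <= max_gap for g in gap_widths(series))


series = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
gaps = gap_widths(series)
```

A large gap in the sorted distances, such as the last one here, suggests that the series beyond it belong to a different cluster.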

    7. Cluster computation using random subsets of variables
    Granted invention patent; status: in force

    Publication number: US09495414B2

    Publication date: 2016-11-15

    Application number: US14924810

    Filing date: 2015-10-28

    Abstract: A computing device to compute clusters using random subsets of variables is provided. Each data point of a plurality of data points is associated with a variable to define a plurality of variables. A subset of the plurality of variables is randomly selected. The subset does not include all of the plurality of variables. A number of clusters into which to segment the received data is determined. Cluster data that defines each cluster of the determined number of clusters is determined by executing a clustering algorithm with the received data using only the plurality of data points defined for each observation that are associated with the randomly selected subset of the plurality of variables. The determined cluster data is stored to cluster second data into the determined number of clusters. The second data is different from the received data.
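
The variable-selection step in this abstract can be sketched as follows. The code only covers choosing a random proper subset of variables and restricting each observation to those variables; the clustering algorithm that would then run on the projected data is left out. The names `random_variable_subset` and `project` and the `seed` parameter are hypothetical.

```python
import random


def random_variable_subset(n_vars, size, seed=None):
    """Randomly choose `size` variable indices; the subset must not include every variable."""
    if not 0 < size < n_vars:
        raise ValueError("size must be at least 1 and smaller than n_vars")
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_vars), size))


def project(rows, columns):
    """Keep only the data points associated with the selected variables."""
    return [tuple(row[c] for c in columns) for row in rows]


columns = random_variable_subset(n_vars=5, size=3, seed=0)
reduced = project([(1, 2, 3, 4, 5), (6, 7, 8, 9, 10)], columns)
```

The same `columns` would need to be stored with the cluster data so that second data can be projected identically before it is clustered.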

    8. Graph based selection of decorrelated variables
    Granted invention patent; status: in force

    Publication number: US09489621B2

    Publication date: 2016-11-08

    Application number: US14928177

    Filing date: 2015-10-30

    Abstract: A computing device to select decorrelated variables using a graph based method is provided. A correlation value is computed between each pair of a plurality of variables to define a correlation matrix. A binary threshold value is compared to each correlation value to define a binary similarity matrix from the correlation matrix. An undirected graph comprising a subgraph that includes one or more connected nodes is defined based on the binary similarity matrix to store connectivity information for the plurality of variables. Each node of the subgraph is pairwise associated with a unique variable of the variables. (a) A least connected node is selected from the undirected graph based on the connectivity information. (b) The selected least connected node is removed from the undirected graph. (c) The connectivity information for the undirected graph is updated based on the removed node. (d) (a)-(c) are repeated until a stop criterion is satisfied.
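
Steps (a) through (d) of this abstract can be sketched with an adjacency map built from a thresholded correlation matrix. Several points are assumptions rather than statements from the patent: the threshold is applied to the absolute correlation, the removed nodes are treated as the selected variables, ties are broken by the lowest index, and the stop criterion is a hypothetical target count `n_select`. The function name `select_decorrelated` is also hypothetical.

```python
def select_decorrelated(corr, threshold, n_select):
    """Repeatedly pick and remove the least connected variable.

    corr: symmetric correlation matrix as a list of lists.
    """
    n = len(corr)
    # Binary similarity: an edge joins two variables whose |correlation| meets the threshold.
    adj = {
        i: {j for j in range(n) if j != i and abs(corr[i][j]) >= threshold}
        for i in range(n)
    }
    selected = []
    while adj and len(selected) < n_select:
        # (a) select the least connected node (lowest degree, then lowest index)
        node = min(adj, key=lambda i: (len(adj[i]), i))
        selected.append(node)
        # (b) remove it and (c) update its neighbours' connectivity
        for neighbour in adj.pop(node):
            adj[neighbour].discard(node)
    return selected


corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.85, 0.2],
    [0.8, 0.85, 1.0, 0.0],
    [0.1, 0.2, 0.0, 1.0],
]
chosen = select_decorrelated(corr, threshold=0.7, n_select=2)
```

Variable 3 is picked first because it has no edges; the remaining three are mutually connected, so the tie goes to the lowest index.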

    9. Determination of composite clusters
    Granted invention patent; status: in force

    Publication number: US09471869B2

    Publication date: 2016-10-18

    Application number: US14924848

    Filing date: 2015-10-28

    Abstract: A computing device to compute composite clusters is provided. A first and a second plurality of centroid locations are computed by executing a clustering algorithm with a first portion of data and a first input parameter and a second portion of the data and a second input parameter, respectively. The first portion is different from the second portion or the first input parameter is different from the second input parameter. A plurality of composite centroid locations is computed using the computed first and second plurality of centroid locations to define a composite set of clusters. An observation is selected. A cluster of the composite set of clusters to which to assign the observation is determined using the plurality of composite centroid locations. The selecting and the determining is repeated with each observation of the plurality of observations as the observation to define cluster assignments for the plurality of observations.

    10. INCREMENTAL RESPONSE MODELING
    Invention application; status: under examination (published)

    Publication number: US20140372090A1

    Publication date: 2014-12-18

    Application number: US14199409

    Filing date: 2014-03-06

    CPC classification number: G06Q30/0242 G06N20/00 G06Q30/0254

    Abstract: A method of selecting a one-class support vector machine (SVM) model for incremental response modeling is provided. Exposure group data generated from first responses by an exposure group receiving a request to respond is received. Control group data generated from second responses by a control group not receiving the request to respond is received. A response is either positive or negative. A one-class SVM model is defined using the positive responses in the control group data and an upper bound parameter value. The defined one-class SVM model is executed with the identified positive responses from the exposure group data. An error value is determined based on execution of the defined one-class SVM model. A final one-class SVM model is selected by validating the defined one-class SVM model using the determined error value.
