Distributed event prediction and machine learning object recognition system

    Publication number: US10127477B2

    Publication date: 2018-11-13

    Application number: US15686863

    Application date: 2017-08-25

    Inventor: Xu Chen, Tao Wang

    Abstract: A computing device predicts occurrence of an event or classifies an object using distributed unlabeled data. Supervised data that includes a labeled subset of a plurality of observation vectors is identified. A total number of threads that will perform labeling of an unlabeled subset of the plurality of observation vectors is determined. The identified supervised data is uploaded to each thread of the total number of threads. Unlabeled observation vectors are randomly selected from the unlabeled subset of the plurality of observation vectors to allocate to each thread of the total number of threads. The randomly selected, unlabeled observation vectors are uploaded to each thread of the total number of threads based on the allocation. The value of the target variable for each observation vector of the unlabeled subset of the plurality of observation vectors is determined based on a converged classification matrix and output to a labeled dataset.
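
The allocation step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the function names, the fixed seed, and the round-robin dealing of shuffled identifiers are all assumptions chosen for illustration. Each thread receives a full copy of the labeled data plus a random, non-overlapping share of the unlabeled observations.

```python
import random


def allocate_unlabeled(unlabeled_ids, num_threads, seed=0):
    """Randomly split unlabeled observation ids across threads (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(unlabeled_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled ids round-robin so share sizes differ by at most one.
    return [shuffled[t::num_threads] for t in range(num_threads)]


def build_thread_payloads(labeled, unlabeled_ids, num_threads, seed=0):
    """Pair a copy of the labeled data with each thread's random share (hypothetical helper)."""
    allocation = allocate_unlabeled(unlabeled_ids, num_threads, seed)
    return [{"labeled": dict(labeled), "unlabeled": part} for part in allocation]
```

The classification step itself (iterating to a converged classification matrix) is outside the scope of this sketch.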

    DISTRIBUTED DATA VARIABLE ANALYSIS AND HIERARCHICAL GROUPING SYSTEM

    Publication number: US20180300650A1

    Publication date: 2018-10-18

    Application number: US15876723

    Application date: 2018-01-22

    Abstract: A computing system provides analysis of data and grouping of variables in support of analytics. From a plurality of observation vectors read from a dataset, a number of observations having a non-missing value and a cardinality value are computed for each variable of the variables. For each variable of the variables, the cardinality ratio value is compared to a first policy parameter value, and the respective variable is identified as a nominal variable type or as an interval variable type based on the comparison. For each variable of the variables identified as the nominal variable type, the cardinality value of the respective variable is compared to a second policy parameter value, and the respective variable is identified as the high-cardinality nominal variable type or as a non-high-cardinality nominal variable type based on the comparison with the cardinality value. The identified variable type is output for each variable of the variables.
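
The two-stage comparison described here can be sketched as follows. The abstract does not define the cardinality ratio, the threshold values, or the direction of the comparisons, so this sketch assumes the ratio is the number of distinct values divided by the number of non-missing observations, that a ratio above the first threshold indicates an interval variable, and that both threshold defaults are arbitrary placeholders.

```python
def classify_variable(values, ratio_threshold=0.25, high_cardinality_threshold=50):
    """Label one variable's type from its values; None marks a missing value."""
    present = [v for v in values if v is not None]
    non_missing = len(present)
    cardinality = len(set(present))
    # Assumed definition: distinct values per non-missing observation.
    ratio = cardinality / non_missing if non_missing else 0.0

    # First policy parameter: separates interval from nominal variables.
    if ratio > ratio_threshold:
        return "interval"
    # Second policy parameter: splits nominal variables by raw cardinality.
    if cardinality > high_cardinality_threshold:
        return "high-cardinality nominal"
    return "nominal"
```

A variable with few distinct values relative to its observations is labeled nominal, and a nominal variable with many distinct levels is flagged as high-cardinality.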

    Techniques for generating a clustered representation of a network based on node data

    Publication number: US10095693B2

    Publication date: 2018-10-09

    Application number: US14709601

    Application date: 2015-05-12

    Abstract: An apparatus includes a communications component to receive a specified variable and one or more specified criteria to select a final clustered representation of a network, the specified criteria including a maximum degree of loss of information for the specified variable for the final clustered representation; and an iterative collapse component to perform iteration(s) of deriving the final clustered representation. Each iteration includes calculating the degree of loss from each possible combination of two linked nodes of a current clustered representation to generate a next clustered representation; selecting the possible combination associated with a smallest degree of loss; determining whether to cease iterations based on whether the smallest degree associated with the selected combination exceeds the maximum degree; effecting the selected combination if the smallest degree doesn't exceed the maximum degree; and selecting the current clustered representation as the final clustered representation if the smallest degree exceeds the maximum degree.
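
The iteration described above can be sketched as a greedy merge loop. The abstract does not say how the degree of loss is measured, so this sketch assumes it is the increase in within-cluster squared error of the specified variable when two linked clusters merge. The function names and the representation of the network as a node-to-value dict plus an edge list are also assumptions.

```python
def merge_loss(a, b):
    """Assumed loss: increase in within-cluster squared error if a and b merge."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    return na * nb / (na + nb) * (mean_a - mean_b) ** 2


def collapse(values, edges, max_loss):
    """Greedily merge linked clusters until the cheapest merge exceeds max_loss."""
    clusters = {n: [v] for n, v in values.items()}  # cluster id -> variable values
    members = {n: {n} for n in values}              # cluster id -> original nodes
    links = {tuple(sorted(e)) for e in edges}
    while links:
        # Evaluate every possible combination of two linked clusters.
        loss, (a, b) = min((merge_loss(clusters[x], clusters[y]), (x, y)) for x, y in links)
        if loss > max_loss:
            break  # the current representation becomes the final one
        clusters[a] += clusters.pop(b)
        members[a] |= members.pop(b)
        # Redirect links that touched b to the merged cluster a; drop self-links.
        relinked = {tuple(sorted((a if x == b else x, a if y == b else y))) for x, y in links}
        links = {(x, y) for x, y in relinked if x != y}
    return sorted(sorted(m) for m in members.values())
```

For a four-node chain where nodes 1 and 2 carry similar values and nodes 3 and 4 carry similar values, a small `max_loss` yields two clusters, since merging across the gap would exceed the allowed loss.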

    Stable data-processing in a distributed computing environment

    Publication number: US10044505B2

    Publication date: 2018-08-07

    Application number: US15677683

    Application date: 2017-08-15

    Inventor: Gang Meng

    Abstract: A node in a distributed computing environment can generate key-value pairs. The node can categorize the key-value pairs into bins, with each key-value pair being categorized into a bin spanning a range of hashed keys that includes a hashed key of the key-value pair. The node can determine nodes in the distributed computing environment that are mapped to the bins. The node can distribute each key-value pair to a node corresponding to a bin into which the key-value pair was categorized. The node can then sort any of the key-value pairs maintained on the node by hashed key or key to generate sorted key-value pairs. The node can assign index values to the sorted key-value pairs. The indexed key-value pairs may be the same each time the above process is run, regardless of the underlying topology of the distributed computing environment. This can result in stable data-processing.
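
A single-process simulation can show why binning by hashed key gives topology-independent indices. Everything here is an assumption made for illustration: the SHA-256-based hash truncated to 64 bits, the fixed bin count, the round-robin mapping of bins to nodes, and the function names. The point is that the bins and the sort order depend only on the keys, so the node count affects placement but not the resulting index.

```python
import hashlib


def hashed_key(key):
    """Map a string key to a 64-bit integer (assumed hash choice)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def stable_index(pairs, num_nodes, num_bins=8):
    """Assign each key an index that does not depend on num_nodes or input order."""
    node_bins = {n: {} for n in range(num_nodes)}
    for key, value in pairs:
        h = hashed_key(key)
        b = (h * num_bins) >> 64      # bin covering this range of hashed keys
        node = b % num_nodes          # assumed bin-to-node mapping
        node_bins[node].setdefault(b, []).append((h, key, value))

    # Each node sorts the pairs it holds, per bin, by hashed key and then key.
    bin_sizes = [0] * num_bins
    for bins in node_bins.values():
        for b, items in bins.items():
            items.sort(key=lambda item: item[:2])
            bin_sizes[b] = len(items)

    # The index is the bin's global offset plus the position within the bin.
    offsets, total = [], 0
    for size in bin_sizes:
        offsets.append(total)
        total += size
    index = {}
    for bins in node_bins.values():
        for b, items in bins.items():
            for pos, (_, key, _value) in enumerate(items):
                index[key] = offsets[b] + pos
    return index
```

Running the function with the same pairs but different values of `num_nodes`, or with the pairs in a different order, returns the same key-to-index mapping.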

    ANALYTIC SYSTEM FOR FAST QUANTILE REGRESSION COMPUTATION

    Publication number: US20180181541A1

    Publication date: 2018-06-28

    Application number: US15849870

    Application date: 2017-12-21

    Inventor: Yonggang Yao

    CPC classification number: G06F17/11 G06F17/18

    Abstract: A computing device computes a plurality of quantile regression solvers for a dataset at a plurality of quantile levels. Each observation vector includes an explanatory vector of a plurality of explanatory variable values and a response variable value. The read dataset is recursively divided into subsets of the plurality of observation vectors, a lower counterweight vector and an upper counterweight vector are computed for each of the subsets, and a quantile regression solver is fit to each of the subsets using the associated, computed lower counterweight vector and the associated, computed upper counterweight vector to describe a quantile function of the response variable values for a selected quantile level of the identified plurality of quantile level values. For each selected quantile level, a parameter estimate vector and a dual solution vector that describe the quantile function are output in association with the selected quantile level.

    AUTOMATED TRANSFER OF OBJECTS AMONG FEDERATED AREAS

    Publication number: US20180181445A1

    Publication date: 2018-06-28

    Application number: US15896727

    Application date: 2018-02-14

    CPC classification number: G06F9/5083 G06F17/30949 G06F17/30985

    Abstract: An apparatus includes a processor to: receive, from a first remote device, a request to perform at least one iteration of a first job flow at least partly within a first federated area, wherein access to the first federated area is granted to the first remote device and not a second remote device, access to a second federated area is granted to the second remote device and not the first remote device, and a transfer area is maintained to transfer an object between the first and second federated areas; perform the at least one iteration of the first job flow; and analyze an output object generated in each iteration to determine whether a condition has been met to transfer an object from the first federated area to the transfer area to enable its transfer to the second federated area to enable its use in a second job flow.
