Semi-supervised classification system

    Publication No.: US11200514B1

    Publication date: 2021-12-14

    Application No.: US17342825

    Filing date: 2021-06-09

    Inventor: Xu Chen, Xinmin Wu

    Abstract: Unclassified observations are classified. Similarity values are computed for each unclassified observation and for each target variable value. A confidence value is computed for each unclassified observation using the similarity values. A high-confidence threshold value and a low-confidence threshold value are computed from the confidence values. For each observation, when the confidence value is greater than the high-confidence threshold value, the observation is added to a training dataset and, when the confidence value is greater than the low-confidence threshold value and less than the high-confidence threshold value, the observation is added to the training dataset based on a comparison between a random value drawn from a uniform distribution and an inclusion percentage value. A classification model is trained with the training dataset and classified observations. The trained classification model is executed with the unclassified observations to determine a label assignment.
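
    The inclusion rule in this abstract can be illustrated with a minimal sketch. The function name, the parameter names, and the direction of the random comparison (include when the uniform draw is below the inclusion percentage) are assumptions made for illustration; the abstract does not state how the two thresholds are derived from the confidence values, so they are passed in here.

```python
import random


def select_for_training(confidences, high, low, inclusion_pct, rng=None):
    """Return indices of observations to add to the training dataset.

    Hypothetical helper: thresholds are supplied by the caller, and the
    comparison direction for the random draw is an assumption.
    """
    rng = rng or random.Random()
    selected = []
    for index, confidence in enumerate(confidences):
        if confidence > high:
            # High-confidence observations are always included.
            selected.append(index)
        elif low < confidence < high:
            # Mid-confidence observations are included at random,
            # using a draw from a uniform distribution on [0, 1).
            if rng.random() < inclusion_pct:
                selected.append(index)
        # Observations at or below the low threshold are left out.
    return selected
```

    With inclusion_pct set to 0.0 only the high-confidence observations are kept; with 1.0 every observation above the low threshold is kept.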

    Distributable event prediction and machine learning recognition system

    Publication No.: US11010691B1

    Publication date: 2021-05-18

    Application No.: US17093917

    Filing date: 2020-11-10

    Abstract: Data is classified using semi-supervised data. A decomposition is performed to define a first decomposition matrix that includes first eigenvectors of a weight matrix, a second decomposition matrix that includes second eigenvectors of a transpose of the weight matrix, and a diagonal matrix that includes eigenvalues of the first eigenvectors. Eigenvectors are selected from the first eigenvectors to define a reduced decomposition matrix. A linear transformation matrix is computed as a function of the first decomposition matrix, the reduced decomposition matrix, the diagonal matrix, and a penalty matrix. When a rank of the linear transformation matrix is less than a number of rows of the penalty matrix, a classification matrix is computed by updating a gradient of a cost function. When the rank of the linear transformation matrix is equal to the number of rows of the penalty matrix, the classification matrix is computed using a dual formulation.

    DISTRIBUTED EVENT PREDICTION AND MACHINE LEARNING OBJECT RECOGNITION SYSTEM

    Publication No.: US20180053071A1

    Publication date: 2018-02-22

    Application No.: US15686863

    Filing date: 2017-08-25

    Inventor: Xu Chen, Tao Wang

    Abstract: A computing device predicts occurrence of an event or classifies an object using distributed unlabeled data. Supervised data that includes a labeled subset of a plurality of observation vectors is identified. A total number of threads that will perform labeling of an unlabeled subset of the plurality of observation vectors is determined. The identified supervised data is uploaded to each thread of the total number of threads. Unlabeled observation vectors are randomly selected from the unlabeled subset of the plurality of observation vectors to allocate to each thread of the total number of threads. The randomly selected, unlabeled observation vectors are uploaded to each thread of the total number of threads based on the allocation. The value of the target variable for each observation vector of the unlabeled subset of the plurality of observation vectors is determined based on a converged classification matrix and output to a labeled dataset.
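
    The allocation step described in this abstract might look like the sketch below. The function name and the dictionary layout are hypothetical, and splitting the unlabeled vectors into non-overlapping shares is an assumption; the abstract only says that they are randomly selected for each thread while the supervised data goes to every thread.

```python
import random


def allocate_to_threads(labeled, unlabeled, n_threads, seed=None):
    """Build one work package per thread.

    Every package receives a full copy of the labeled data and a random,
    non-overlapping share of the unlabeled data (the non-overlap is an
    assumption of this sketch).
    """
    rng = random.Random(seed)
    shuffled = list(unlabeled)
    rng.shuffle(shuffled)
    return [
        {
            "labeled": list(labeled),
            # Taking every n_threads-th element of the shuffled list
            # gives each thread a random share of similar size.
            "unlabeled": shuffled[thread::n_threads],
        }
        for thread in range(n_threads)
    ]
```

    Each package could then be handed to a worker that computes labels for its own share.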

    Event prediction and object recognition system

    Publication No.: US09792562B1

    Publication date: 2017-10-17

    Application No.: US15335530

    Filing date: 2016-10-27

    Inventor: Xu Chen, Tao Wang

    CPC classification number: G06N99/005, G06N5/003, G06N7/005

    Abstract: A computing device predicts occurrence of an event or classifies an object using semi-supervised data. A label set defines permissible values for a target variable. A value of the permissible values is defined for a subset of observation vectors. A predefined number of times, a distance matrix is computed that defines a distance value between pairs of observation vectors using a distance function and a converged classification matrix; a number of observation vectors is selected that have minimum values for the distance value; a label is requested and a response is received for each of the selected observation vectors; the value of the target variable is updated for each of the selected observation vectors with the received response; and the value of the target variable is determined again by recomputing the converged classification matrix. The value of the target variable for each observation vector is output to a second dataset.

    Machine learning classification system

    Publication No.: US11379685B2

    Publication date: 2022-07-05

    Application No.: US17386706

    Filing date: 2021-07-28

    Inventor: Xu Chen

    Abstract: A computing device classifies unclassified observations. A first batch of unclassified observation vectors and a first batch of classified observation vectors are selected. A prior regularization error value and a decoder reconstruction error value are computed. A first batch of noise observation vectors is generated. An evidence lower bound (ELBO) value is computed. A gradient of an encoder neural network model is computed, and the ELBO value is updated. A decoder neural network model and an encoder neural network model are updated. The decoder neural network model is trained. The target variable value is determined for each observation vector of the unclassified observation vectors based on an output of the trained decoder neural network model. The target variable value is output.

    Distributable event prediction and machine learning recognition system

    Publication No.: US10929762B1

    Publication date: 2021-02-23

    Application No.: US16940501

    Filing date: 2020-07-28

    Abstract: Data is classified using corrected semi-supervised data. Cluster centers are defined for unclassified observations. A class is determined for each cluster. A distance value is computed between a classified observation and each cluster center. When the class of the classified observation is not the class determined for the cluster center having a minimum distance, a first distance value is selected as the minimum distance, a second distance value is selected as the distance value computed to the cluster center having the class of the classified observation, a ratio value is computed between the second distance value and the first distance value, and the class of the classified observation is changed to the class determined for the cluster center having the minimum distance value when the computed ratio value satisfies a label correction threshold. A classification matrix is defined using corrected observations to determine the class for the unclassified observations.
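
    The label-correction test in this abstract can be sketched as follows. The function name, the use of Euclidean distance, the choice of the nearest same-class center when several clusters share a class, and reading "satisfies a label correction threshold" as "ratio is at least the threshold" are all assumptions for illustration.

```python
import math


def correct_label(x, label, centers, center_classes, threshold):
    """Return the possibly corrected class for one classified observation."""
    distances = [math.dist(x, center) for center in centers]
    nearest = min(range(len(centers)), key=distances.__getitem__)
    if center_classes[nearest] == label:
        return label  # The nearest cluster agrees with the current label.

    # Distances to the centers that carry the observation's current class.
    same_class = [d for d, k in zip(distances, center_classes) if k == label]
    if not same_class:
        return label  # Assumption: no cluster to compare with, keep label.

    first = distances[nearest]   # minimum distance
    second = min(same_class)     # distance to the current class's center
    ratio = math.inf if first == 0 else second / first
    return center_classes[nearest] if ratio >= threshold else label
```

    For example, with centers at (0, 0) and (10, 0) carrying classes "a" and "b", an observation at (1, 0) labeled "b" has a ratio of 9 and would be relabeled "a" under a threshold of 2.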

    Distributed hyperparameter tuning system for active machine learning

    Publication No.: US10832174B1

    Publication date: 2020-11-10

    Application No.: US16816382

    Filing date: 2020-03-12

    Abstract: Data is classified using automatically selected hyperparameter values. (A) A first loss value is determined based on a converged classification matrix. (B) Each observation vector is assigned to a cluster using a clustering algorithm based on the converged classification matrix. (C) A predefined number of observation vectors is selected from each cluster. (D) Classified observation vectors and unclassified observation vectors are updated based on the selections in (C) and (A) is repeated. (E) An entropy loss value is determined, wherein (A) to (E) are repeated for a plurality of different values of a kernel parameter value and a batch size value. (F) A second loss value is determined based on the converged classification matrix, a label matrix defined from the converged classification matrix, and a weight value. (G) (A) to (F) are repeated with a plurality of different values of the weight value until convergence is satisfied.

    DISTRIBUTABLE EVENT PREDICTION AND MACHINE LEARNING RECOGNITION SYSTEM

    Publication No.: US20200151603A1

    Publication date: 2020-05-14

    Application No.: US16706912

    Filing date: 2019-12-09

    Inventor: Xu Chen

    Abstract: A computing device predicts occurrence of an event or classifies an object using distributed unlabeled data. A Laplacian matrix is computed using a kernel function. A predefined number of eigenvectors is selected from a decomposed Laplacian matrix to define a decomposition matrix. A gradient value is computed as a function of the defined decomposition matrix, a plurality of sparse coefficients, and a label matrix, a value of each coefficient of the plurality of sparse coefficients is updated based on the computed gradient value, and the computations are repeated until a convergence parameter value indicates the plurality of sparse coefficients have converged. A classification matrix is defined using the plurality of sparse coefficients to determine the target variable value for each observation vector of the plurality of unclassified observation vectors. The target variable value for each observation vector of the plurality of unclassified observation vectors is output.

    MACHINE LEARNING PREDICTIVE LABELING SYSTEM

    Publication No.: US20190050368A1

    Publication date: 2019-02-14

    Application No.: US16162794

    Filing date: 2018-10-17

    Abstract: A computing device automatically classifies an observation vector. A label set defines permissible values for a target variable. Supervised data includes a labeled subset that has one of the permissible values. A converged classification matrix is computed based on the supervised data and an unlabeled subset using a prior class distribution matrix that includes a row for each observation vector. Each column is associated with a single permissible value of the label set. A cell value in each column is a likelihood that each associated permissible value of the label set occurs based on prior class distribution information. The value of the target variable is selected using the converged classification matrix. A weighted classification label distribution matrix is computed from the converged classification matrix. The value of the target variable for each observation vector of the plurality of observation vectors is output to a labeled dataset.
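
    The matrix layout described in this abstract (one row per observation vector, one column per permissible label value) can be sketched as below. Both function names are hypothetical. Using the same prior in every row and choosing the label with the largest cell value in each row are assumptions; the abstract does not say how the converged classification matrix is computed, so that step is not shown.

```python
def prior_class_matrix(n_observations, label_set, prior):
    """Build a prior class distribution matrix.

    Rows correspond to observation vectors and columns to label values.
    Assumption: every row carries the same prior likelihoods.
    """
    row = [prior[label] for label in label_set]
    return [list(row) for _ in range(n_observations)]


def select_labels(classification_matrix, label_set):
    """Pick, for each row, the label whose column holds the largest value.

    Assumption: selection from the classification matrix is a row-wise
    argmax; ties go to the first label in label_set.
    """
    return [
        label_set[max(range(len(row)), key=row.__getitem__)]
        for row in classification_matrix
    ]
```

    A two-label prior of 0.25 and 0.75 over three observations, for instance, gives a 3 x 2 matrix whose rows are all [0.25, 0.75].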

    DISTRIBUTABLE EVENT PREDICTION AND MACHINE LEARNING RECOGNITION SYSTEM

    Publication No.: US20210287116A1

    Publication date: 2021-09-16

    Application No.: US17178798

    Filing date: 2021-02-18

    Abstract: Data is classified using semi-supervised data. Sparse coefficients are computed using a decomposition of a Laplacian matrix. (B) Updated parameter values are computed for a dimensionality reduction method using the sparse coefficients, the Laplacian matrix, and a plurality of observation vectors. The updated parameter values include a robust estimator of a decomposition matrix determined from the decomposition of the Laplacian matrix. (B) is repeated until a convergence parameter value indicates the updated parameter values for the dimensionality reduction method have converged. A classification matrix is defined using the sparse coefficients and the robust estimator of the decomposition of the Laplacian matrix. The target variable value is determined for each observation vector based on the classification matrix. The target variable value is output for each observation vector of the plurality of unclassified observation vectors and is defined to represent a label for a respective unclassified observation vector.
