BIAS MITIGATING MACHINE LEARNING TRAINING SYSTEM

    Publication number: US20230205839A1

    Publication date: 2023-06-29

    Application number: US18051906

    Application date: 2022-11-02

    CPC classification number: G06F17/16 G06F17/18

    Abstract: A computing device trains a fair machine learning model. A predicted target variable is defined using a trained prediction model. The prediction model is trained with weighted observation vectors. The predicted target variable is updated using the prediction model trained with weighted observation vectors. A true conditional moments matrix and a false conditional moments matrix are computed. The training and updating with weighted observation vectors are repeated until a number of iterations is performed. When a computed conditional moments matrix indicates to adjust a bound value, the bound value is updated based on an upper bound value or a lower bound value, and the repeated training and updating with weighted observation vectors is repeated with the bound value replaced with the updated bound value until the conditional moments matrix indicates no further adjustment of the bound value is needed. A fair prediction model is trained with the updated bound value.

    Semi-supervised classification system

    Publication number: US11200514B1

    Publication date: 2021-12-14

    Application number: US17342825

    Application date: 2021-06-09

    Inventor: Xu Chen Xinmin Wu

    Abstract: Unclassified observations are classified. Similarity values are computed for each unclassified observation and for each target variable value. A confidence value is computed for each unclassified observation using the similarity values. A high-confidence threshold value and a low-confidence threshold value are computed from the confidence values. For each observation, when the confidence value is greater than the high-confidence threshold value, the observation is added to a training dataset and, when the confidence value is greater than the low-confidence threshold value and less than the high-confidence threshold value, the observation is added to the training dataset based on a comparison between a random value drawn from a uniform distribution and an inclusion percentage value. A classification model is trained with the training dataset and classified observations. The trained classification model is executed with the unclassified observations to determine a label assignment.
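
    The two-threshold inclusion rule in this abstract can be illustrated with a minimal Python sketch. The function name, the example confidence values, and the direction of the random comparison (include when the uniform draw is below the inclusion percentage) are assumptions; the abstract says only that a comparison is made. The thresholds are passed in directly here, whereas the abstract computes them from the confidence values.

```python
import random

def select_for_training(confidences, high, low, inclusion_pct, rng):
    """Return indices of unclassified observations to add to the training set.

    Hypothetical helper: thresholds and inclusion_pct are supplied by the caller.
    """
    selected = []
    for i, c in enumerate(confidences):
        if c > high:
            # High confidence: always added to the training dataset.
            selected.append(i)
        elif low < c < high:
            # Medium confidence: added at random. Assumed rule: include when a
            # uniform draw on [0, 1) is below the inclusion percentage.
            if rng.random() < inclusion_pct:
                selected.append(i)
        # Confidence at or below the low threshold: never added.
    return selected

rng = random.Random(0)
confidences = [0.95, 0.70, 0.40, 0.85, 0.10]
chosen = select_for_training(confidences, high=0.8, low=0.3,
                             inclusion_pct=0.5, rng=rng)
```

    Observations 0 and 3 are always chosen, observation 4 never is, and observations 1 and 2 are each chosen with probability 0.5 under the assumed rule.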

    Neural network training system
    Granted invention patent

    Publication number: US11195084B1

    Publication date: 2021-12-07

    Application number: US17198737

    Application date: 2021-03-11

    Abstract: A computing device trains a neural network machine learning model. A forward propagation of a first neural network is executed. A backward propagation of the first neural network is executed from a last layer to a last convolution layer of a plurality of convolutional layers to compute a gradient vector for first weight values of the last convolution layer using observation vectors. A discriminative localization map is computed for each observation vector with the gradient vector using a discriminative localization map function. A forward and a backward propagation of a second neural network is executed to compute a second weight value for each neuron of the second neural network using the discriminative localization map computed for each observation vector. A predefined number of iterations of the forward and the backward propagation of the second neural network is repeated.

    Analytic system for machine learning prediction model selection

    Publication number: US10417528B2

    Publication date: 2019-09-17

    Application number: US16059241

    Application date: 2018-08-09

    Abstract: An assessment dataset is selected from an input dataset using a first stratified sampling process based on a value of an event assessment variable. A remainder of the input dataset is allocated to a training/validation dataset that is partitioned into an oversampled training/validation dataset using an oversampling process based on a predefined value of the event assessment variable. A validation sample is selected from the oversampled training/validation dataset using a second stratified sampling process based on the value of the event assessment variable. A training sample is selected from the oversampled training/validation dataset using the second stratified sampling process based on the value of the event assessment variable. The validation sample and the training sample are mutually exclusive. A predictive type model is trained using the selected training sample. A plurality of predictive type models are trained, validated, and scored using the samples to select a best predictive model.
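
    The partitioning sequence in this abstract can be sketched in Python as follows. All names, the sampling fractions, and the toy data are assumptions. In particular, the oversampling step is shown as one common reading (keep every event row and draw an equal number of non-event rows); the abstract does not specify the exact procedure.

```python
import random
from collections import defaultdict

def stratified_split(rows, key, fraction, rng):
    """Split rows into (sample, remainder), drawing `fraction` of each stratum."""
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    sample, remainder = [], []
    for members in strata.values():
        members = members[:]              # copy so the input is not reordered
        rng.shuffle(members)
        k = round(len(members) * fraction)
        sample.extend(members[:k])
        remainder.extend(members[k:])
    return sample, remainder

def oversample_events(rows, key, event_value, rng):
    """Assumed oversampling: keep all events, sample as many non-events."""
    events = [r for r in rows if key(r) == event_value]
    non_events = [r for r in rows if key(r) != event_value]
    kept = rng.sample(non_events, min(len(events), len(non_events)))
    return events + kept

rng = random.Random(42)
# Toy input: (row id, event flag), with 20 events among 100 rows.
rows = [(i, 1 if i < 20 else 0) for i in range(100)]
event = lambda row: row[1]

assessment, rest = stratified_split(rows, event, 0.3, rng)       # first stratified sample
oversampled = oversample_events(rest, event, 1, rng)             # rebalance the remainder
validation, training = stratified_split(oversampled, event, 0.5, rng)  # second stratified sample
```

    Because the validation and training samples are the two halves of one split, they are mutually exclusive by construction.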

    ANALYTIC SYSTEM FOR STREAMING QUANTILE COMPUTATION

    Publication number: US20190258697A1

    Publication date: 2019-08-22

    Application number: US16398690

    Application date: 2019-04-30

    Abstract: A computing device computes a quantile value for a variable value extracted from an event block object by computing a bin number for the variable value. If the computed bin number is between a before bin number and an after bin number computed for a quantile, the quantile is identified. Frequency data is updated to include the extracted variable value as a key value. A frequency value associated with the key value indicates a number of occurrences of the variable value in previously processed data. A cumulative rank value of the identified quantile is updated. A quantile adjustment value is computed based on a comparison between the variable value and a current quantile value of the identified quantile. An updated quantile value associated with the identified quantile is computed using the updated frequency data, the computed quantile adjustment value, and the updated cumulative rank value of the identified quantile.

    Cutoff value optimization for bias mitigating machine learning training system with multi-class target

    Publication number: US12093826B2

    Publication date: 2024-09-17

    Application number: US18444906

    Application date: 2024-02-19

    CPC classification number: G06N3/08 G06N5/022

    Abstract: A computing device trains a fair prediction model while defining an optimal event cutoff value. (A) A prediction model is trained with observation vectors. (B) The prediction model is executed to define a predicted target variable value and a probability associated with an accuracy of the predicted target variable value. (C) A conditional moments matrix is computed based on fairness constraints, the predicted target variable value, and the sensitive attribute variable value of each observation vector. The predicted target variable value has a predefined target event value only when the probability is greater than a predefined event cutoff value. (D) (A) through (C) are repeated. (E) An updated value is computed for the predefined event cutoff value. (F) (A) through (E) are repeated. An optimal event cutoff value is defined from the predefined event cutoff values used when repeating (A) through (E). The optimal value and prediction model are output.
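
    The role of the event cutoff value can be illustrated with a small Python sketch. It shows only the thresholding rule stated in the abstract (the event is predicted only when the probability is greater than the cutoff) and, as an assumed stand-in for the fairness constraints, the gap in predicted event rates between sensitive-attribute groups. It does not reproduce the conditional moments matrix or the cutoff update step, and all names and data are hypothetical.

```python
def predict_events(probabilities, cutoff):
    """Predict the target event only when the probability exceeds the cutoff."""
    return [p > cutoff for p in probabilities]

def parity_gap(predictions, groups):
    """Largest difference in predicted event rate between any two groups.

    Assumed fairness measure (demographic parity), used for illustration only.
    """
    rates = {}
    for g in set(groups):
        members = [pred for pred, grp in zip(predictions, groups) if grp == g]
        rates[g] = sum(members) / len(members)
    return max(rates.values()) - min(rates.values())

# Toy probabilities and sensitive attribute values for six observations.
probs = [0.9, 0.6, 0.4, 0.8, 0.55, 0.2]
groups = ["a", "a", "a", "b", "b", "b"]

# Evaluate the same scores under several candidate cutoff values.
gaps = {c: parity_gap(predict_events(probs, c), groups) for c in (0.3, 0.5, 0.7)}
```

    On this toy data the gap is one third at a cutoff of 0.3 and zero at 0.5 and 0.7, which shows how the cutoff alone can change a fairness measure without retraining the model.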

    Advanced training of machine-learning models usable in control systems and other systems

    Publication number: US10832087B1

    Publication date: 2020-11-10

    Application number: US16921417

    Application date: 2020-07-06

    Abstract: Machine-learning models (MLM) can be configured more rapidly and accurately according to some examples. For example, a system can receive a first training dataset that includes (i) independent-variable values corresponding to independent variables and (ii) dependent-variable values corresponding to a dependent variable that is influenced by the independent variables. The independent-variable values can include nonlinear-variable values corresponding to at least one nonlinear independent variable. The system can then determine cluster assignments for the nonlinear-variable values, generate a second training dataset based on the cluster assignments, and train a model based on the second training dataset. The trained machine-learning model may then be used in various applications, such as control-system applications.

    ANALYTIC SYSTEM FOR MACHINE LEARNING PREDICTION MODEL SELECTION

    Publication number: US20190258904A1

    Publication date: 2019-08-22

    Application number: US16059241

    Application date: 2018-08-09

    Abstract: An assessment dataset is selected from an input dataset using a first stratified sampling process based on a value of an event assessment variable. A remainder of the input dataset is allocated to a training/validation dataset that is partitioned into an oversampled training/validation dataset using an oversampling process based on a predefined value of the event assessment variable. A validation sample is selected from the oversampled training/validation dataset using a second stratified sampling process based on the value of the event assessment variable. A training sample is selected from the oversampled training/validation dataset using the second stratified sampling process based on the value of the event assessment variable. The validation sample and the training sample are mutually exclusive. A predictive type model is trained using the selected training sample. A plurality of predictive type models are trained, validated, and scored using the samples to select a best predictive model.

    Analytic system for fast quantile computation with improved memory consumption strategy

    Publication number: US10311128B2

    Publication date: 2019-06-04

    Application number: US16140931

    Application date: 2018-09-25

    Abstract: A computing device computes a quantile value. A maximum value and a minimum value are computed for unsorted variable values to compute an upper bin value and a lower bin value for each bin of a plurality of bins. A frequency counter is computed for each bin by reading the unsorted variable values a second time. A bin number and a cumulative rank value are computed for a quantile. When an estimated memory usage value exceeds a predefined memory size constraint value, a subset of the plurality of bins are split into a plurality of bins, the frequency counter is recomputed for each bin, and the bin number and the cumulative rank value are recomputed. Frequency data is computed using the frequency counters. The quantile value is computed using the frequency data and the cumulative rank value for the quantile and output.

    Analytic system for fast quantile computation

    Publication number: US10127192B1

    Publication date: 2018-11-13

    Application number: US15961373

    Application date: 2018-04-24

    Abstract: A computing device computes a quantile value. A maximum value and a minimum value are computed for unsorted variable values. An upper bin value and a lower bin value are computed for each bin of a plurality of bins using the maximum and minimum values. A frequency counter is computed for each bin by reading the unsorted variable values a second time. Each frequency counter is a count of the variable values within a respective bin. A bin number and a cumulative rank value are computed for a quantile. The bin number identifies a specific bin within which a quantile value associated with the quantile is located. The cumulative rank value identifies a cumulative rank for the quantile value associated with the quantile. Frequency data is computed using the frequency counters. The quantile value is computed using the frequency data and the cumulative rank value for the quantile and output.
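
    The binning approach in this abstract can be sketched in Python without sorting the full input. The sketch assumes equal-width bins, a lower empirical quantile definition (the value at 1-based rank ceil(q * n)), and a third read of the data to build per-value frequencies for the single bin that holds the quantile. None of these choices is stated in the abstract, and the function name is hypothetical.

```python
import math
from collections import Counter

def binned_quantile(values, q, num_bins=10):
    """Quantile of unsorted values via equal-width bins (illustrative sketch)."""
    n = len(values)
    lo, hi = min(values), max(values)          # first read: minimum and maximum
    if lo == hi:
        return lo
    width = (hi - lo) / num_bins

    def bin_of(v):
        # Clamp so the maximum value falls in the last bin.
        return min(int((v - lo) / width), num_bins - 1)

    counts = [0] * num_bins
    for v in values:                           # second read: frequency counters
        counts[bin_of(v)] += 1

    target = max(1, math.ceil(q * n))          # assumed 1-based target rank
    cumulative = 0                             # rank covered by earlier bins
    for b, c in enumerate(counts):
        if cumulative + c >= target:
            break                              # bin b contains the quantile
        cumulative += c

    # Frequency data for the identified bin only, keyed by variable value.
    freq = Counter(v for v in values if bin_of(v) == b)
    for v in sorted(freq):
        cumulative += freq[v]
        if cumulative >= target:
            return v

data = [7, 1, 5, 3, 9, 3, 8, 2, 6, 4]
median = binned_quantile(data, 0.5)
```

    Only the distinct values inside one bin are sorted, so the cost of sorting depends on the size of that bin and not on the size of the whole dataset.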
