-
Publication No.: US11151463B2
Publication Date: 2021-10-19
Application No.: US17178798
Filing Date: 2021-02-18
Applicant: SAS Institute Inc.
Inventor: Xu Chen, Jorge Manuel Gomes da Silva, Brett Alan Wujek
Abstract: Data is classified using semi-supervised data. Sparse coefficients are computed using a decomposition of a Laplacian matrix. (B) Updated parameter values are computed for a dimensionality reduction method using the sparse coefficients, the Laplacian matrix, and a plurality of observation vectors. The updated parameter values include a robust estimator of a decomposition matrix determined from the decomposition of the Laplacian matrix. (B) is repeated until a convergence parameter value indicates the updated parameter values for the dimensionality reduction method have converged. A classification matrix is defined using the sparse coefficients and the robust estimator of the decomposition of the Laplacian matrix. The target variable value is determined for each observation vector based on the classification matrix. The target variable value is output for each observation vector of the plurality of unclassified observation vectors and is defined to represent a label for a respective unclassified observation vector.
-
Publication No.: US20210287116A1
Publication Date: 2021-09-16
Application No.: US17178798
Filing Date: 2021-02-18
Applicant: SAS Institute Inc
Inventor: Xu Chen, Jorge Manuel Gomes da Silva, Brett Alan Wujek
Abstract: Data is classified using semi-supervised data. Sparse coefficients are computed using a decomposition of a Laplacian matrix. (B) Updated parameter values are computed for a dimensionality reduction method using the sparse coefficients, the Laplacian matrix, and a plurality of observation vectors. The updated parameter values include a robust estimator of a decomposition matrix determined from the decomposition of the Laplacian matrix. (B) is repeated until a convergence parameter value indicates the updated parameter values for the dimensionality reduction method have converged. A classification matrix is defined using the sparse coefficients and the robust estimator of the decomposition of the Laplacian matrix. The target variable value is determined for each observation vector based on the classification matrix. The target variable value is output for each observation vector of the plurality of unclassified observation vectors and is defined to represent a label for a respective unclassified observation vector.
-
Publication No.: US11093833B1
Publication Date: 2021-08-17
Application No.: US17081118
Filing Date: 2020-10-27
Applicant: SAS Institute Inc.
Inventor: Steven Joseph Gardner, Joshua David Griffin, Yan Xu, Patrick Nathan Koch, Brett Alan Wujek, Oleg Borisovich Golovidov
Abstract: Tuned hyperparameter values are determined for training a machine learning model. When a selected hyperparameter configuration does not satisfy a linear constraint, it is determined whether a projection of the selected hyperparameter configuration is included in a first cache that stores previously computed projections. When the projection is included in the first cache, the projection is extracted from the first cache using the selected hyperparameter configuration, and the selected hyperparameter configuration is replaced with the extracted projection in the plurality of hyperparameter configurations. When the projection is not included in the first cache, a projection computation for the selected hyperparameter configuration is assigned to a session. A computed projection is received from the session for the selected hyperparameter configuration. The computed projection and the selected hyperparameter configuration are stored to the first cache, and the selected hyperparameter configuration is replaced with the computed projection.
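The cache lookup described in this abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the abstract: there is a single constraint of the form a . x <= b, the projection is the Euclidean projection onto that half-space, and the projection is computed inline rather than being assigned to a session. The names `satisfies`, `project`, and `repair` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Config = Tuple[float, ...]


def satisfies(config: Sequence[float], a: Sequence[float], b: float) -> bool:
    """Return True when the linear constraint a . x <= b holds (small tolerance)."""
    return sum(ai * xi for ai, xi in zip(a, config)) <= b + 1e-12


def project(config: Sequence[float], a: Sequence[float], b: float) -> Config:
    """Euclidean projection onto the half-space a . x <= b (assumed projection type)."""
    violation = sum(ai * xi for ai, xi in zip(a, config)) - b
    norm_sq = sum(ai * ai for ai in a)
    return tuple(xi - violation * ai / norm_sq for ai, xi in zip(a, config))


def repair(configs: Sequence[Sequence[float]], a: Sequence[float], b: float,
           cache: Dict[Config, Config]) -> List[Config]:
    """Replace each infeasible configuration with its cached or newly computed projection."""
    repaired: List[Config] = []
    for config in configs:
        key = tuple(config)
        if satisfies(key, a, b):
            repaired.append(key)          # feasible: keep as-is
        elif key in cache:
            repaired.append(cache[key])   # cache hit: reuse the stored projection
        else:
            # Cache miss: the abstract assigns this computation to a session;
            # this sketch simply computes it inline and stores the result.
            cache[key] = project(key, a, b)
            repaired.append(cache[key])
    return repaired
```

Keying the cache on the original (infeasible) configuration means a configuration proposed more than once is projected only once.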
-
Publication No.: US10956825B1
Publication Date: 2021-03-23
Application No.: US16904818
Filing Date: 2020-06-18
Applicant: SAS Institute Inc.
Inventor: Xu Chen, Jorge Manuel Gomes da Silva, Brett Alan Wujek
Abstract: Data is classified using semi-supervised data. A weight matrix is computed using a kernel function applied to observation vectors. A decomposition of the computed weight matrix is performed. A predefined number of eigenvectors is selected from the decomposed weight matrix to define a decomposition matrix. (A) A gradient value is computed as a function of the defined decomposition matrix, sparse coefficients, and a label vector. (B) A value of each coefficient of the sparse coefficients is updated based on the gradient value. (A) and (B) are repeated until a convergence parameter value indicates the sparse coefficients have converged. A classification matrix is defined using the converged sparse coefficients. The target variable value is determined and output for each observation vector based on the defined classification matrix to update the label vector and defined to represent the label for a respective unclassified observation vector.
-
Publication No.: US10929762B1
Publication Date: 2021-02-23
Application No.: US16940501
Filing Date: 2020-07-28
Applicant: SAS Institute Inc.
Inventor: Xu Chen, Brett Alan Wujek
Abstract: Data is classified using corrected semi-supervised data. Cluster centers are defined for unclassified observations. A class is determined for each cluster. A distance value is computed between a classified observation and each cluster center. When the class of the classified observation is not the class determined for the cluster center having a minimum distance, a first distance value is selected as the minimum distance, a second distance value is selected as the distance value computed to the cluster center having the class of the classified observation, a ratio value is computed between the second distance value and the first distance value, and the class of the classified observation is changed to the class determined for the cluster center having the minimum distance value when the computed ratio value satisfies a label correction threshold. A classification matrix is defined using corrected observations to determine the class for the unclassified observations.
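The distance-ratio test in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: distances are Euclidean, the ratio is taken as second distance divided by first distance, the threshold is "satisfied" when the ratio is greater than or equal to it, and cluster centers and their classes are supplied as inputs. The function name `correct_labels` is hypothetical.

```python
import math
from typing import Hashable, List, Sequence


def correct_labels(observations: Sequence[Sequence[float]],
                   labels: Sequence[Hashable],
                   centers: Sequence[Sequence[float]],
                   center_classes: Sequence[Hashable],
                   threshold: float) -> List[Hashable]:
    """Relabel classified observations that sit much closer to another class's center."""
    corrected = list(labels)
    for i, (x, label) in enumerate(zip(observations, labels)):
        distances = [math.dist(x, c) for c in centers]
        nearest = min(range(len(centers)), key=distances.__getitem__)
        if center_classes[nearest] == label:
            continue  # label already agrees with the nearest cluster
        first = distances[nearest]  # minimum distance
        # Distance to the closest center that carries the observation's current class.
        same_class = [d for d, cls in zip(distances, center_classes) if cls == label]
        if not same_class:
            continue  # assumption: leave the label alone if no cluster has this class
        second = min(same_class)
        ratio = second / first if first > 0 else math.inf
        if ratio >= threshold:  # assumed meaning of "satisfies the threshold"
            corrected[i] = center_classes[nearest]
    return corrected
```

For example, with centers at (0, 0) and (10, 0) of classes "a" and "b" and a threshold of 2.0, an observation at (1, 0) labeled "b" has ratio 9 and is relabeled "a", while one at (4.5, 0) has a ratio of roughly 1.2 and keeps its label.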
-
Publication No.: US10832174B1
Publication Date: 2020-11-10
Application No.: US16816382
Filing Date: 2020-03-12
Applicant: SAS Institute Inc.
Inventor: Xu Chen, Brett Alan Wujek
Abstract: Data is classified using automatically selected hyperparameter values. (A) A first loss value is determined based on a converged classification matrix. (B) Each observation vector is assigned to a cluster using a clustering algorithm based on the converged classification matrix. (C) A predefined number of observation vectors is selected from each cluster. (D) Classified observation vectors and unclassified observation vectors are updated based on the selections in (C) and (A) is repeated. (E) An entropy loss value is determined, wherein (A) to (E) are repeated for a plurality of different values of a kernel parameter value and a batch size value. (F) A second loss value is determined based on the converged classification matrix, a label matrix defined from the converged classification matrix, and a weight value. (L) (A) to (F) are repeated with a plurality of different values of the weight value until convergence is satisfied.
-
Publication No.: US20180240041A1
Publication Date: 2018-08-23
Application No.: US15822462
Filing Date: 2017-11-27
Applicant: SAS Institute Inc.
Inventor: Patrick Nathan Koch, Brett Alan Wujek, Oleg Borisovich Golovidov, Steven Joseph Gardner, Joshua David Griffin, Scott Russell Pope, Yan Xu
Abstract: A computing device automatically selects hyperparameter values based on objective criteria to train a predictive model. Each session of a plurality of sessions executes training and scoring of a model type using an input dataset in parallel with other sessions of the plurality of sessions. Unique hyperparameter configurations are determined using a search method and assigned to each session. For each session of the plurality of sessions, training of a model of the model type is requested using a training dataset and the assigned hyperparameter configuration, scoring of the trained model using a validation dataset and the assigned hyperparameter configuration is requested to compute an objective function value, and the received objective function value and the assigned hyperparameter configuration are stored. A best hyperparameter configuration is identified based on an extreme value of the stored objective function values.
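The session-based tuning loop in this abstract can be sketched in Python. Everything concrete here is an assumption for illustration: sessions are modeled as threads in a pool, the search method is random sampling, the "extreme value" is taken to be the minimum, and `train_and_score` is a hypothetical toy objective standing in for real training and validation scoring.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Config = Dict[str, float]


def train_and_score(config: Config) -> float:
    """Hypothetical stand-in for training a model and scoring it on validation data."""
    # Toy validation loss, smallest at learning_rate=0.1 and depth=6.
    return (config["learning_rate"] - 0.1) ** 2 + 0.01 * (config["depth"] - 6) ** 2


def sample_unique_configs(n: int, seed: int = 0) -> List[Config]:
    """Draw n distinct hyperparameter configurations by random search."""
    rng = random.Random(seed)
    seen = set()
    configs: List[Config] = []
    while len(configs) < n:
        candidate = (round(rng.uniform(0.01, 0.5), 3), rng.randint(2, 10))
        if candidate not in seen:  # enforce uniqueness
            seen.add(candidate)
            configs.append({"learning_rate": candidate[0], "depth": candidate[1]})
    return configs


def tune(n_configs: int = 16, n_sessions: int = 4,
         seed: int = 0) -> Tuple[Tuple[Config, float], List[Tuple[Config, float]]]:
    """Evaluate configurations in parallel sessions and return the best plus the history."""
    configs = sample_unique_configs(n_configs, seed)
    with ThreadPoolExecutor(max_workers=n_sessions) as sessions:
        scores = list(sessions.map(train_and_score, configs))
    history = list(zip(configs, scores))          # stored (configuration, objective) pairs
    best = min(history, key=lambda item: item[1])  # assumed extreme value: minimum
    return best, history
```

Calling `tune()` returns the best (configuration, objective value) pair together with the full history, which mirrors the stored values that the abstract says are used to identify the best configuration.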
-
Publication No.: US11151480B1
Publication Date: 2021-10-19
Application No.: US17099846
Filing Date: 2020-11-17
Applicant: SAS Institute Inc.
Inventor: Oleg Borisovich Golovidov, Brett Alan Wujek, Patrick Nathan Koch, Rajendra Prasad Singh
IPC: G06N20/10, G06F3/0483, G06F16/958, G06F16/904
Abstract: A visualization is presented while tuning a machine learning model. A model tuning process writes tuning data to a history table. The model tuning process is repeatedly training and scoring a model type with different sets of values of hyperparameters defined based on the model type. An objective function value is computed for each set of values of the hyperparameters. Data stored in the history table is accessed and used to identify the hyperparameters. (A) A page template is selected from page templates that describe graphical objects presented in the display. (B) The page template is updated with the accessed data. (C) The display is updated using the updated page template. (D) At the end of a refresh time period, new data stored in the history table by the model tuning process is accessed. (E) (B) through (D) are repeated with the accessed data replaced with the accessed new data.
-
Publication No.: US20210264287A1
Publication Date: 2021-08-26
Application No.: US17081118
Filing Date: 2020-10-27
Applicant: SAS Institute Inc.
Inventor: Steven Joseph Gardner, Joshua David Griffin, Yan Xu, Patrick Nathan Koch, Brett Alan Wujek, Oleg Borisovich Golovidov
Abstract: Tuned hyperparameter values are determined for training a machine learning model. When a selected hyperparameter configuration does not satisfy a linear constraint, it is determined whether a projection of the selected hyperparameter configuration is included in a first cache that stores previously computed projections. When the projection is included in the first cache, the projection is extracted from the first cache using the selected hyperparameter configuration, and the selected hyperparameter configuration is replaced with the extracted projection in the plurality of hyperparameter configurations. When the projection is not included in the first cache, a projection computation for the selected hyperparameter configuration is assigned to a session. A computed projection is received from the session for the selected hyperparameter configuration. The computed projection and the selected hyperparameter configuration are stored to the first cache, and the selected hyperparameter configuration is replaced with the computed projection.
-
Publication No.: US11010691B1
Publication Date: 2021-05-18
Application No.: US17093917
Filing Date: 2020-11-10
Applicant: SAS Institute Inc.
Inventor: Xu Chen, Jorge Manuel Gomes da Silva, Brett Alan Wujek
Abstract: Data is classified using semi-supervised data. A decomposition is performed to define a first decomposition matrix that includes first eigenvectors of a weight matrix, a second decomposition matrix that includes second eigenvectors of a transpose of the weight matrix, and a diagonal matrix that includes eigenvalues of the first eigenvectors. Eigenvectors are selected from the first eigenvectors to define a reduced decomposition matrix. A linear transformation matrix is computed as a function of the first decomposition matrix, the reduced decomposition matrix, the diagonal matrix, and a penalty matrix. When a rank of the linear transformation matrix is less than a number of rows of the penalty matrix, a classification matrix is computed by updating a gradient of a cost function. When the rank of the linear transformation matrix is equal to the number of rows of the penalty matrix, the classification matrix is computed using a dual formulation.