Abstract:
A computing device compresses a gradient boosting tree predictive model. A gradient boosting tree predictive model is trained using a plurality of observation vectors. Each observation vector includes an explanatory variable value of an explanatory variable and a response variable value for a response variable. The gradient boosting tree predictive model is trained to predict the response variable value of each observation vector based on a respective explanatory variable value of each observation vector. The trained gradient boosting tree predictive model is compressed using a compression model with a predefined penalty constant value and a predefined array of coefficients to reduce the number of trees of the trained gradient boosting tree predictive model. The compression model minimizes a sparsity norm loss function. The compressed, trained gradient boosting tree predictive model is output for predicting a new response variable value from a new observation vector.
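A minimal sketch of the compression step, assuming the sparsity norm is an L1 penalty on per-tree coefficients and using a generic proximal-gradient (soft-thresholding) loop; the function and parameter names are illustrative, not the patented implementation:

    import numpy as np

    def compress_ensemble(tree_preds, y, penalty=0.1, n_iter=500):
        """Re-weight trees with an L1 (sparsity) penalty so trees whose
        coefficient shrinks to zero can be dropped from the ensemble.
        tree_preds: (n_obs, n_trees) matrix; column t = tree t's predictions.
        """
        n_obs, n_trees = tree_preds.shape
        coef = np.ones(n_trees)            # the predefined array of coefficients
        # Safe step size: inverse Lipschitz constant of the squared-loss gradient.
        lr = n_obs / np.linalg.norm(tree_preds, 2) ** 2
        for _ in range(n_iter):
            grad = tree_preds.T @ (tree_preds @ coef - y) / n_obs
            step = coef - lr * grad
            # Soft-thresholding: the proximal operator of the L1 norm.
            coef = np.sign(step) * np.maximum(np.abs(step) - lr * penalty, 0.0)
        kept = np.flatnonzero(coef)        # indices of surviving trees
        return coef, kept

Raising the penalty constant zeroes out more coefficients and hence removes more trees from the ensemble.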
Abstract:
A computing device trains models for streaming classification. A baseline penalty value is computed that is inversely proportional to a square of a maximum explanatory variable value. A set of penalty values is computed based on the baseline penalty value. For each penalty value of the set of penalty values, a classification type model is trained using the respective penalty value and a plurality of observation vectors to compute parameters that define a trained model; the classification type model is validated using the respective penalty value and the observation vectors to compute a validation criterion value that quantifies a validation error; and the validation criterion value, the respective penalty value, and the parameters that define the trained model are stored to a computer-readable medium. The classification type model is trained to predict a response variable value of each observation vector based on a respective explanatory variable value of each observation vector.
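A hedged sketch of the penalty sweep, assuming an L2-regularized linear classifier stands in for the classification type model and that the penalty grid is log-spaced around the baseline (both assumptions; scikit-learn's SGDClassifier is used only for illustration):

    import numpy as np
    from sklearn.linear_model import SGDClassifier
    from sklearn.model_selection import train_test_split

    def sweep_penalties(X, y, n_penalties=10):
        # Baseline penalty: inversely proportional to the square of the
        # maximum explanatory variable value (proportionality constant
        # of 1.0 is an assumption).
        baseline = 1.0 / np.max(np.abs(X)) ** 2
        grid = baseline * np.logspace(-2, 2, n_penalties)  # assumed spacing
        X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.25)
        stored = []
        for lam in grid:
            model = SGDClassifier(loss="log_loss", alpha=lam).fit(X_tr, y_tr)
            val_error = 1.0 - model.score(X_va, y_va)  # validation criterion
            stored.append((val_error, lam, model.coef_.copy()))
        # Return the penalty and parameters with the smallest validation error.
        return min(stored, key=lambda r: r[0])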
Abstract:
Systems and methods for conflict resolution and stabilizing cut generation in a mixed integer linear program (MILP) solver are disclosed. One disclosed method includes receiving an MILP having a root node and one or more global bounds; pre-processing the MILP, the MILP being associated with nodes; establishing a first threshold for a learning phase branch-and-cut process; performing, by one or more processors, the learning phase branch-and-cut process for the nodes associated with the MILP, wherein performing the learning phase branch-and-cut process includes: evaluating the nodes associated with the MILP, collecting conflict information about the MILP, and determining whether the first threshold has been reached; responsive to reaching the first threshold, removing all of the nodes and restoring the root node of the MILP; and solving, with the one or more processors, the MILP using the restored root node and the collected conflict information.
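A structural sketch of the two-phase solve; the node and MILP interfaces here (root(), evaluate(), add_constraints(), branch_and_cut()) are hypothetical placeholders, not a real solver API:

    def solve_with_learning_phase(milp, node_threshold):
        conflicts = []                      # conflict information collected
        open_nodes = [milp.root()]
        evaluated = 0
        # Learning phase: evaluate nodes and harvest conflicts until the
        # first threshold is reached.
        while open_nodes and evaluated < node_threshold:
            node = open_nodes.pop()
            result = node.evaluate()        # LP relaxation, cut generation
            conflicts.extend(result.conflict_constraints)
            open_nodes.extend(result.children)
            evaluated += 1
        # Threshold reached: remove all nodes, restore the root node.
        open_nodes.clear()
        root = milp.root()
        root.add_constraints(conflicts)     # reuse the learned conflicts
        return branch_and_cut(root)         # full solve from the restored root

The design point is that the first tree is treated as disposable: only the conflict constraints it produced survive into the second, restarted solve.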
Abstract:
A system, method, and computer-program product includes receiving, by a controller node, a request to execute a client process associated with a first programming language and a plurality of threads; launching, by the controller node, a plurality of multi-language worker processes based on a number of threads associated with the client process; and instructing, by the controller node, the plurality of multi-language worker processes to execute the plurality of threads associated with the client process.
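A minimal sketch of the dispatch pattern using Python's multiprocessing module; the one-worker-per-thread mapping and the run_unit callable are assumptions standing in for the multi-language worker processes:

    import multiprocessing as mp

    def run_unit(task):
        # Stand-in for executing one client thread's work inside a
        # worker process hosting the appropriate language runtime.
        return task()

    def controller(client_tasks):
        # Launch workers based on the number of threads in the client
        # process, then instruct them to execute those threads.
        # Note: tasks must be picklable (e.g. module-level functions).
        with mp.Pool(processes=len(client_tasks)) as pool:
            return pool.map(run_unit, client_tasks)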
Abstract:
A system, method, and computer-program product includes selecting, by a controller node, a plurality of hyperparameter search points from a hyperparameter search space; instructing, by the controller node, one or more worker nodes to concurrently train a plurality of machine learning models for a target number of epochs using the plurality of hyperparameter search points; receiving, from the one or more worker nodes, a plurality of performance metrics that measure a performance of the plurality of machine learning models during the target number of epochs; and removing, by the controller node, one or more underperforming hyperparameter search points from the plurality of hyperparameter search points according to a pre-defined performance metric ranking criterion associated with the plurality of performance metrics.
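A hedged sketch of one pruning round, assuming higher metric values are better and that the ranking criterion keeps a fixed fraction of points (both assumptions); train_and_score is a hypothetical callback that trains one model for the target number of epochs and returns its performance metric:

    import random

    def prune_round(search_space, n_points, target_epochs,
                    train_and_score, keep_fraction=0.5):
        points = random.sample(search_space, n_points)
        # Train each candidate concurrently in practice; here, sequentially,
        # for the target number of epochs, collecting its metric.
        scored = [(train_and_score(p, target_epochs), p) for p in points]
        scored.sort(key=lambda r: r[0], reverse=True)    # best metric first
        n_keep = max(1, int(len(scored) * keep_fraction))
        return [p for _, p in scored[:n_keep]]           # survivors continue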
Abstract:
Tuned hyperparameter values are determined for training a machine learning model. When a selected hyperparameter configuration does not satisfy a linear constraint, a determination is made whether a projection of the selected hyperparameter configuration is included in a first cache that stores previously computed projections. When the projection is included in the first cache, the projection is extracted from the first cache using the selected hyperparameter configuration, and the selected hyperparameter configuration is replaced with the extracted projection in the plurality of hyperparameter configurations. When the projection is not included in the first cache, a projection computation for the selected hyperparameter configuration is assigned to a session. A computed projection is received from the session for the selected hyperparameter configuration. The computed projection and the selected hyperparameter configuration are stored to the first cache, and the selected hyperparameter configuration is replaced with the computed projection.
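A minimal sketch of the caching logic, with satisfies_constraint and compute_projection as hypothetical callbacks (the latter standing in for the projection computed by a session):

    def resolve_configuration(config, satisfies_constraint,
                              compute_projection, cache):
        if satisfies_constraint(config):   # linear constraint holds: keep as-is
            return config
        key = tuple(config)                # hashable key for the cache
        if key in cache:                   # projection previously computed
            return cache[key]
        projected = compute_projection(config)  # e.g. assigned to a session
        cache[key] = projected             # store for later reuse
        return projected

The cache avoids repeating the (potentially expensive) projection computation whenever the same infeasible configuration is selected again.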
Abstract:
A computing device determines an optimal number of threads for a computing task. Execution of the computing task is controlled in a computing environment based on each task configuration included in a plurality of task configurations to determine an execution runtime value for each task configuration. An optimal number of threads value is determined for each set of task configurations having common values for a task parameter value, a dataset indicator, and a hardware indicator. The optimal number of threads value is an extremum value of an execution parameter value as a function of a number of threads value. A dataset parameter value is determined for a dataset. A hardware parameter value is determined as a characteristic of each distinct executing computing device in the computing environment. The optimal number of threads value for each set of task configurations is stored in a performance dataset in association with the common values.
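A hedged sketch of the measurement loop, with run_task as a hypothetical callable that executes the computing task with a given thread count; the extremum is taken here as the minimum runtime (an assumption; it could equally be maximum throughput):

    import time

    def optimal_thread_count(run_task, thread_counts):
        runtimes = {}
        for n in thread_counts:
            start = time.perf_counter()
            run_task(n)                    # execute the task with n threads
            runtimes[n] = time.perf_counter() - start
        # Extremum of the execution parameter as a function of thread count.
        return min(runtimes, key=runtimes.get), runtimes

The resulting (task, dataset, hardware) -> optimal-threads mapping is what the abstract describes storing in the performance dataset.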
Abstract:
A computer solves a nonlinear optimization problem. An optimality check is performed for a current solution to an objective function that is a nonlinear equation with constraint functions on decision variables. When the performed optimality check indicates that the current solution is not an optimal solution, a barrier parameter value is updated, and a Lagrange multiplier value is updated for each constraint function based on a result of a complementarity slackness test. The current solution to the objective function is updated using a search direction vector determined by solving a primal-dual linear system that includes a dual variable for each constraint function and a step length value determined for each decision variable and for each dual variable. The operations are repeated until the optimality check indicates that the current solution is the optimal solution or a predefined number of iterations has been performed.
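For context, a conventional log-barrier formulation consistent with the abstract (the precise system used by the patent may differ) replaces the inequality constraints with a barrier term. For $\min_x f(x)$ subject to $c_i(x) \ge 0$, the barrier subproblem is

    \min_{x,s}\; f(x) - \mu \sum_i \log s_i \quad \text{subject to} \quad c(x) - s = 0,

where the barrier parameter $\mu$ is driven toward zero and each constraint's Lagrange multiplier $\lambda_i$ satisfies the perturbed complementarity slackness condition $s_i \lambda_i = \mu$. A Newton step on the resulting KKT conditions yields the primal-dual linear system whose solution $(\Delta x, \Delta s, \Delta \lambda)$ is the search direction, with per-variable step lengths chosen to keep $s$ and $\lambda$ strictly positive.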
Abstract:
A power method can be enhanced. For example, an electronic communication indicating a job to be performed can be received. A best rank-1 approximation of a matrix associated with the job can be determined using the power method. Each iteration of the power method can include determining a point that lies on a line passing through (i) a first value for a first singular vector from an immediately prior iteration of the power method; and (ii) a second value for the first singular vector from another prior iteration of the power method. Each iteration of the power method can also include determining, by performing the power method using the point, a current value for the first singular vector and a current value for a second singular vector for a current iteration of the power method. The job can then be performed using the best rank-1 approximation of the matrix.
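A sketch of an extrapolated power iteration in this spirit; the fixed mixing weight beta is an assumption (the method may choose the point on the line adaptively), and the random initialization is illustrative:

    import numpy as np

    def rank1_extrapolated(A, n_iter=200, beta=0.5, tol=1e-10):
        rng = np.random.default_rng(0)
        v = rng.normal(size=A.shape[1])
        v /= np.linalg.norm(v)
        v_prev = v.copy()
        for _ in range(n_iter):
            # Point on the line through the current iterate and a prior
            # iterate of the first singular vector.
            p = v + beta * (v - v_prev)
            p /= np.linalg.norm(p)
            u = A @ p                      # power-method step from the point
            u /= np.linalg.norm(u)
            v_new = A.T @ u                # update the second singular vector
            v_new /= np.linalg.norm(v_new)
            v_prev, v = v, v_new
            if np.linalg.norm(v - v_prev) < tol:
                break
        sigma = np.linalg.norm(A @ v)      # leading singular value
        u = (A @ v) / sigma
        return sigma, u, v                 # best rank-1: sigma * outer(u, v)

The extrapolation step reuses the direction of recent progress, which can reduce the number of iterations the plain power method needs when the top two singular values are close.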