Abstract:
A computing device selects new test configurations for testing software. (A) First test configurations are generated using a random seed value. (B) Software under test is executed with the first test configurations to generate a test result for each. (C) Second test configurations are generated from the first test configurations and the test results generated for each. (D) The software under test is executed with the second test configurations to generate the test result for each. (E) When a restart is triggered based on a distance metric value computed between the second test configurations, a next random seed value is selected as the random seed value and (A) through (E) are repeated. (F) When the restart is not triggered, (C) through (F) are repeated until a stop criterion is satisfied. (G) When the stop criterion is satisfied, the test result is output for each test configuration.
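Steps (A) through (G) can be sketched as a restart loop. This is a minimal illustration under stated assumptions, not the claimed method: `run_test` is a hypothetical stand-in for executing the software under test, the second configurations are produced by perturbing the better-scoring half of the previous ones, the distance metric is the mean pairwise Euclidean distance, and the stop criterion is a fixed iteration budget. None of these choices comes from the abstract.

```python
import math
import random

def run_test(config):
    # Hypothetical stand-in for executing the software under test.
    return sum((x - 0.5) ** 2 for x in config)

def generate_first(rng, n, dim):
    # (A) Random first test configurations drawn from the seeded generator.
    return [[rng.random() for _ in range(dim)] for _ in range(n)]

def generate_second(configs, results, rng, step=0.1):
    # (C) Assumed strategy: perturb configurations from the better-scoring half.
    order = sorted(range(len(configs)), key=lambda i: results[i])
    parents = [configs[i] for i in order[: max(1, len(order) // 2)]]
    children = []
    for _ in configs:
        parent = rng.choice(parents)
        children.append([min(1.0, max(0.0, x + rng.gauss(0.0, step))) for x in parent])
    return children

def mean_pairwise_distance(configs):
    # Assumed distance metric: mean Euclidean distance over all pairs.
    pairs = [(a, b) for i, a in enumerate(configs) for b in configs[i + 1:]]
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

def select_configs(seeds, n=8, dim=3, max_iters=20, restart_below=0.05):
    seed_iter = iter(seeds)
    seed = next(seed_iter)
    history, iters, restarts = [], 0, 0
    while True:
        rng = random.Random(seed)
        configs = generate_first(rng, n, dim)                    # (A)
        results = [run_test(c) for c in configs]                 # (B)
        history.extend(zip(configs, results))
        restarted = False
        while iters < max_iters:                                 # assumed stop criterion
            iters += 1
            configs = generate_second(configs, results, rng)     # (C)
            results = [run_test(c) for c in configs]             # (D)
            history.extend(zip(configs, results))
            if mean_pairwise_distance(configs) < restart_below:  # (E)
                next_seed = next(seed_iter, None)
                if next_seed is not None:  # no restart once seeds are exhausted
                    seed, restarts, restarted = next_seed, restarts + 1, True
                    break
        if not restarted:
            return history, restarts                             # (G)
```

Calling `select_configs([1, 2, 3])` returns every (configuration, result) pair that was executed, together with the number of restarts that occurred.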
Abstract:
A treatment model that is a first neural network is trained to optimize a treatment loss function based on a treatment variable $t$ using a plurality of observation vectors by regressing $t$ on $x^{(1)}$ and $z$. The trained treatment model is executed to compute an estimated treatment variable value $\hat{t}_i$ for each observation vector. An outcome model that is a second neural network is trained to optimize an outcome loss function by regressing $y$ on $x^{(2)}$ and an estimated treatment variable $t$. The trained outcome model is executed to compute an estimated first unknown function value $\hat{\alpha}(x_i^{(2)})$ and an estimated second unknown function value $\hat{\beta}(x_i^{(2)})$ for each observation vector. An influence function value is computed for a parameter of interest using $\hat{\alpha}(x_i^{(2)})$ and $\hat{\beta}(x_i^{(2)})$. A value is computed for the predefined parameter of interest using the computed influence function value.
Abstract:
In some examples, computing devices can partition timestamped data into groups. The computing devices can then distribute the timestamped data based on the groups. The computing devices can also obtain copies of a script configured to process the timestamped data, such that each computing device receives a copy of the script. The computing devices can determine one or more code segments associated with the groups based on content of the script. The one or more code segments can be in one or more programming languages that are different than a programming language of the script. The computing devices can then run the copies of the script to process the timestamped data within the groups. This may involve interacting with one or more job servers configured to run the one or more code segments associated with the groups.
Abstract:
A point estimate value for an individual is computed using a Bayesian neural network model (BNN) by training a first BNN model that computes a weight mean value, a weight standard deviation value, a bias mean value, and a bias standard deviation value for each neuron of a plurality of neurons using observations. A plurality of BNN models is instantiated using the first BNN model. Instantiating each BNN model of the plurality of BNN models includes computing, for each neuron, a weight value using the weight mean value, the weight standard deviation value, and a weight random draw and a bias value using the bias mean value, the bias standard deviation value, and a bias random draw. Each instantiated BNN model is executed with the observations to compute a statistical parameter value for each observation vector of the observations. The point estimate value is computed from the statistical parameter value.
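The instantiation step can be illustrated with a small sketch. It assumes a single layer of linear neurons whose outputs are summed, standard normal random draws, and the mean across instantiated models as the point estimate. The function names and the `params` layout are hypothetical, and the abstract does not state which statistical parameter is used.

```python
import random
import statistics

def instantiate(params, rng):
    # One concrete model: value = mean + standard deviation * random draw.
    model = []
    for neuron in params:
        weights = [m + s * rng.gauss(0.0, 1.0)
                   for m, s in zip(neuron["w_mean"], neuron["w_std"])]
        bias = neuron["b_mean"] + neuron["b_std"] * rng.gauss(0.0, 1.0)
        model.append((weights, bias))
    return model

def predict(model, x):
    # Assumed architecture: linear neurons whose outputs are summed.
    return sum(sum(w * xi for w, xi in zip(weights, x)) + bias
               for weights, bias in model)

def point_estimates(params, observations, n_models=200, seed=0):
    rng = random.Random(seed)
    models = [instantiate(params, rng) for _ in range(n_models)]
    # Assumed statistical parameter: the mean prediction across models.
    return [statistics.mean([predict(m, x) for m in models]) for x in observations]
```

With all standard deviations set to zero, every instantiated model is identical and the estimate reduces to the prediction of the mean weights and biases.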
Abstract:
A treatment model trained to compute an estimated treatment variable value for each observation vector of a plurality of observation vectors is executed. Each observation vector includes covariate variable values, a treatment variable value, and an outcome variable value. An outcome model trained to compute an estimated outcome value for each observation vector using the treatment variable value for each observation vector is executed. A standard error value associated with the outcome model is computed using a first variance value computed using the treatment variable value of the plurality of observation vectors, using a second variance value computed using the treatment variable value and the estimated treatment variable value of the plurality of observation vectors, and using a third variance value computed using the estimated outcome value of the plurality of observation vectors. The standard error value is output.
Abstract:
Timestamped data can be read in parallel by multiple grid-computing devices. The timestamped data, which can be partitioned into groups based on time series criteria, can be deterministically distributed across the multiple grid-computing devices based on the time series criteria. Each grid-computing device can sort and accumulate the timestamped data into a time series for each group it receives and then process the resultant time series based on a previously distributed script, which can be compiled at each grid-computing device, to generate output data. The grid-computing devices can write their output data in parallel. As a result, vast amounts of timestamped data can be easily analyzed across an easily expandable number of grid-computing devices with reduced computational expense.
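A minimal single-process sketch of the deterministic distribution and accumulation steps follows. It assumes records of the form `(timestamp, group_key, value)`, a CRC32 hash of the group key to choose a node, and summation into fixed-width time buckets. All of these are illustrative choices; the abstract does not specify the hashing or accumulation rule.

```python
import zlib
from collections import defaultdict

def assign_node(group_key, num_nodes):
    # A stable hash gives every reader the same answer for the same key,
    # unlike Python's built-in hash(), which is salted per process for strings.
    return zlib.crc32(group_key.encode("utf-8")) % num_nodes

def distribute(records, num_nodes):
    # Each node holds a mapping: group key -> list of (timestamp, value).
    nodes = [defaultdict(list) for _ in range(num_nodes)]
    for ts, key, value in records:
        nodes[assign_node(key, num_nodes)][key].append((ts, value))
    return nodes

def accumulate(node_groups, bucket):
    # Sort each group by timestamp, then sum values per time bucket.
    series = {}
    for key, rows in node_groups.items():
        totals = defaultdict(float)
        for ts, value in sorted(rows):
            totals[ts - ts % bucket] += value
        series[key] = sorted(totals.items())
    return series
```

Because the node is a pure function of the group key, all records of a group land on the same node regardless of which device read them.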
Abstract:
Machines can be controlled using advanced control systems. Such control systems may use an automated version of singular spectrum analysis to control a machine. For example, a control system can perform singular spectrum analysis on a time series by: generating a trajectory matrix from the time series, performing singular value decomposition on the trajectory matrix to determine elementary matrices and corresponding eigenvalues, and automatically categorizing the elementary matrices into groups. The elementary matrices can be automatically categorized into the groups by: generating a matrix of w-correlation values based on the eigenvalues, categorizing the w-correlation values into a predefined number of w-correlation sets, and forming the groups based on the predefined number of w-correlation sets. The control system can then determine component time-series based on the groups, and generate a predictive forecast using the component time-series. The control system can use the predictive forecast to control operation of the machine.
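Two building blocks of singular spectrum analysis can be sketched in plain Python: the trajectory matrix and the w-correlation between two reconstructed component series. The singular value decomposition and the automatic grouping are omitted here. In practice a routine such as `numpy.linalg.svd` could supply the elementary matrices. The function names and the window length parameter are illustrative assumptions.

```python
import math

def trajectory_matrix(series, window):
    # Row i holds series[i], series[i+1], ...; the result is window x K,
    # where K = len(series) - window + 1 (a Hankel matrix).
    k = len(series) - window + 1
    return [[series[i + j] for j in range(k)] for i in range(window)]

def diagonal_average(matrix):
    # Average each anti-diagonal to map a matrix back to a series.
    rows, cols = len(matrix), len(matrix[0])
    sums = [0.0] * (rows + cols - 1)
    counts = [0] * (rows + cols - 1)
    for i in range(rows):
        for j in range(cols):
            sums[i + j] += matrix[i][j]
            counts[i + j] += 1
    return [s / c for s, c in zip(sums, counts)]

def w_correlation(a, b, window):
    # Weight of element i = number of times it appears in the trajectory matrix.
    n = len(a)
    k = n - window + 1
    weights = [min(i + 1, window, k, n - i) for i in range(n)]

    def inner(u, v):
        return sum(w * x * y for w, x, y in zip(weights, u, v))

    denom = math.sqrt(inner(a, a) * inner(b, b))
    return inner(a, b) / denom if denom else 0.0
```

Component series with a w-correlation near zero are well separated, while values near 1 or -1 suggest the components belong in the same group.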
Abstract:
Systems and methods are provided for adjusting a set of predicted future data points for a time series data set. A receiver receives the time series data set. One or more processors and one or more non-transitory computer-readable storage media containing instructions may be utilized. A count series forecasting engine, utilizing the one or more processors, generates a set of counts corresponding to discrete values of the time series data set. An optimal discrete probability distribution for the set of counts is selected. A set of parameters is generated for the optimal discrete probability distribution. A statistical model is selected to generate a set of predicted future data points. The set of predicted future data points is adjusted using the generated set of parameters for the optimal discrete probability distribution in order to provide greater accuracy with respect to predictions of future data points.
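The counting and distribution-selection steps might look like the following sketch. It assumes only two candidate distributions (Poisson and a geometric distribution on 0, 1, 2, ...), maximum-likelihood parameter estimates, and log-likelihood as the selection rule. The abstract names none of these, and the adjustment of the forecasts is not shown because its rule is not specified.

```python
import math
from collections import Counter

def poisson_loglik(counts, lam):
    # log P(k) = k*log(lam) - lam - log(k!)
    return sum(c * (k * math.log(lam) - lam - math.lgamma(k + 1))
               for k, c in counts.items())

def geometric_loglik(counts, p):
    # Support 0, 1, 2, ...: P(k) = (1 - p)^k * p
    return sum(c * (k * math.log(1.0 - p) + math.log(p))
               for k, c in counts.items())

def select_distribution(series):
    counts = Counter(series)  # discrete value -> number of occurrences
    n = sum(counts.values())
    mean = sum(k * c for k, c in counts.items()) / n
    if mean <= 0:
        raise ValueError("series must contain at least one positive count")
    p = 1.0 / (1.0 + mean)  # geometric MLE on support 0, 1, 2, ...
    candidates = {
        "poisson": ({"lambda": mean}, poisson_loglik(counts, mean)),
        "geometric": ({"p": p}, geometric_loglik(counts, p)),
    }
    best = max(candidates, key=lambda name: candidates[name][1])
    return best, candidates[best][0], counts
```

A series clustered around its mean tends to favor the Poisson candidate, while a series dominated by zeros with occasional large values tends to favor the geometric one.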
Abstract:
Systems and methods for forecasting ratios in hierarchies are provided. Hierarchies can be formed that have components, including a numerator time series with values from input data, a denominator time series with values from input data, and a ratio time series of the numerator time series over the denominator time series. The components can be modeled to generate forecasted hierarchies. The forecasted hierarchies can be reconciled so that the forecasted hierarchies are statistically consistent throughout nodes of the forecasted hierarchies.
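One simple way to obtain consistent ratio forecasts is bottom-up reconciliation: sum the leaf numerator and denominator forecasts to get the parent, then compute every ratio from the reconciled components. The sketch below assumes a two-level hierarchy and that the leaf forecasts are already available. It illustrates the consistency requirement only and is not necessarily the reconciliation method the abstract refers to.

```python
def ratio(numerator, denominator):
    # Element-wise ratio; an undefined ratio is reported as NaN.
    return [n / d if d else float("nan") for n, d in zip(numerator, denominator)]

def reconcile_bottom_up(leaves):
    # leaves: name -> (numerator forecasts, denominator forecasts), equal horizons.
    horizon = len(next(iter(leaves.values()))[0])
    parent_num = [sum(num[t] for num, _ in leaves.values()) for t in range(horizon)]
    parent_den = [sum(den[t] for _, den in leaves.values()) for t in range(horizon)]
    return {
        "parent": {
            "numerator": parent_num,
            "denominator": parent_den,
            "ratio": ratio(parent_num, parent_den),
        },
        "leaves": {
            name: {"numerator": num, "denominator": den, "ratio": ratio(num, den)}
            for name, (num, den) in leaves.items()
        },
    }
```

Note that the parent ratio is the ratio of the sums, not the average of the leaf ratios, which is why the ratio series is derived after the numerator and denominator are reconciled.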