-
Publication No.: US11106486B2
Publication date: 2021-08-31
Application No.: US16952375
Filing date: 2020-11-19
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
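The distribute-then-combine idea in this abstract can be illustrated with a minimal Python sketch. It is not the patented apparatus: a thread pool stands in for the nodes of a distributed computing system, the "statistic" is simply a mean and sample variance, and every function name here (`simulate_data`, `split`, `partial_sums`, `distributed_mean_and_variance`) is hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def simulate_data(n_rows, seed):
    """Generate simulated data (standard normal draws) for a test."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_rows)]


def split(data, n_parts):
    """Split data into n_parts nearly equal contiguous portions."""
    size, extra = divmod(len(data), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def partial_sums(portion):
    """Task run on one 'node': count, sum, and sum of squares of its portion."""
    return len(portion), sum(portion), sum(x * x for x in portion)


def distributed_mean_and_variance(data, n_nodes=4):
    """Run the per-portion task on each worker, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        results = list(pool.map(partial_sums, split(data, n_nodes)))
    n = sum(r[0] for r in results)
    s = sum(r[1] for r in results)
    ss = sum(r[2] for r in results)
    mean = s / n
    variance = (ss - n * mean * mean) / (n - 1)  # sample variance
    return mean, variance
```

Each worker returns only small summaries of its portion, so the coordinating step never needs the raw simulated data in one place.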
-
Publication No.: US20180203720A1
Publication date: 2018-07-19
Application No.: US15724973
Filing date: 2017-10-04
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
CPC classification number: G06F9/46, G01D21/00, G06F9/455, G06F16/20, G06F17/17, G06F17/175, G06F17/18, G06F19/00, H04L67/10
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
-
Publication No.: US10325008B2
Publication date: 2019-06-18
Application No.: US15805774
Filing date: 2017-11-07
Applicant: SAS Institute Inc.
Inventor: Mahesh V. Joshi, Richard Potter, Jan Chvosta, Mark Roland Little
Abstract: Techniques for estimated compound probability distribution are described herein. Embodiments may include receiving a compound model specification comprising a frequency model and a severity model, the compound model specification including a model error comprising a frequency model error and a severity model error, and determining a number of frequency models and severity models to generate based on the received number of models to generate. Embodiments include generating a plurality of frequency models through perturbation of the frequency model according to the frequency model error, and generating a plurality of severity models through perturbation of the severity model according to the severity model error. Further, embodiments include dividing generation of a plurality of compound model samples among a plurality of distributed worker nodes, and receiving the plurality of compound model samples from the distributed worker nodes, and generating aggregate statistics from the plurality of compound model samples.
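As a rough illustration of the perturb-then-simulate idea, the following Python sketch draws several perturbed frequency and severity models and simulates compound (aggregate) samples from each. The distribution choices (Poisson frequency, lognormal severity, normally distributed parameter error) and all parameter values are assumptions made for the example; the abstract does not specify them, and the sketch runs on a single machine rather than on distributed worker nodes.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method (small lam only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def perturb(rng, estimate, std_error):
    """Perturb a parameter estimate by its standard error (normal assumption)."""
    return rng.gauss(estimate, std_error)


def compound_samples(rng, lam, mu, sigma, n_samples):
    """Each sample is the sum of a Poisson number of lognormal severities."""
    return [
        sum(rng.lognormvariate(mu, sigma) for _ in range(poisson(rng, lam)))
        for _ in range(n_samples)
    ]


def simulate(seed=0, n_models=5, n_samples=200):
    """Return the mean aggregate loss under each perturbed model."""
    rng = random.Random(seed)
    # Hypothetical fitted parameters and standard errors.
    lam_hat, lam_se = 3.0, 0.2
    mu_hat, mu_se = 1.0, 0.05
    sigma = 0.5
    means = []
    for _ in range(n_models):
        lam = max(perturb(rng, lam_hat, lam_se), 1e-6)  # keep the rate positive
        mu = perturb(rng, mu_hat, mu_se)
        losses = compound_samples(rng, lam, mu, sigma, n_samples)
        means.append(sum(losses) / len(losses))
    return means
```

The spread of the per-model means gives a simple view of how parameter uncertainty carries through to the aggregate statistic.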
-
Publication No.: US10095660B2
Publication date: 2018-10-09
Application No.: US14210361
Filing date: 2014-03-13
Applicant: SAS Institute Inc.
Inventor: Christian Macaro, Jan Chvosta, Mark Roland Little
Abstract: Various embodiments are generally directed to techniques for producing statistically correct and efficient combinations of multiple simulated posterior samples from MCMC and related Bayesian sampling schemes. One or more chains from a Bayesian posterior distribution of values may be generated. It may be determined whether the one or more chains have reached stationarity through parallel processing on a plurality of processing nodes. Based upon the determination, each of the one or more chains that have reached stationarity through parallel processing on the plurality of processing nodes may be sorted. The one or more sorted chains may be resampled through parallel processing on the plurality of processing nodes. The one or more resampled chains may be combined. Other embodiments are described and claimed.
-
Publication No.: US09798575B2
Publication date: 2017-10-24
Application No.: US14270783
Filing date: 2014-05-06
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
CPC classification number: G06F9/46, G01D21/00, G06F9/455, G06F17/17, G06F17/175, G06F17/18, G06F17/30286, G06F19/00, H04L67/10
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
-
Publication No.: US09672193B2
Publication date: 2017-06-06
Application No.: US14217707
Filing date: 2014-03-18
Applicant: SAS Institute Inc.
Inventor: Christian Macaro, Jan Chvosta, Mark Roland Little
Abstract: Various embodiments are directed to techniques for selecting a subset of a set of simulated samples. A computer-program product including instructions to cause a computing device to order a plurality of UPDFs by UPDF value, wherein the plurality of UPDFs is associated with a chain of draws of a set of simulated samples, wherein each draw comprises multiple parameters and the UPDF values map to parameter values of the parameters; select a subset of the plurality of UPDFs based on the subset of the plurality of UPDFs having UPDF values within a range corresponding to a range of parameter values to include in a subset of the set of simulated samples; and transmit an indication of a draw comprising parameters having parameter values to include in the subset of the set of simulated samples, wherein the indication identifies the draw by associated UPDF. Other embodiments are described and claimed.
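The ordering and range-selection step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each draw is represented by a single numeric UPDF value, draws are identified by their position in the input list, and the function name `select_draws_by_updf` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def select_draws_by_updf(updf_values, low, high):
    """Return indices of draws whose UPDF value lies in [low, high].

    The indices are returned in ascending order of UPDF value.
    """
    # Order the draws by UPDF value, remembering each draw's original index.
    order = sorted(range(len(updf_values)), key=updf_values.__getitem__)
    sorted_vals = [updf_values[i] for i in order]
    # Once sorted, the selected range is a contiguous slice.
    start = bisect_left(sorted_vals, low)
    stop = bisect_right(sorted_vals, high)
    return order[start:stop]
```

Sorting first means the subset is found with two binary searches instead of a scan per query, which matters if many ranges are requested against the same chain.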
-
Publication No.: US11010451B2
Publication date: 2021-05-18
Application No.: US14210259
Filing date: 2014-03-13
Applicant: SAS Institute Inc.
Inventor: Christian Macaro, Jan Chvosta, Mark Roland Little
Abstract: Techniques for automated Bayesian posterior sampling using Markov Chain Monte Carlo and related schemes are described. In an embodiment, one or more values in a stationarity phase for a system configured for Bayesian sampling may be initialized. Sampling may be performed in the stationarity phase based upon the one or more values to generate a plurality of samples. The plurality of samples may be evaluated based upon one or more stationarity criteria. The stationarity phase may be exited when the plurality of samples meets the one or more stationarity criteria. Other embodiments are described and claimed.
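A stationarity phase of this kind (initialize, sample, evaluate, exit when the criteria are met) can be sketched as a simple loop. The sampler below is a plain random-walk Metropolis chain targeting a standard normal density, and the criterion is a crude comparison of the means of the two halves of a batch; both are stand-ins chosen for the example, not the criteria used in the patent, and all names are hypothetical.

```python
import math
import random


def metropolis_batch(rng, x0, n, step=1.0):
    """Random-walk Metropolis draws targeting a standard normal density."""
    x, out = x0, []
    for _ in range(n):
        proposal = x + rng.gauss(0.0, step)
        # Log acceptance ratio for N(0, 1): (x^2 - proposal^2) / 2.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        out.append(x)
    return out


def looks_stationary(samples, tol=0.3):
    """Crude criterion: the means of the two halves of the batch agree."""
    half = len(samples) // 2
    first = sum(samples[:half]) / half
    second = sum(samples[half:]) / (len(samples) - half)
    return abs(first - second) < tol


def run_stationarity_phase(seed=0, x0=50.0, batch_size=2000, max_batches=50):
    """Sample in batches until the criterion is met; return (batches used, last batch)."""
    rng = random.Random(seed)
    x = x0  # deliberately poor starting value
    for batch in range(1, max_batches + 1):
        samples = metropolis_batch(rng, x, batch_size)
        if looks_stationary(samples):
            return batch, samples
        x = samples[-1]  # continue the chain from where it stopped
    raise RuntimeError("stationarity criterion not met")
```

Because the chain starts far from the target, the first batch contains a visible drift and fails the check; later batches, started from the end of the previous one, are the ones that can pass.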
-
Publication No.: US10963292B2
Publication date: 2021-03-30
Application No.: US16835854
Filing date: 2020-03-31
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
-
Publication No.: US10146741B2
Publication date: 2018-12-04
Application No.: US14217858
Filing date: 2014-03-18
Applicant: SAS Institute Inc.
Inventor: Christian Macaro, Jan Chvosta, Mark Roland Little
Abstract: Various embodiments are directed to techniques for deriving a sample representation from a random sample. A computer-program product includes instructions to cause a first computing device to fit an empirical distribution function to a marginal probability distribution of a variable within a first sample portion of a random sample to derive a partial marginal probability distribution approximation, wherein the random sample is divided into multiple sample portions distributed among multiple computing devices; fit a first portion of a copula function to a multivariate probability distribution of the first sample portion, wherein the copula function is divided into multiple portions; and transmit an indication of a first likelihood contribution of the first sample portion to a coordinating device to cause a second computing device to fit a second portion of the copula function to a multivariate probability distribution of a second sample portion. Other embodiments are described and claimed.
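The marginal-fitting part of this abstract can be illustrated with empirical distribution functions computed per sample portion and then merged. The Python sketch below covers only that step (the copula fit and the likelihood exchange with a coordinating device are omitted), and the function names are hypothetical.

```python
from bisect import bisect_right


def fit_ecdf(portion):
    """Fit a partial ECDF to one sample portion: (sorted values, portion size)."""
    return sorted(portion), len(portion)


def eval_ecdf(ecdf, x):
    """Evaluate a partial ECDF at x: the fraction of its values <= x."""
    values, n = ecdf
    return bisect_right(values, x) / n


def combine_ecdfs(ecdfs, x):
    """Evaluate the full-sample ECDF at x from the partial ECDFs.

    Summing the per-portion counts and dividing by the total size is the
    size-weighted average of the partial ECDFs.
    """
    total = sum(n for _, n in ecdfs)
    return sum(bisect_right(values, x) for values, _ in ecdfs) / total
```

For example, with portions `[1, 4, 2]` and `[3, 5]`, combining the two partial fits at `x = 3` counts two values from the first portion and one from the second, the same count the undivided sample would give.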
-
Publication No.: US09665669B2
Publication date: 2017-05-30
Application No.: US15197691
Filing date: 2016-06-29
Applicant: SAS Institute Inc.
Inventor: Mahesh V. Joshi, Richard Potter, Jan Chvosta, Mark Roland Little
CPC classification number: G06F17/18, G06F17/5009, G06F2217/10, G06Q40/08
Abstract: Techniques for estimated compound probability distribution are described. An apparatus comprises a configuration component, a perturbation component, a sample generation controller, an aggregation component, a distribution fitting component, and a statistics generation component. The configuration component is operative to receive a compound model specification and a candidate distribution definition. The perturbation component is operative to generate a plurality of models from the compound model specification. The sample generation controller is operative to initiate the generation of a plurality of compound model samples from each of the plurality of models. The distribution fitting component is operative to generate parameter values for the candidate distribution definition based on the compound model samples. The statistics generation component is operative to generate approximated aggregate statistics.