-
Publication No.: US10963292B2
Publication date: 2021-03-30
Application No.: US16835854
Filing date: 2020-03-31
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
-
Publication No.: US20170024242A9
Publication date: 2017-01-26
Application No.: US14270783
Filing date: 2014-05-06
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
CPC classification number: G06F9/46, G01D21/00, G06F9/455, G06F17/17, G06F17/175, G06F17/18, G06F17/30286, G06F19/00, H04L67/10
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
-
Publication No.: US20250053615A1
Publication date: 2025-02-13
Application No.: US18905480
Filing date: 2024-10-03
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Tao Huang, Jan Chvosta
IPC: G06F17/18
Abstract: A computing device learns a directed acyclic graph (DAG). (A) A target variable is defined from variables based on a topological order vector and a first index. (B) Input variables are defined from the variables based on the topological order vector and a second index. (C) A machine learning model is trained with observation vectors using the target variable and the input variables. (D) The machine learning model is executed to compute a loss value. (E) The second index is incremented. (F) (B) through (E) are repeated a first plurality of times. (G) The first index is incremented. (H) (A) through (G) are repeated a second plurality of times. A parent set is determined for each variable based on a comparison between the loss value computed each repetition of (D). The parent set is output for each variable to describe the DAG that defines a hierarchical relationship between the variables.
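The nested loop in steps (A) through (H) can be illustrated with a minimal sketch. The abstract does not say how the input variables are chosen from the second index or how the loss values are compared, so the greedy rule below (keep a predecessor as a parent only if adding it lowers the loss by more than `tol`) is an assumption, and `learn_parents`, `toy_loss`, and `tol` are hypothetical names. The `loss` callable stands in for "train a model, then execute it to compute a loss value".

```python
def learn_parents(order, loss, tol=0):
    """Pick a parent set for each variable, given a topological order.

    `loss(target, inputs)` is a stand-in for training and scoring a model.
    """
    parents = {}
    for i, target in enumerate(order):            # first index: target position
        chosen = []
        best = loss(target, frozenset())          # baseline with no inputs
        for j in range(i):                        # second index: candidate predecessor
            trial = chosen + [order[j]]
            value = loss(target, frozenset(trial))
            if best - value > tol:                # assumed comparison rule
                chosen, best = trial, value
        parents[target] = chosen
    return parents


def toy_loss(target, inputs):
    """Hypothetical losses for a chain a -> b -> c (not real model output)."""
    true_parent = {"b": "a", "c": "b"}.get(target)
    return 2 if true_parent in inputs else 10


print(learn_parents(("a", "b", "c"), toy_loss))
```

With the toy losses, only the true parent lowers each variable's loss, so the sketch recovers the chain from the order alone.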
-
Publication No.: US11120032B1
Publication date: 2021-09-14
Application No.: US17209752
Filing date: 2021-03-23
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Xunlei Wu, Jan Chvosta
IPC: G06F16/2458, G06F16/2453
Abstract: Computing resources consumed in performing computerized sequence-mining can be reduced by implementing some examples of the present disclosure. In one example, a system can determine weights for data entries in a data set and then select a group of data entries from the data set based on the weights. Next, the system can determine a group of k-length sequences present in the selected group of data entries by applying a shuffling algorithm. The system can then determine frequencies corresponding to the group of k-length sequences and select candidate sequences from among the group of k-length sequences based on the frequencies thereof. Next, the system can determine support values corresponding to the candidate sequences and then select output sequences from among the candidate sequences based on the support values thereof. The system may then transmit an output signal indicating the selected output sequences to an electronic device.
-
Publication No.: US11106486B2
Publication date: 2021-08-31
Application No.: US16952375
Filing date: 2020-11-19
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
-
Publication No.: US20180203720A1
Publication date: 2018-07-19
Application No.: US15724973
Filing date: 2017-10-04
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Mark Roland Little
CPC classification number: G06F9/46, G01D21/00, G06F9/455, G06F16/20, G06F17/17, G06F17/175, G06F17/18, G06F19/00, H04L67/10
Abstract: Techniques to manage virtual classes for statistical tests are described. An apparatus may comprise a simulated data component to generate simulated data for a statistical test, statistics of the statistical test based on parameter vectors to follow a probability distribution, a statistic simulator component to simulate statistics for the parameter vectors from the simulated data with a distributed computing system comprising multiple nodes each having one or more processors capable of executing multiple threads, the simulation to occur by distribution of portions of the simulated data across the multiple nodes of the distributed computing system, and a distributed control engine to control task execution on the distributed portions of the simulated data on each node of the distributed computing system with a virtual software class arranged to coordinate task and sub-task operations across the nodes of the distributed computing system. Other embodiments are described and claimed.
-
Publication No.: US20250045263A1
Publication date: 2025-02-06
Application No.: US18538066
Filing date: 2023-12-13
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Tao Huang, Jan Chvosta
Abstract: A computing device learns a best topological order vector for a plurality of variables. (A) A topological order vector is defined. (B) A target variable and zero or more input variables are defined based on the topological order vector. (C) A machine learning model is trained with observation vectors using values of the target variable and the zero or more input variables. (D) The machine learning model is executed with second observation vectors using the values of the target variable and the zero or more input variables to compute a loss value. (E) (A) through (D) are repeated a plurality of times. Each topological order vector defined in (A) is unique in comparison to other topological order vectors defined in (A). The best topological order vector is determined based on a comparison between the loss values computed for each topological order vector in (D).
-
Publication No.: US20240346289A1
Publication date: 2024-10-17
Application No.: US18530798
Filing date: 2023-12-06
Applicant: SAS Institute Inc.
Inventor: Sylvie Tchumtchoua Kabisa, Xilong Chen, Gunce Eryuruk Walton, David Bruce Elsheimer, Ming-Chun Chang
Abstract: A point estimate value for an individual is computed using a Bayesian neural network model (BNN) by training a first BNN model that computes a weight mean value, a weight standard deviation value, a bias mean value, and a bias standard deviation value for each neuron of a plurality of neurons using observations. A plurality of BNN models is instantiated using the first BNN model. Instantiating each BNN model of the plurality of BNN models includes computing, for each neuron, a weight value using the weight mean value, the weight standard deviation value, and a weight random draw and a bias value using the bias mean value, the bias standard deviation value, and a bias random draw. Each instantiated BNN model is executed with the observations to compute a statistical parameter value for each observation vector of the observations. The point estimate value is computed from the statistical parameter value.
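The instantiation step can be sketched as follows. The abstract does not give the formula that combines a mean, a standard deviation, and a random draw, so the common form `mean + std * draw` with a standard normal draw is an assumption here. The single layer of scalar-input linear neurons, the averaging across models, and the names `instantiate`, `run`, and `point_estimates` are likewise illustrative only.

```python
import random
from statistics import fmean


def instantiate(neurons, rng):
    """Draw one concrete model from per-neuron (w_mean, w_std, b_mean, b_std)."""
    return [
        (w_mean + w_std * rng.gauss(0.0, 1.0),   # weight value from its random draw
         b_mean + b_std * rng.gauss(0.0, 1.0))   # bias value from its random draw
        for (w_mean, w_std, b_mean, b_std) in neurons
    ]


def run(model, x):
    """Toy forward pass: sum of linear neurons on a scalar observation."""
    return sum(w * x + b for w, b in model)


def point_estimates(neurons, observations, n_models=100, seed=0):
    """Average each observation's output over the instantiated models."""
    rng = random.Random(seed)
    models = [instantiate(neurons, rng) for _ in range(n_models)]
    return [fmean([run(m, x) for m in models]) for x in observations]


# Zero standard deviations make every instantiated model identical.
print(point_estimates([(2.0, 0.0, 1.0, 0.0)], [1.0, 3.0], n_models=5))
```

Setting the standard deviations to zero collapses every draw to the mean, which makes a convenient sanity check before using non-zero values.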
-
Publication No.: US20240346284A1
Publication date: 2024-10-17
Application No.: US18529014
Filing date: 2023-12-05
Applicant: SAS Institute Inc.
Inventor: Sylvie Tchumtchoua Kabisa, Xilong Chen, Gunce Eryuruk Walton, David Bruce Elsheimer, Ming-Chun Chang
Abstract: A treatment model trained to compute an estimated treatment variable value for each observation vector of a plurality of observation vectors is executed. Each observation vector includes covariate variable values, a treatment variable value, and an outcome variable value. An outcome model trained to compute an estimated outcome value for each observation vector using the treatment variable value for each observation vector is executed. A standard error value associated with the outcome model is computed using a first variance value computed using the treatment variable value of the plurality of observation vectors, using a second variance value computed using the treatment variable value and the estimated treatment variable value of the plurality of observation vectors, and using a third variance value computed using the estimated outcome value of the plurality of observation vectors. The standard error value is output.
-
Publication No.: US12056207B1
Publication date: 2024-08-06
Application No.: US18538070
Filing date: 2023-12-13
Applicant: SAS Institute Inc.
Inventor: Xilong Chen, Tao Huang, Jan Chvosta
IPC: G06F17/18
CPC classification number: G06F17/18
Abstract: A computing device learns a best topological order vector of a plurality of variables. A target variable and zero or more input variables are defined. (A) A machine learning model is trained with observation vectors using the target variable and the zero or more input variables. (B) The machine learning model is executed to compute an equation loss value. (C) The equation loss value is stored with the identifier. (D) The identifier is incremented. (E) (A) through (D) are repeated a plurality of times. (F) A topological order vector is defined. (G) A loss value is computed from a subset of the stored equation loss values based on the topological order vector. (F) through (G) are repeated for each unique permutation of the topological order vector. A best topological order vector is determined based on a comparison between the loss value computed for each topological order vector in (G).
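The two phases described above (store one equation loss per target and input set, then score every permutation from the stored values) can be sketched as below. This is an illustration under assumptions, not the patented method: a dictionary key replaces the incrementing identifier, the order loss is taken to be the plain sum of the stored equation losses, and `learn_best_order` and `toy_loss` are hypothetical names. `equation_loss` stands in for training and executing a model.

```python
from itertools import combinations, permutations


def learn_best_order(variables, equation_loss):
    """Return (best order, its loss, number of stored equation losses)."""
    # Phase 1: compute and store each equation loss exactly once.
    stored = {}
    for target in variables:
        others = [v for v in variables if v != target]
        for size in range(len(others) + 1):
            for inputs in combinations(others, size):
                key = (target, frozenset(inputs))
                stored[key] = equation_loss(target, frozenset(inputs))

    # Phase 2: score each permutation using only the stored values.
    best_order, best_loss = None, None
    for order in permutations(variables):
        total = sum(stored[(v, frozenset(order[:i]))] for i, v in enumerate(order))
        if best_loss is None or total < best_loss:
            best_order, best_loss = order, total
    return best_order, best_loss, len(stored)


def toy_loss(target, inputs):
    """Hypothetical losses for a chain a -> b -> c (not real model output)."""
    true_parent = {"b": "a", "c": "b"}.get(target)
    return 2 if true_parent in inputs else 10


print(learn_best_order(["a", "b", "c"], toy_loss))
```

For three variables the sketch stores 12 equation losses and reuses them across all 6 permutations, instead of training 18 models (one per position per permutation).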