-
Publication number: US20190146849A1
Publication date: 2019-05-16
Application number: US16193661
Filing date: 2018-11-16
Applicant: SAS Institute Inc.
Inventor: Michael James Leonard , Thiago Santos Quirino , Edward Tilden Blair , Jennifer Leigh Sloan Beeman , David Bruce Elsheimer
CPC classification number: G06F9/5072 , G06F8/41 , G06F9/5016 , G06F9/5083 , H04L67/10
Abstract: Timestamped data can be read in parallel by multiple grid-computing devices. The timestamped data, which can be partitioned into groups based on time series criteria, can be deterministically distributed across the multiple grid-computing devices based on the time series criteria. Each grid-computing device can sort and accumulate the timestamped data into a time series for each group it receives and then process the resultant time series based on a previously distributed script, which can be compiled at each grid-computing device, to generate output data. The grid-computing devices can write their output data in parallel. As a result, vast amounts of timestamped data can be easily analyzed across an easily expandable number of grid-computing devices with reduced computational expense.
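The distribution and accumulation steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the hash-based node assignment, the fixed-width time buckets, the use of a sum as the accumulation rule, and all function names are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def assign_node(group_key, num_nodes):
    """Map a group key to a node index; the same key always maps to the same node."""
    digest = hashlib.sha256(group_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def distribute(records, num_nodes):
    """Route (group_key, timestamp, value) records to nodes by group key.

    Returns one dict per node, mapping group_key -> list of (timestamp, value).
    """
    nodes = [defaultdict(list) for _ in range(num_nodes)]
    for group_key, timestamp, value in records:
        nodes[assign_node(group_key, num_nodes)][group_key].append((timestamp, value))
    return nodes


def accumulate(group_rows, bucket_seconds):
    """Sort one group's rows by time and sum values per fixed-width time bucket.

    Summation is an assumed accumulation rule chosen for illustration.
    """
    series = {}
    for timestamp, value in sorted(group_rows):
        bucket = timestamp - timestamp % bucket_seconds
        series[bucket] = series.get(bucket, 0.0) + value
    return sorted(series.items())
```

Because the node index depends only on the group key, every record of a group lands on the same node, so each node can build its time series without further communication.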
-
Publication number: US20190129887A1
Publication date: 2019-05-02
Application number: US16233400
Filing date: 2018-12-27
Applicant: SAS Institute Inc.
Inventor: Brian Payton Bowman , Jeff Ira Cleveland, III
CPC classification number: G06F16/137 , G06F3/0604 , G06F3/061 , G06F3/064 , G06F3/0643 , G06F3/0644 , G06F3/067 , G06F9/5072 , G06F9/5077 , G06F12/0292 , G06F16/1827 , G06F16/22 , G06F16/278 , G06F21/602 , G06F2212/1016 , G06F2212/1056 , G06F2212/154 , G06F2212/262 , G06F2212/263 , H05K999/99
Abstract: An apparatus includes a processor component to receive a node device identifier defining an ordering among multiple node devices and among multiple blocks of data distributed among the multiple node devices, and transmit a size of a first subset of the multiple blocks stored within the node device to a control device. In response to receiving instructions to receive a second subset from another node device, perform operations including: receive and store the second subset; group the blocks of data of the first and second subsets into multiple segments in an order that corresponds to the ordering among the multiple blocks, wherein each segment is sized to fit minimum and maximum sizes for transmission to storage device(s); transmit the multiple segments to the storage device(s); and relay multiple segment identifiers from the storage device(s) to the control device in an order corresponding to the ordering among the multiple segments.
-
Publication number: US20190114302A1
Publication date: 2019-04-18
Application number: US16205424
Filing date: 2018-11-30
Applicant: SAS Institute Inc.
Inventor: Henry Gabriel Victor Bequet
IPC: G06F16/901 , H04L29/08 , G06F16/903
CPC classification number: G06F16/9014 , G06F16/90344 , H04L67/10
Abstract: An apparatus includes a processor to employ a neural network to interpret sketch input to identify an object token that represents a command to display either details of an object or a list of objects on a specified page of a GUI. In response to identifying the object token, the processor is caused to generate GUI instructions to perform the command, and employ the neural network to further interpret the sketch input to identify text specifying a page of the GUI on which to perform the command. In response to identifying the text specifying the page, the processor is caused to incorporate an indication of the page into the GUI instructions, augment a job flow definition with the GUI instructions, and store the job flow definition within a federated area in support of providing the GUI when the job flow of the job flow definition is performed.
-
Publication number: US20190102676A1
Publication date: 2019-04-04
Application number: US16127716
Filing date: 2018-09-11
Applicant: SAS Institute Inc.
Inventor: Mohammad Reza Nazari , Afshin Orooiloov Jadid , Mustafa Kabul
CPC classification number: G06N3/08 , G06F16/24568 , G06F17/18 , G06K9/00496 , G06K9/6256 , G06K9/627 , G06K2209/19 , G06N3/006 , G06N3/04 , G06N3/0472 , G06N3/049 , G06N20/00
Abstract: Exemplary embodiments can maximize long-term value in a machine learning system. The system may employ an offline training process and an online training process. In the offline training process, an initial policy is learned to provide a warm start to the online training process. In the online training process, the system applies concurrent reinforcement learning across multiple environments, with the goal of learning efficient policies in real time from in-flight user data in one environment, and applying the learned policies to other environments. With the combination of offline training and online training, the system is able to improve initial performance through the warm start, while adapting to a changing context through concurrent reinforcement learning.
-
95.
Publication number: US10248476B2
Publication date: 2019-04-02
Application number: US15986037
Filing date: 2018-05-22
Applicant: SAS Institute Inc.
Inventor: Douglas Allan Cairns
Abstract: Exemplary embodiments relate to the problem of determining measurements in a distributed computing environment in which observations relating to the measurements are distributed amongst two or more nodes. Each node, which stores a number of node-specific observations, makes available its observation count and a number of observation sketches. The observations are merged into an array, and the sketches from each node are combined into overall summary sketches representing a summary of the observations across all the nodes. The summary sketches may then be used to approximate the measurement. The described techniques allow for the computation of arbitrary measurements (i.e., measurements that are not predetermined and for whose calculation the environment is not preconfigured) in a grid computing environment with a technical advantage of having very few rounds of data communication (e.g., two or less) required between the nodes in the computing grid.
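The idea of combining per-node sketches into an overall summary can be illustrated with a deliberately simple stand-in. The abstract does not say what kind of sketch is used; the fixed-bin histogram below is a substitute chosen for clarity, and the value range, the bin count, and all function names are assumptions.

```python
def node_sketch(observations, lo, hi, bins):
    """Summarize one node's observations as (count, per-bin counts) over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in observations:
        # Clamp so values at or beyond the range edges fall in the end bins.
        idx = max(0, min(int((x - lo) / width), bins - 1))
        counts[idx] += 1
    return len(observations), counts


def merge_sketches(sketches):
    """Combine per-node sketches by adding counts bin by bin."""
    total = sum(n for n, _ in sketches)
    merged = [sum(column) for column in zip(*(c for _, c in sketches))]
    return total, merged


def approx_quantile(total, counts, lo, hi, q):
    """Approximate the q-quantile as the midpoint of the bin where it falls."""
    width = (hi - lo) / len(counts)
    target = q * total
    running = 0
    for i, c in enumerate(counts):
        running += c
        if running >= target:
            return lo + (i + 0.5) * width
    return hi
```

Each node sends only its count and its bin counts, so the merged summary is available after a single round of communication; the accuracy of the approximation is limited by the bin width.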
-
96.
Publication number: US20190080253A1
Publication date: 2019-03-14
Application number: US15928363
Filing date: 2018-03-22
Applicant: SAS Institute Inc.
Abstract: A computing device provides a cluster connectivity graph presented on a display to summarize machine learning model performance. A classification value is predicted for a response variable value of each observation vector using a trained model. Observation vectors are divided into overlapping data slices that are separately clustered using the predicted classification value to define a set of clusters. A number of observations in each cluster is computed. An accuracy measure is computed for each cluster based on the predicted classification value. A number of overlapping observations between each pair of clusters is computed. The cluster connectivity graph includes a node for each cluster. A size of each node is determined from the computed number of observations. A fill-pattern of each node is determined from the computed accuracy measure. A connector line between each pair of nodes is determined from the computed number of overlapping observations.
-
Publication number: US20190050446A1
Publication date: 2019-02-14
Application number: US16161339
Filing date: 2018-10-16
Applicant: SAS Institute Inc.
Inventor: Biruk Gebremariam
Abstract: A system provides analysis of distributed data and grouping of variables in support of analytics. Policy parameter values that define thresholds are received. A first computation of a cardinality value and of a number of observations having a non-missing value is requested for each variable of a plurality of variables included in the distributed data by each worker computing device. A number of observation vectors having the non-missing value and the cardinality value are computed by each worker computing device for each variable in response to the first computation request. Each respective worker computing device computes the number of observation vectors having the non-missing value and the cardinality value from a subset of the input dataset distributed to the respective worker computing device by reading each observation vector from the subset once. Each variable is assigned a category based on a comparison between computed values and the policy parameter values.
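The single-pass profiling and threshold-based categorization described in this abstract can be sketched as below. This is an illustration under stated assumptions: exact distinct-value sets stand in for whatever cardinality computation the system uses, missing values are represented as None, and the category names, threshold parameters, and function names are hypothetical.

```python
def profile_subset(rows, variables):
    """Read each row of a worker's subset once, tracking per-variable statistics."""
    stats = {v: {"n_nonmissing": 0, "distinct": set()} for v in variables}
    for row in rows:
        for v in variables:
            value = row.get(v)
            if value is not None:  # None marks a missing value (assumption)
                stats[v]["n_nonmissing"] += 1
                stats[v]["distinct"].add(value)
    return stats


def merge_profiles(profiles, variables):
    """Combine worker profiles into global non-missing counts and cardinalities."""
    merged = {}
    for v in variables:
        distinct = set().union(*(p[v]["distinct"] for p in profiles))
        merged[v] = {
            "n_nonmissing": sum(p[v]["n_nonmissing"] for p in profiles),
            "cardinality": len(distinct),
        }
    return merged


def categorize(merged, n_rows, max_missing_fraction, max_cardinality):
    """Assign each variable a category by comparing its statistics to thresholds."""
    categories = {}
    for name, s in merged.items():
        missing_fraction = 1.0 - s["n_nonmissing"] / n_rows
        if missing_fraction > max_missing_fraction:
            categories[name] = "mostly_missing"
        elif s["cardinality"] > max_cardinality:
            categories[name] = "high_cardinality"
        else:
            categories[name] = "usable"
    return categories
```

Distinct-value sets must be merged rather than their sizes added, because the same value can appear on more than one worker.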
-
Publication number: US20190034766A1
Publication date: 2019-01-31
Application number: US16108293
Filing date: 2018-08-22
Applicant: SAS Institute Inc.
Inventor: Xu Chen , Saratendu Sethi
Abstract: A computing device automatically classifies an observation vector. (a) A converged classification matrix is computed that defines a label probability for each observation vector. (b) The value of the target variable associated with a maximum label probability value is selected for each observation vector. Each observation vector is assigned to a cluster. A distance value is computed between observation vectors assigned to the same cluster. An average distance value is computed for each observation vector. A predefined number of observation vectors are selected that have minimum values for the average distance value. The supervised data is updated to include the selected observation vectors with the value of the target variable selected in (b). The selected observation vectors are removed from the unlabeled subset. (a) and (b) are repeated. The value of the target variable for each observation vector is output to a labeled dataset.
-
Publication number: US10192001B2
Publication date: 2019-01-29
Application number: US15725026
Filing date: 2017-10-04
Applicant: SAS Institute Inc. , North Carolina State University
Inventor: Samuel Paul Leeman-Munk , Saratendu Sethi , Christopher Graham Healey , Shaoliang Nie , Kalpesh Padia , Ravinder Devarajan , David James Caira , Jordan Riley Benson , James Allen Cox , Lawrence E. Lewis , Mustafa Onur Kabul
Abstract: Convolutional neural networks can be visualized. For example, a graphical user interface (GUI) can include a matrix of symbols indicating feature-map values that represent a likelihood of a particular feature being present or absent in an input to a convolutional neural network. The GUI can also include a node-link diagram representing a feed forward neural network that forms part of the convolutional neural network. The node-link diagram can include a first row of symbols representing an input layer to the feed forward neural network, a second row of symbols representing a hidden layer of the feed forward neural network, and a third row of symbols representing an output layer of the feed forward neural network. Lines between the rows of symbols can represent connections between nodes in the input layer, the hidden layer, and the output layer of the feed forward neural network.
-
100.
Publication number: US10169709B2
Publication date: 2019-01-01
Application number: US15788238
Filing date: 2017-10-19
Applicant: SAS Institute Inc.
Inventor: Kalyan Joshi , Nitzi Roehl , Yung-Hsin (Alex) Chien
IPC: G06N5/04
Abstract: Data sets for a three-stage predictor can be automatically determined. For example, multiple time series can be filtered to identify a subset of time series that have time durations that exceed a preset time duration. Whether a time series of the subset of time series includes a time period with inactivity can be determined. Whether the time series exhibits a repetitive characteristic can be determined based on whether the time series has a pattern that repeats over a predetermined time period. Whether the time series includes a magnitude spike with a value above a preset magnitude can be determined. If the time series (i) lacks the time period with inactivity, (ii) exhibits the repetitive characteristic, and (iii) has the magnitude spike with the value above the preset magnitude threshold, the time series can be included in a data set for use with the three-stage predictor.
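The selection criteria listed in this abstract can be expressed as a short filter. The sketch below is only one possible reading: treating inactivity as a run of zeros, testing repetition by comparing each value with the value one period earlier, and all parameter and function names are assumptions made for the example.

```python
def has_inactivity(series, run_length):
    """Return True if the series contains run_length or more consecutive zeros."""
    run = 0
    for value in series:
        run = run + 1 if value == 0 else 0
        if run >= run_length:
            return True
    return False


def is_repetitive(series, period, tolerance):
    """Return True if values one period apart differ by at most tolerance on average."""
    if len(series) <= period:
        return False
    diffs = [abs(series[i] - series[i - period]) for i in range(period, len(series))]
    return sum(diffs) / len(diffs) <= tolerance


def qualifies(series, min_length, inactivity_run, period, tolerance, spike_threshold):
    """Apply the four checks in order: duration, activity, repetition, spike."""
    if len(series) <= min_length:
        return False
    if has_inactivity(series, inactivity_run):
        return False
    if not is_repetitive(series, period, tolerance):
        return False
    return max(series) > spike_threshold
```

A series is kept only when it is long enough, has no inactive stretch, repeats over the chosen period, and contains at least one value above the spike threshold.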