-
1.
Publication No.: US20190095400A1
Publication date: 2019-03-28
Application No.: US16030142
Filing date: 2018-07-09
Applicant: SAS Institute Inc.
Inventor: Hansi Jiang, Wenhao Hu, Haoyu Wang, Deovrat Vijay Kakde, Arin Chaudhuri
Abstract: A Gaussian similarity matrix is computed between observation vectors. An inverse Gaussian similarity matrix is computed from the Gaussian similarity matrix. A row sum vector is computed that includes a row sum value computed from each row of the inverse Gaussian similarity matrix. (a) A new observation vector is selected. (b) An acceptance value is computed for the new observation vector using the set of boundary support vectors, the row sum vector, and the new observation vector. (c) Steps (a) and (b) are repeated when the computed acceptance value is less than or equal to zero. (d) An incremental vector is computed from the inverse Gaussian similarity matrix and the new observation vector when the computed acceptance value is greater than zero. (e) The selected new observation vector is output as an outlier observation vector when a maximum value of the incremental vector is less than a first predefined tolerance value.
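The first three steps of this abstract (similarity matrix, its inverse, and the row sum vector) can be sketched with NumPy. This is only an illustration under stated assumptions: the kernel form exp(-d^2 / (2 s^2)), the `bandwidth` parameter, and all function names are hypothetical, since the abstract gives no formulas. The acceptance value and incremental vector are not attempted here because the abstract does not define them.

```python
import numpy as np

def gaussian_similarity(X, bandwidth):
    """Pairwise Gaussian similarity (assumed form: exp(-d^2 / (2 s^2)))."""
    sq_norms = np.sum(X * X, axis=1)
    # Squared Euclidean distances between all pairs of rows.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def inverse_and_row_sums(X, bandwidth):
    """Return the similarity matrix, its inverse, and the inverse's row sums."""
    K = gaussian_similarity(X, bandwidth)
    K_inv = np.linalg.inv(K)        # assumes K is well conditioned
    row_sums = K_inv.sum(axis=1)    # one value per row of the inverse
    return K, K_inv, row_sums

# Hypothetical toy data: three 2-D observation vectors.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K, K_inv, row_sums = inverse_and_row_sums(X, bandwidth=1.0)
```

For larger or nearly duplicated observations the matrix can become ill conditioned, so a solver-based approach may be preferable to an explicit inverse.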
-
Publication No.: US09830558B1
Publication date: 2017-11-28
Application No.: US15185277
Filing date: 2016-06-17
Applicant: SAS Institute Inc.
Inventor: Arin Chaudhuri, Deovrat Vijay Kakde, Maria Jahja, Wei Xiao, Seung Hyun Kong, Hansi Jiang, Sergiy Peredriy
CPC classification number: G06N99/005, G06F17/30539, H04L67/02
Abstract: A computing device determines an SVDD to identify an outlier in a dataset. First and second sets of observation vectors of a predefined sample size are randomly selected from a training dataset. First and second optimal values are computed using the first and second observation vectors to define a first set of support vectors and a second set of support vectors. A third optimal value is computed using the first set of support vectors updated to include the second set of support vectors to define a third set of support vectors. Whether or not a stop condition is satisfied is determined by comparing a computed value to a stop criterion. When the stop condition is not satisfied, the first set of support vectors is defined as the third set of support vectors, and operations are repeated until the stop condition is satisfied. The third set of support vectors is output.
-
Publication No.: US11036981B1
Publication date: 2021-06-15
Application No.: US17167633
Filing date: 2021-02-04
Applicant: SAS Institute Inc.
Inventor: Yuwei Liao, Anya Mary McGuirk, Byron Davis Biggs, Arin Chaudhuri, Allen Joseph Langlois, Vincent L. Deters
Abstract: A computing system determines if an event has occurred. A first window is defined that includes a subset of a plurality of observation vectors modeled as an output of an autoregressive causal system. A magnitude adjustment vector is computed from a mean computed for a matrix of magnitude values that includes a column for each window of a plurality of windows. The first window is stored in a next column of the matrix of magnitude values. Each cell of the matrix of magnitude values includes an estimated power spectrum value for a respective window and a respective frequency. A second matrix of magnitude values is updated using the magnitude adjustment vector. Each cell of the second matrix of magnitude values includes an adjusted power spectrum value for the respective window and the respective frequency. A peak is detected from the next column of the second matrix of magnitude values.
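A rough sketch of the windowed spectrum adjustment described in this abstract follows. Several pieces are assumptions and not taken from the patent: a plain FFT periodogram stands in for the autoregressive power spectrum estimate, the adjustment is applied by subtracting the per-frequency mean, and the peak is taken as the largest adjusted value. All function names are hypothetical.

```python
import numpy as np

def power_spectrum(window):
    """Periodogram estimate (assumption: replaces the AR-model spectrum)."""
    spectrum = np.fft.rfft(window)
    return (np.abs(spectrum) ** 2) / len(window)

def adjusted_spectra(windows):
    """Build the magnitude matrix (one column per window) and adjust it."""
    magnitudes = np.column_stack([power_spectrum(w) for w in windows])
    adjustment = magnitudes.mean(axis=1)         # mean per frequency row
    adjusted = magnitudes - adjustment[:, None]  # assumed: subtractive adjustment
    return magnitudes, adjustment, adjusted

def peak_bin(adjusted):
    """Frequency bin with the largest adjusted value in the newest column."""
    return int(np.argmax(adjusted[:, -1]))

# Hypothetical data: three baseline windows, then one with an extra tone.
t = np.arange(64)
baseline = np.sin(2 * np.pi * 4 * t / 64)
event = baseline + np.sin(2 * np.pi * 10 * t / 64)
windows = [baseline, baseline, baseline, event]
magnitudes, adjustment, adjusted = adjusted_spectra(windows)
peak = peak_bin(adjusted)
```

In this toy setup the persistent tone at bin 4 is cancelled by the mean adjustment, so the newly appearing tone at bin 10 stands out in the last column.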
-
Publication No.: US20190042977A1
Publication date: 2019-02-07
Application No.: US15887037
Filing date: 2018-02-02
Applicant: SAS Institute Inc.
Inventor: Arin Chaudhuri, Deovrat Vijay Kakde, Carol Wagih Sadek, Seung Hyun Kong, Laura Lucia Gonzalez
Abstract: A computing device employs machine learning and determines a bandwidth parameter value for a support vector data description (SVDD). A mean pairwise distance value is computed between observation vectors. A scaling factor value is computed based on a number of the plurality of observation vectors and a predefined tolerance value. A Gaussian bandwidth parameter value is computed using the computed mean pairwise distance value and the computed scaling factor value. An optimal value of an objective function is computed that includes a Gaussian kernel function that uses the computed Gaussian bandwidth parameter value. The objective function defines an SVDD model using the plurality of observation vectors to define a set of support vectors. The computed Gaussian bandwidth parameter value and the defined set of support vectors are output for determining if a new observation vector is an outlier.
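The bandwidth computation in this abstract can be illustrated with a short sketch. The abstract does not state the formulas, so the specifics below are assumptions: the mean is taken over squared pairwise distances, the scaling factor is 1 / ln((N - 1) / tol^2), and the bandwidth is the square root of their product. Function and parameter names are hypothetical, and the SVDD optimization itself is not shown.

```python
import numpy as np

def mean_squared_pairwise_distance(X):
    """Mean of squared Euclidean distances over all distinct pairs of rows."""
    n = X.shape[0]
    sq_norms = np.sum(X * X, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    upper = np.triu_indices(n, k=1)  # each unordered pair counted once
    return float(np.maximum(sq_dists[upper], 0.0).mean())

def gaussian_bandwidth(X, tol=1e-6):
    """Bandwidth from mean pairwise distance and a scaling factor (assumed form)."""
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two observation vectors")
    log_term = np.log((n - 1) / tol ** 2)
    if log_term <= 0:
        raise ValueError("tolerance too large for this sample size")
    scale = 1.0 / log_term  # assumed scaling factor from N and the tolerance
    return float(np.sqrt(mean_squared_pairwise_distance(X) * scale))

# Hypothetical toy data: squared pairwise distances are 4, 4, and 8.
X = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]])
s = gaussian_bandwidth(X, tol=1e-6)
```

The pairwise matrix costs O(N^2) memory, so for large training sets the mean would need to be computed from per-variable statistics or from a sample.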
-
Publication No.: US10157319B2
Publication date: 2018-12-18
Application No.: US15894002
Filing date: 2018-02-12
Applicant: SAS Institute Inc.
Inventor: Wei Xiao, Jorge Manuel Gomes da Silva, Saba Emrani, Arin Chaudhuri
Abstract: A computing device detects an abnormal observation vector using a principal components decomposition. The principal components decomposition includes a sparse noise vector $s_t$ computed for the observation vector that includes a plurality of values, wherein each value is associated with a variable to define a plurality of variables. The sparse noise vector $s_t$ has a dimension equal to $m$, a number of the plurality of variables. A zero counter time series value $\hat{c}_t$ is computed using $\hat{c}_t = \sum_{i=1}^{m} s_t[i]$. A probability value for $\hat{c}_t$ is computed using $p = \sum_{i=\hat{c}_t+1}^{m+1} H_c[i] \,/\, \sum_{i=0}^{m+1} H_c[i]$, where $H_c[i]$ includes a count of a number of times each value of $\hat{c}_t$ occurred for previous observation vectors. The probability value is compared with a predefined abnormal observation probability value. An abnormal observation indicator is set when the probability value indicates the observation vector is abnormal. The observation vector is output when the probability value indicates the observation vector is abnormal.
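The probability value in this abstract is a tail fraction of the count histogram: the counts at indices above the current counter value, divided by the counts at all indices from 0 to m + 1. A minimal sketch of that ratio follows. The function name, the array layout (index i holds the count for counter value i), and the example counts are hypothetical. The comparison against the predefined probability is left out because the abstract does not state its direction.

```python
import numpy as np

def tail_probability(hist, c_t):
    """Sum of hist[c_t + 1 : m + 2] divided by the sum of hist[0 : m + 2].

    hist has m + 2 entries (indices 0 .. m + 1); c_t is an integer index.
    """
    hist = np.asarray(hist, dtype=float)
    if not 0 <= c_t < len(hist):
        raise ValueError("counter value outside the histogram range")
    total = hist.sum()
    if total == 0:
        raise ValueError("histogram is empty; no previous observations")
    return float(hist[c_t + 1:].sum() / total)

# Hypothetical histogram for m = 3 variables (indices 0 .. 4).
hist = [4, 3, 2, 1, 0]
p = tail_probability(hist, c_t=1)  # (2 + 1 + 0) / 10
```

After each observation is scored, the histogram entry for its counter value would be incremented so that later probabilities reflect it.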
-
Publication No.: US20170323221A1
Publication date: 2017-11-09
Application No.: US15185277
Filing date: 2016-06-17
Applicant: SAS Institute Inc.
Inventor: Arin Chaudhuri, Deovrat Vijay Kakde, Maria Jahja, Wei Xiao, Seung Hyun Kong, Hansi Jiang, Sergiy Peredriy
CPC classification number: G06N99/005, G06F17/30539, H04L67/02
Abstract: A computing device determines an SVDD to identify an outlier in a dataset. First and second sets of observation vectors of a predefined sample size are randomly selected from a training dataset. First and second optimal values are computed using the first and second observation vectors to define a first set of support vectors and a second set of support vectors. A third optimal value is computed using the first set of support vectors updated to include the second set of support vectors to define a third set of support vectors. Whether or not a stop condition is satisfied is determined by comparing a computed value to a stop criterion. When the stop condition is not satisfied, the first set of support vectors is defined as the third set of support vectors, and operations are repeated until the stop condition is satisfied. The third set of support vectors is output.
-
Publication No.: US10984075B1
Publication date: 2021-04-20
Application No.: US17069293
Filing date: 2020-10-13
Applicant: SAS Institute Inc.
Inventor: Yu Liang, Arin Chaudhuri, Haoyu Wang
Abstract: A computer transforms high-dimensional data into low-dimensional data. A distance is computed between a selected observation vector and each observation vector of a plurality of observation vectors, nearest neighbors are selected using the computed distances, and a first sigmoid function is applied to compute a distance similarity value between the selected observation vector and each of the selected nearest neighbors, where each of the computed distance similarity values is added to a first matrix. The process is repeated with each observation vector of the plurality of observation vectors as the selected observation vector. An optimization method is executed with an initial matrix, the first matrix, and a gradient of a second sigmoid function that computes a second distance similarity value between the selected observation vector and each of the nearest neighbors to transform each observation vector of the plurality of observation vectors into the low-dimensional space.
-
8.
Publication No.: US10482353B2
Publication date: 2019-11-19
Application No.: US16055336
Filing date: 2018-08-06
Applicant: SAS Institute Inc.
Inventor: Yuwei Liao, Deovrat Vijay Kakde, Arin Chaudhuri, Hansi Jiang, Carol Wagih Sadek, Seung Hyun Kong
Abstract: A computing device determines a bandwidth parameter value for outlier detection or data classification. A mean pairwise distance value is computed between observation vectors. A tolerance value is computed based on a number of observation vectors. A scaling factor value is computed based on a number of observation vectors and the tolerance value. A Gaussian bandwidth parameter value is computed using the mean pairwise distance value and the scaling factor value. An optimal value of an objective function is computed that includes a Gaussian kernel function that uses the computed Gaussian bandwidth parameter value. The objective function defines a support vector data description model using the observation vectors to define a set of support vectors. The Gaussian bandwidth parameter value and the set of support vectors are output for determining if a new observation vector is an outlier or for classifying the new observation vector.
-
Publication No.: US10303954B2
Publication date: 2019-05-28
Application No.: US15893959
Filing date: 2018-02-12
Applicant: SAS Institute Inc.
Inventor: Wei Xiao, Jorge Manuel Gomes da Silva, Saba Emrani, Arin Chaudhuri
Abstract: A computing device updates an estimate of one or more principal components for a next observation vector. An initial observation matrix is defined with first observation vectors. A number of the first observation vectors is a predefined window length. Each observation vector of the first observation vectors includes a plurality of values. A principal components decomposition is computed using the initial observation matrix. The principal components decomposition includes a sparse noise vector s, a first singular value decomposition vector U, and a second singular value decomposition vector v for each observation vector of the first observation vectors. A rank r is determined based on the principal components decomposition. A next principal components decomposition is computed for a next observation vector using the determined rank r. The next principal components decomposition is output for the next observation vector and monitored to determine a status of a physical object.
-
10.
Publication No.: US20180239740A1
Publication date: 2018-08-23
Application No.: US15894002
Filing date: 2018-02-12
Applicant: SAS Institute Inc.
Inventor: Wei Xiao, Jorge Manuel Gomes da Silva, Saba Emrani, Arin Chaudhuri
CPC classification number: G06K9/00771, G06F9/30036, G06F17/16, G06F17/18, G06K9/481, G06K9/623, G06K9/6232, G06K9/6247, G06K9/6249, G06K2009/3291
Abstract: A computing device detects an abnormal observation vector using a principal components decomposition. The principal components decomposition includes a sparse noise vector $s_t$ computed for the observation vector that includes a plurality of values, wherein each value is associated with a variable to define a plurality of variables. The sparse noise vector $s_t$ has a dimension equal to $m$, a number of the plurality of variables. A zero counter time series value $\hat{c}_t$ is computed using $\hat{c}_t = \sum_{i=1}^{m} s_t[i]$. A probability value for $\hat{c}_t$ is computed using $p = \sum_{i=\hat{c}_t+1}^{m+1} H_c[i] \,/\, \sum_{i=0}^{m+1} H_c[i]$, where $H_c[i]$ includes a count of a number of times each value of $\hat{c}_t$ occurred for previous observation vectors. The probability value is compared with a predefined abnormal observation probability value. An abnormal observation indicator is set when the probability value indicates the observation vector is abnormal. The observation vector is output when the probability value indicates the observation vector is abnormal.