-
Publication number: US12277144B2
Publication date: 2025-04-15
Application number: US18221684
Application date: 2023-07-13
Applicant: SAS INSTITUTE INC.
Inventor: Nancy Anne Rausch , Ruth Oluwadamilola Akintunde , Brant Nathan Kay
IPC: G06F40/284 , G06F16/242 , G06F16/28
Abstract: A computer-implemented system includes identifying a target hierarchical taxonomy comprising a plurality of distinct hierarchical taxonomy categories; extracting a plurality of distinct taxonomy tokens from the plurality of distinct hierarchical taxonomy categories; computing a taxonomy vector corpus based on the plurality of distinct taxonomy tokens; computing a plurality of distinct taxonomy clusters based on an input of the taxonomy vector corpus; constructing a hierarchical taxonomy classifier based on the plurality of distinct taxonomy clusters; converting a volume of unlabeled structured datasets to a plurality of distinct corpora of taxonomy-labeled structured datasets based on the hierarchical taxonomy classifier; and outputting at least one corpus of taxonomy-labeled structured datasets of the plurality of distinct corpora of taxonomy-labeled structured datasets based on an input of a data classification query.
-
Publication number: US11341414B2
Publication date: 2022-05-24
Application number: US17165226
Application date: 2021-02-02
Applicant: SAS Institute Inc.
Inventor: Nancy Anne Rausch , Roger Jay Barney , John P. Trawinski
Abstract: An apparatus includes processor(s) to: receive a request for a data catalog; in response to the request specifying a structural feature, analyze metadata of multiple data sets for an indication of including it, and to retrieve an indicated degree of certainty of detecting it for data sets including it; in response to the request specifying a contextual aspect, analyze context data of the multiple data sets for an indication of being subject to it, and to retrieve an indicated degree of certainty concerning it for data sets subject to it; selectively include each data set in the data catalog based on the request specifying a structural feature and/or a contextual aspect, and whether each data set meets what is specified; for each data set in the data catalog, generate a score indicative of the likelihood of meeting what is specified; and transmit the data catalog to the requesting device.
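The selection-and-scoring flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the data layout, the names (`build_catalog`, `email_column`, `privacy_regulated`), the sample certainties, and the choice of the minimum certainty as the score are all assumptions made for illustration.

```python
# Hypothetical data sets: each records detected structural features and
# contextual aspects, with a stored degree of certainty for each one.
datasets = [
    {"name": "customers",
     "structural": {"email_column": 0.9},
     "contextual": {"privacy_regulated": 0.8}},
    {"name": "sensors",
     "structural": {"timestamp_column": 0.95},
     "contextual": {}},
    {"name": "leads",
     "structural": {"email_column": 0.6},
     "contextual": {"privacy_regulated": 0.7}},
]


def build_catalog(datasets, structural=None, contextual=None):
    """Select data sets meeting the request and score each selected one."""
    catalog = []
    for ds in datasets:
        certainties = []
        if structural is not None:
            if structural not in ds["structural"]:
                continue  # requested structural feature not indicated
            certainties.append(ds["structural"][structural])
        if contextual is not None:
            if contextual not in ds["contextual"]:
                continue  # data set not subject to the requested aspect
            certainties.append(ds["contextual"][contextual])
        # Assumed scoring rule: the weakest certainty bounds the score.
        score = min(certainties) if certainties else 1.0
        catalog.append({"name": ds["name"], "score": score})
    # Highest-scoring data sets first.
    return sorted(catalog, key=lambda entry: entry["score"], reverse=True)


catalog = build_catalog(
    datasets, structural="email_column", contextual="privacy_regulated"
)
```

Here the request names both a structural feature and a contextual aspect, so only data sets with an entry for each are included in the catalog.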
-
Publication number: US20210263949A1
Publication date: 2021-08-26
Application number: US17173308
Application date: 2021-02-11
Applicant: SAS Institute Inc.
Inventor: James Allen Cox , Nancy Anne Rausch
Abstract: Computerized pipelines can transform input data into data structures compatible with models in some examples. In one such example, a system can obtain a first table that includes first data referencing a set of subjects. The system can then execute a sequence of processing operations on the first data in a particular order defined by a data-processing pipeline to modify an analysis table to include features associated with the set of subjects. Executing each respective processing operation in the sequence to generate the modified analysis table may involve: deriving a respective set of features from the first data by executing a respective feature-extraction operation on the first data; and adding the respective set of features to the analysis table. The system may then execute a predictive model on the modified analysis table for generating a predicted value based on the modified analysis table.
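As a rough illustration of the pipeline this abstract describes (not the patented implementation), the following Python sketch runs feature-extraction operations in a fixed order, adds each derived feature set to an analysis table keyed by subject, and then applies a stand-in predictive model. The sample table and every name here (`count_events`, `total_amount`, `run_pipeline`, `predict`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical "first table": rows that reference a set of subjects.
first_table = [
    {"subject": "a", "amount": 10.0},
    {"subject": "a", "amount": 5.0},
    {"subject": "b", "amount": 2.0},
]


def count_events(rows):
    """Feature-extraction operation: number of rows per subject."""
    counts = defaultdict(int)
    for row in rows:
        counts[row["subject"]] += 1
    return {subject: {"n_events": n} for subject, n in counts.items()}


def total_amount(rows):
    """Feature-extraction operation: summed amount per subject."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["subject"]] += row["amount"]
    return {subject: {"total_amount": t} for subject, t in totals.items()}


def run_pipeline(rows, operations):
    """Apply each operation in order, adding its features to the table."""
    analysis_table = {row["subject"]: {} for row in rows}
    for operation in operations:
        for subject, features in operation(rows).items():
            analysis_table[subject].update(features)
    return analysis_table


def predict(features):
    """Stand-in predictive model: mean amount per event."""
    return features["total_amount"] / features["n_events"]


analysis_table = run_pipeline(first_table, [count_events, total_amount])
predictions = {s: predict(f) for s, f in analysis_table.items()}
```

The order of the `operations` list plays the role of the order the data-processing pipeline defines. A later operation could overwrite a feature of the same name written by an earlier one.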
-
Publication number: US20210158171A1
Publication date: 2021-05-27
Application number: US17165226
Application date: 2021-02-02
Applicant: SAS Institute Inc.
Inventor: Nancy Anne Rausch , Roger Jay Barney , John P. Trawinski
Abstract: An apparatus includes processor(s) to: receive a request for a data catalog; in response to the request specifying a structural feature, analyze metadata of multiple data sets for an indication of including it, and to retrieve an indicated degree of certainty of detecting it for data sets including it; in response to the request specifying a contextual aspect, analyze context data of the multiple data sets for an indication of being subject to it, and to retrieve an indicated degree of certainty concerning it for data sets subject to it; selectively include each data set in the data catalog based on the request specifying a structural feature and/or a contextual aspect, and whether each data set meets what is specified; for each data set in the data catalog, generate a score indicative of the likelihood of meeting what is specified; and transmit the data catalog to the requesting device.
-
Publication number: US20170153914A1
Publication date: 2017-06-01
Application number: US15431573
Application date: 2017-02-13
Applicant: SAS Institute Inc.
Inventor: Nancy Anne Rausch , Ronald Agresta , Roger Jay Barney , Willem Abraham Hazejager
CPC classification number: G06F9/4843 , G06F9/4806 , G06F9/4881 , G06F17/30424 , G06F17/30445
Abstract: An apparatus may include a processor and storage to store instructions that cause the processor to perform operations including: generate a current data set model descriptive of a characteristic of a current data set; compare the current data set model to at least one previously generated data set model descriptive of a characteristic of a previously analyzed data set; in response to detection of a match within a similarity threshold: retrieve an indication from a correlation database of an action previously performed on a previously analyzed data set; select a computer language based on node data descriptive of characteristics of a node device execution environment; generate node instructions in the selected computer language and based on the current data set model to cause the node device to perform the previously performed action on a portion of the current data set; and transmit the node instructions to the node device.
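The model-matching step of this abstract can be sketched in Python under some loud assumptions: a data set model is reduced to a set of (column, type) pairs, similarity is Jaccard overlap, and the correlation database is an in-memory list. None of these choices comes from the patent, and the sketch omits language selection and node-instruction generation. All names (`build_model`, `find_previous_action`, `history`) are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def build_model(rows):
    """Assumed data set model: the set of (column, type name) pairs."""
    return {(key, type(value).__name__)
            for row in rows
            for key, value in row.items()}


def find_previous_action(current_model, history, threshold=0.5):
    """Return the action of the best match at or above the threshold."""
    best_action, best_score = None, threshold
    for model, action in history:
        score = jaccard(current_model, model)
        if score >= best_score:
            best_action, best_score = action, score
    return best_action


# Hypothetical stand-in for the correlation database:
# (previously generated model, action previously performed).
history = [
    ({("id", "int"), ("name", "str")}, "deduplicate"),
    ({("ts", "float")}, "resample"),
]

current = build_model([{"id": 1, "name": "x", "email": "e"}])
action = find_previous_action(current, history)
```

With these sample values the current model shares two of three distinct pairs with the first stored model, which clears the assumed threshold of 0.5, so that model's action is selected.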