Invention Grant
- Patent Title: Fast binary rule extraction for large scale text data
- Application No.: US13624052
- Application Date: 2012-09-21
- Publication No.: US08832015B2
- Publication Date: 2014-09-09
- Inventor: James Allen Cox, Zheng Zhao
- Applicant: SAS Institute Inc.
- Applicant Address: US NC Cary
- Assignee: SAS Institute Inc.
- Current Assignee: SAS Institute Inc.
- Current Assignee Address: US NC Cary
- Agency: Kilpatrick Townsend & Stockton LLP
- Main IPC: G06F17/00
- IPC: G06F17/00; G06N5/02

Abstract:
Systems and methods for identifying data files that have a common characteristic are provided. A plurality of data files including one or more data files having a common characteristic are received. A potential rule is generated by selecting key terms from a list that satisfy a term evaluation metric, and the potential rule is evaluated using a rule evaluation metric. The potential rule is added to the rule set if the rule evaluation metric is satisfied. Based upon the potential rule being added to the rule set, data files covered by the potential rule are removed from the plurality of data files. The potential rule generation and evaluation steps are repeated until a stopping criterion is met. After the stopping criterion has been met, the rule set is used to identify other data files having the common characteristic.
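The loop described in the abstract resembles a sequential covering procedure, and a minimal sketch of that reading follows. It is an illustration under stated assumptions, not the patented method: the abstract does not say which metrics are used, so precision stands in for both the term evaluation metric and the rule evaluation metric, and the names `extract_rules`, `matches`, `term_min`, `rule_min`, `max_terms`, and the toy `DATA` set are all hypothetical.

```python
from typing import FrozenSet, List, Tuple

Doc = Tuple[FrozenSet[str], bool]  # (terms in the file, has the common characteristic?)
Rule = FrozenSet[str]              # a conjunction: every term must be present


def precision(rule: Rule, data: List[Doc]) -> float:
    """Fraction of files covered by the rule that have the characteristic."""
    covered = [label for terms, label in data if rule <= terms]
    return sum(covered) / len(covered) if covered else 0.0


def positives_covered(rule: Rule, data: List[Doc]) -> int:
    """Number of files with the characteristic that the rule covers."""
    return sum(1 for terms, label in data if label and rule <= terms)


def extract_rules(data: List[Doc], term_min: float = 0.5,
                  rule_min: float = 0.9, max_terms: int = 3) -> List[Rule]:
    remaining = list(data)
    rules: List[Rule] = []
    # Assumed stopping criterion: no uncovered positive files remain,
    # or no acceptable rule can be built.
    while any(label for _, label in remaining):
        vocab = sorted({t for terms, label in remaining if label for t in terms})
        # Assumed term evaluation metric: single-term precision >= term_min.
        key_terms = [t for t in vocab
                     if precision(frozenset([t]), remaining) >= term_min]
        # Try the terms that cover the most positive files first.
        key_terms.sort(key=lambda t: -positives_covered(frozenset([t]), remaining))
        rule: Rule = frozenset()
        for term in key_terms:
            candidate = rule | {term}
            # Keep a term only if it starts the rule or raises its precision.
            if not rule or precision(candidate, remaining) > precision(rule, remaining):
                rule = candidate
            if len(rule) == max_terms or precision(rule, remaining) >= rule_min:
                break
        # Assumed rule evaluation metric: rule precision >= rule_min.
        if not rule or precision(rule, remaining) < rule_min:
            break
        rules.append(rule)
        # Remove the files covered by the newly added rule.
        remaining = [(terms, label) for terms, label in remaining
                     if not rule <= terms]
    return rules


def matches(rules: List[Rule], terms: FrozenSet[str]) -> bool:
    """A file is flagged if any rule in the set covers it."""
    return any(rule <= terms for rule in rules)


# Toy data: True marks files that share the characteristic (here, sports text).
DATA: List[Doc] = [
    (frozenset({"goal", "match", "team"}), True),
    (frozenset({"goal", "league", "score"}), True),
    (frozenset({"coach", "season", "team"}), True),
    (frozenset({"market", "stock", "team"}), False),
    (frozenset({"goal", "market", "price"}), False),
    (frozenset({"election", "vote"}), False),
]
RULES = extract_rules(DATA)
```

Once the loop stops, `matches(RULES, terms)` applies the rule set to a new file's terms, which corresponds to the final step of the abstract. Removing covered files after each accepted rule is what makes later rules focus on the positive files that earlier rules missed.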
Public/Granted literature
- US20140089247A1 Fast Binary Rule Extraction for Large Scale Text Data (Public/Granted day: 2014-03-27)