Invention Grant
- Patent Title: Automatic refinement of information extraction rules
- Application No.: US12788407
- Application Date: 2010-05-27
- Publication No.: US08417709B2
- Publication Date: 2013-04-09
- Inventor: Laura Chiticariu, Bin Liu, Frederick R. Reiss
- Applicant: Laura Chiticariu, Bin Liu, Frederick R. Reiss
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Schmeiser, Olsen & Watts
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
A method and system for automatically refining information extraction (IE) rules. A provenance graph for IE rules on a set of test documents is determined. The provenance graph indicates a sequence of evaluations of the IE rules that generates an output of each operator of the IE rules. Based on the provenance graph, high-level rule changes (HLCs) of the IE rules are determined. Low-level rule changes (LLCs) of the IE rules are determined to specify how to implement the HLCs. Each LLC specifies changing an operator's structure or inserting a new operator in between two operators. Based on how the LLCs affect the IE rules and previously received correct results of applying the rules on the test documents, a ranked list of the LLCs is determined. The IE rules are refined based on the ranked list.
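The ranking step described in the abstract can be illustrated with a small sketch. It is not the patented method: it omits the provenance graph and the derivation of HLCs, and it assumes that each LLC can be summarized by the set of outputs it would remove and that candidates are scored by the gain in F1 against the labeled correct results. The names `LowLevelChange`, `f1`, and `rank_changes`, and the sample data, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowLevelChange:
    """Hypothetical stand-in for an LLC: a description plus the outputs it would remove."""
    description: str
    removed: frozenset


def f1(outputs, correct):
    """F1 score of a set of extracted outputs against the labeled correct set."""
    true_pos = len(outputs & correct)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(outputs)
    recall = true_pos / len(correct)
    return 2 * precision * recall / (precision + recall)


def rank_changes(outputs, correct, changes):
    """Return (gain, change) pairs sorted by F1 gain, largest first (assumed scoring)."""
    base = f1(outputs, correct)
    scored = [(f1(outputs - c.removed, correct) - base, c) for c in changes]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy person-name extractor output on test documents, with two false positives.
outputs = frozenset({"Anna Smith", "Bob Lee", "Morgan Stanley", "New York"})
correct = frozenset({"Anna Smith", "Bob Lee"})

changes = [
    LowLevelChange("filter organization dictionary matches",
                   frozenset({"Morgan Stanley"})),
    LowLevelChange("filter organization and location dictionary matches",
                   frozenset({"Morgan Stanley", "New York"})),
    LowLevelChange("require a three-token match",
                   frozenset({"Anna Smith", "Bob Lee", "New York"})),
]

ranked = rank_changes(outputs, correct, changes)
for gain, change in ranked:
    print(f"{gain:+.3f}  {change.description}")
```

In this toy setup, the change that removes both false positives ranks first, and the change that discards the correct results ranks last with a negative gain. A rule developer would then review the ranked list and choose which changes to apply.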
Public/Granted literature
- US20110295854A1: AUTOMATIC REFINEMENT OF INFORMATION EXTRACTION RULES (Public/Granted day: 2011-12-01)