Invention Grant
- Patent Title: Framework for data extraction by examples
- Patent Title (中): 通过示例提取数据框架
- Application No.: US14636664
- Application Date: 2015-03-03
- Publication No.: US09542622B2
- Publication Date: 2017-01-10
- Inventor: Sumit Gulwani, Vu Minh Le
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agent: Alin Corie; Sandy Swain; Micky Minhas
- Main IPC: G06K9/00
- IPC: G06K9/00; G06K9/62; G06F17/30; G06F17/24

Abstract:
Various technologies described herein pertain to controlling automated programming for extracting data from an input document. Examples indicative of the data to extract from the input document can be received. The examples can include highlighted regions on the input document. Moreover, the input document can be a semi-structured document (e.g. a text file, a log file, a word processor document, a semi-structured spreadsheet, a webpage, a fixed-layout document, an image file, etc.). Further, an extraction program for extracting the data from the input document can be synthesized based on the examples. The extraction program can be synthesized in a domain specific language (DSL) for a type of the input document. Moreover, the extraction program can be executed on the input document to extract an instance of an output data schema.
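The flow described in the abstract (receive examples, synthesize an extraction program in a DSL, execute it on the document) can be illustrated with a toy sketch. This is not the patented method: the token set `TOKENS`, the two-token program shape, and the functions `run` and `synthesize` are hypothetical names introduced here for illustration, and the "DSL" is just a pair of delimiter tokens searched by brute-force enumeration.

```python
import re
from itertools import product

# Hypothetical delimiter tokens for a toy DSL (an assumption, not from the patent).
TOKENS = {
    "START": r"^",
    "END": r"$",
    "COLON": r":\s*",
    "COMMA": r",\s*",
    "SPACE": r"\s+",
    "EQUALS": r"=",
}


def run(program, document):
    """Execute a (left, right) program: on each line, return the text
    found between the left token and the right token."""
    left, right = program
    pattern = re.compile(TOKENS[left] + r"(.+?)" + TOKENS[right])
    results = []
    for line in document.splitlines():
        match = pattern.search(line)
        if match:
            results.append(match.group(1))
    return results


def synthesize(document, examples):
    """Enumerate all token pairs and return the first program whose
    output on the document contains every highlighted example."""
    for program in product(TOKENS, repeat=2):
        output = run(program, document)
        if all(example in output for example in examples):
            return program
    return None  # no program in this tiny DSL is consistent with the examples


# A small semi-structured log and two "highlighted" example regions.
LOG = "user=alice, status=ok\nuser=bob, status=fail\nuser=carol, status=ok"
program = synthesize(LOG, ["alice", "bob"])
extracted = run(program, LOG)
print(program, extracted)
```

With two highlighted user names as examples, the synthesized program generalizes to the third line, which was never highlighted. A realistic system would need a far richer DSL per document type and a ranking of consistent programs; the exhaustive search here only works because the toy program space has 36 candidates.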
Public/Granted literature
- US20150254530A1 FRAMEWORK FOR DATA EXTRACTION BY EXAMPLES Public/Granted day: 2015-09-10