Invention Grant
- Patent Title: Joining web data with spreadsheet data using examples
- Application No.: US15633875
- Application Date: 2017-06-27
- Publication No.: US10713429B2
- Publication Date: 2020-07-14
- Inventor: Rishabh Singh, Jeevana Priya Inala
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee Address: US WA Redmond
- Agency: Faegre Drinker Biddle & Reath LLP
- Main IPC: G06F40/18
- IPC: G06F40/18; H04L29/12; H04L29/08; G06F3/0482; G06F16/335; G06F16/9535; G06F16/25

Abstract:
Provided are methods and systems for joining semi-structured data from the web with relational data in a spreadsheet table using input-output examples. A first sub-task performed by the system learns a string transformation program to transform input rows of a table to URL strings that correspond to the webpages where the relevant data is present. A second sub-task learns a program in a rich web data extraction language to extract desired data from the webpage given the example extractions. Hierarchical search and input-driven ranking are used to efficiently learn the programs using few input-output examples. The learnt programs are then run on the remaining spreadsheet entries to join desired data from the corresponding web pages.
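The first sub-task described in the abstract, learning a program that maps table rows to URL strings from examples, can be illustrated with a deliberately simplified Python sketch. It is not the patented method: it assumes each URL is a plain concatenation of constant text and unmodified column values, learns a template from the first example by greedy matching, and then checks it against the remaining examples. The function names, the column names (`ticker`, `year`), and the `example.com` URLs are all hypothetical, and the hierarchical search, ranking, and web data extraction steps are not modeled.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]
# A token is ("const", literal_text) or ("col", column_name).
Token = Tuple[str, str]


def apply_template(tokens: List[Token], row: Row) -> str:
    """Build a URL by concatenating constants and the row's column values."""
    return "".join(text if kind == "const" else row[text] for kind, text in tokens)


def learn_url_template(examples: List[Tuple[Row, str]]) -> List[Token]:
    """Learn a concatenation template from (row, url) examples.

    Simplifying assumption: column values appear verbatim in the URL.
    """
    row, url = examples[0]
    tokens: List[Token] = []
    i = 0
    while i < len(url):
        # Prefer the longest non-empty column value that matches at position i.
        best = None
        for name, value in row.items():
            if value and url.startswith(value, i):
                if best is None or len(value) > len(row[best]):
                    best = name
        if best is not None:
            tokens.append(("col", best))
            i += len(row[best])
        elif tokens and tokens[-1][0] == "const":
            # Extend the current run of constant characters.
            tokens[-1] = ("const", tokens[-1][1] + url[i])
            i += 1
        else:
            tokens.append(("const", url[i]))
            i += 1
    # Reject the template if it fails to reproduce any given example.
    for example_row, example_url in examples:
        if apply_template(tokens, example_row) != example_url:
            raise ValueError("template is inconsistent with the examples")
    return tokens


# Two hypothetical input-output examples supplied by the user.
examples = [
    ({"ticker": "MSFT", "year": "2016"}, "https://example.com/quote/MSFT?y=2016"),
    ({"ticker": "AAPL", "year": "2017"}, "https://example.com/quote/AAPL?y=2017"),
]
template = learn_url_template(examples)

# Run the learnt template on a remaining spreadsheet row.
new_url = apply_template(template, {"ticker": "GOOG", "year": "2015"})
```

With these examples the learnt template alternates constants and column references, so applying it to a remaining row yields that row's URL. A real system would also need to handle transformed values (case changes, substrings, and so on) and ambiguous matches, which is where the search and ranking mentioned in the abstract would come in.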
Public/Granted literature:
- US20180232351A1 JOINING WEB DATA WITH SPREADSHEET DATA USING EXAMPLES, Public/Granted day: 2018-08-16