Invention Grant
- Patent Title: Method for automatically generating a wrapper for extracting web data, and a computer system
- Application No.: US16630485
- Application Date: 2018-07-12
- Publication No.: US11281729B2
- Publication Date: 2022-03-22
- Inventor: Georg Gottlob, Emanuel Sallinger, Ruslan Fayzrakhmanov, Tim Furche, Giovanni Grasso
- Applicant: OXFORD UNIVERSITY INNOVATION LIMITED
- Applicant Address: GB Oxford
- Assignee: OXFORD UNIVERSITY INNOVATION LIMITED
- Current Assignee: OXFORD UNIVERSITY INNOVATION LIMITED
- Current Assignee Address: GB Oxford
- Agency: Fresh IP PLC
- Agent: Clifford D. Hyra; Aubrey Y. Chen
- Priority: GB1711315 20170713
- International Application: PCT/GB2018/051987 WO 20180712
- International Announcement: WO2019/012287 WO 20190117
- Main IPC: G06F16/951
- IPC: G06F16/951; G06F16/958; G06F9/54

Abstract:
Methods for automatically generating a wrapper for extracting web data and corresponding computer systems are disclosed. In one arrangement, a first wrapper is used to generate a second wrapper. The first wrapper extracts target data from one or more target web pages hosted by one or more target web servers. The second wrapper is capable of extracting the same target data from the same one or more target web pages without using a web browser engine to perform a) sending requests to the one or more target web servers, and/or b) processing replies from the one or more target web servers. The generation of the second wrapper comprises analysing one or both of the following: (i) code defining the first wrapper, (ii) interactions between the first wrapper and the one or more target web servers that occur during execution of the first wrapper.
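As a rough illustration of option (ii) in the abstract, the sketch below derives a browser-free "second wrapper" from a recorded trace of the requests and replies seen while a browser-based first wrapper ran. It is a toy under stated assumptions and not the patented method: the `Interaction` record, the `generate_second_wrapper` function, the example URLs, and the rule "pick the first reply that already contains all target values" are all hypothetical. Network access is replaced by an injected `fetch` callable so the example runs offline.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Interaction:
    """One request/reply pair observed during the first wrapper's run (hypothetical format)."""
    method: str
    url: str
    reply_body: str


def generate_second_wrapper(
    trace: List[Interaction],
    target_values: List[str],
    extract: Callable[[str], List[str]],
) -> Callable[[Callable[[str, str], str]], List[str]]:
    """Build a wrapper that replays a single recorded request with a plain HTTP client.

    Assumption: some recorded reply already carries every target value, so the
    page never has to be rendered in a browser engine to obtain the data.
    """
    for interaction in trace:
        if all(value in interaction.reply_body for value in target_values):
            chosen = interaction
            break
    else:
        raise ValueError("no recorded reply contains all of the target data")

    def second_wrapper(fetch: Callable[[str, str], str]) -> List[str]:
        # `fetch` stands in for any plain HTTP client; no browser engine is involved.
        return extract(fetch(chosen.method, chosen.url))

    return second_wrapper


# Hypothetical trace: an HTML shell page, a script, and the JSON call that holds the data.
trace = [
    Interaction("GET", "https://shop.example/list",
                '<html><body><div id="app"></div></body></html>'),
    Interaction("GET", "https://shop.example/app.js", "render();"),
    Interaction("GET", "https://shop.example/api/items",
                '{"items": [{"name": "Widget"}, {"name": "Gadget"}]}'),
]


def extract_names(body: str) -> List[str]:
    """Pull the item names out of the JSON reply."""
    return [item["name"] for item in json.loads(body)["items"]]


def fake_fetch(method: str, url: str) -> str:
    """Offline stand-in for an HTTP client: answers from the recorded trace."""
    replies = {(i.method, i.url): i.reply_body for i in trace}
    return replies[(method, url)]


wrapper = generate_second_wrapper(trace, ["Widget", "Gadget"], extract_names)
result = wrapper(fake_fetch)
```

In this sketch the generated wrapper issues one direct request to the data-bearing endpoint instead of loading and rendering the page. A real system would also need to handle cookies, tokens, and request parameters, which the sketch leaves out.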
Public/Granted literature
- US20200167393A1 METHOD FOR AUTOMATICALLY GENERATING A WRAPPER FOR EXTRACTING WEB DATA, AND A COMPUTER SYSTEM (Public/Granted day: 2020-05-28)