Invention Grant
- Patent Title: System, method and computer readable medium for web crawling
- Application No.: US13287535
- Application Date: 2011-11-02
- Publication No.: US09940391B2
- Publication Date: 2018-04-10
- Inventor: Robert R Hauser
- Applicant: Robert R Hauser
- Applicant Address: US CA Redwood Shores
- Assignee: ORACLE AMERICA, INC.
- Current Assignee: ORACLE AMERICA, INC.
- Current Assignee Address: US CA Redwood Shores
- Agency: Kilpatrick Townsend & Stockton LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
In a web crawler, a URL selection module selects URLs for pages to be downloaded. The URL selection module accesses an interaction data store that stores interaction data for web pages, including interaction data that indicates human interactions with the pages. To reduce the effects of link farms, the URL selection module filters the URLs to select only those URLs that have human interaction histories and provides the selected URLs to a download module for web page downloading.
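The filtering step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `InteractionStore`, `has_human_history`, `select_urls`, and `crawl`, the in-memory dictionary of interaction counts, and the example URLs are all assumptions made for illustration. The abstract states only that URLs without human interaction histories are filtered out before download.

```python
from dataclasses import dataclass, field


@dataclass
class InteractionStore:
    """Hypothetical in-memory stand-in for the interaction data store.

    Maps a URL to the number of recorded human interaction events
    (an assumed representation; the abstract does not specify one).
    """
    human_events: dict = field(default_factory=dict)

    def has_human_history(self, url):
        # A URL qualifies if at least one human interaction was recorded.
        return self.human_events.get(url, 0) > 0


def select_urls(candidates, store):
    """Keep only candidate URLs that have a human interaction history."""
    return [url for url in candidates if store.has_human_history(url)]


def crawl(candidates, store, download):
    """Pass the filtered URLs to a caller-supplied download function."""
    return {url: download(url) for url in select_urls(candidates, store)}


if __name__ == "__main__":
    store = InteractionStore({
        "http://news.example/story": 12,   # visited by people
        "http://farm.example/links1": 0,   # known, but no human activity
    })
    candidates = [
        "http://news.example/story",
        "http://farm.example/links1",
        "http://farm.example/links2",      # never seen by the store
    ]
    pages = crawl(candidates, store, download=lambda url: "<html>stub</html>")
    print(sorted(pages))
```

In this sketch, link-farm pages that no person has interacted with never reach the download function, which is the effect the abstract attributes to the URL selection module.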
Public/Granted literature
- US20120047122A1 SYSTEM, METHOD AND COMPUTER READABLE MEDIUM FOR WEB CRAWLING, Public/Granted day: 2012-02-23