Invention Grant
US08407204B2 Minimizing visibility of stale content in web searching including revising web crawl intervals of documents
Active
- Patent Title: Minimizing visibility of stale content in web searching including revising web crawl intervals of documents
- Application No.: US13166757
- Application Date: 2011-06-22
- Publication No.: US08407204B2
- Publication Date: 2013-03-26
- Inventor: Anton P. T. Carver
- Applicant: Anton P. T. Carver
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Morgan, Lewis & Bockius LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
A method and system are disclosed for associating an appropriate web crawl interval with a document so that the probability of the document's stale content being used by a search engine is below an acceptable level when the search engine crawls the document at its associated web crawl interval. The web crawl interval of a document is determined through an iterative process and updated dynamically by the search engine after every visit to the document by a web crawler. A multi-tier data structure is employed for managing the web crawl order of billions of documents on the Internet. The search engine may move a document from one tier to another if its web crawl interval changes significantly.
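
The iterative interval update and tier reassignment described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract gives no formulas or thresholds, so the multiplicative shrink/grow rule, the tier names and boundaries, the clamping limits, and every identifier (`Document`, `tier_for`, `update_after_crawl`) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical tier boundaries (upper limit of the crawl interval, in hours).
# The abstract does not specify tier names or thresholds.
TIER_LIMITS = [("daily", 24.0), ("weekly", 168.0), ("monthly", float("inf"))]


@dataclass
class Document:
    url: str
    crawl_interval: float  # hours between crawler visits
    tier: str = "weekly"


def tier_for(interval: float) -> str:
    """Return the first tier whose upper limit covers the interval."""
    for name, upper in TIER_LIMITS:
        if interval <= upper:
            return name
    return TIER_LIMITS[-1][0]


def update_after_crawl(doc: Document, content_changed: bool,
                       shrink: float = 0.5, grow: float = 1.5,
                       min_interval: float = 1.0,
                       max_interval: float = 720.0) -> Document:
    """Revise the interval after one crawler visit (assumed rule).

    If the content changed since the last visit, crawl more often;
    otherwise back off. The result is clamped to [min_interval, max_interval],
    and the document is moved to whichever tier matches the new interval.
    """
    factor = shrink if content_changed else grow
    new_interval = doc.crawl_interval * factor
    doc.crawl_interval = min(max(new_interval, min_interval), max_interval)
    doc.tier = tier_for(doc.crawl_interval)
    return doc
```

In this sketch a document moves between tiers as soon as its interval crosses a boundary. The abstract only says a move may happen when the interval changes "significantly", so a real system would presumably apply some additional test before reassigning a tier.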