Invention Grant
US07987172B1 Minimizing visibility of stale content in web searching including revising web crawl intervals of documents
In force
- Patent Title: Minimizing visibility of stale content in web searching including revising web crawl intervals of documents
- Application No.: US10930280
- Application Date: 2004-08-30
- Publication No.: US07987172B1
- Publication Date: 2011-07-26
- Inventor: Anton P. T. Carver
- Applicant: Anton P. T. Carver
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Morgan, Lewis & Bockius LLP
- Main IPC: G06F7/00
- IPC: G06F7/00 ; G06F17/30

Abstract:
A method and system are disclosed for associating an appropriate web crawl interval with a document so that the probability of the document's stale content being used by a search engine is below an acceptable level when the search engine crawls the document at its associated web crawl interval. The web crawl interval of a document is determined through an iterative process and updated dynamically by the search engine after every visit to the document by a web crawler. A multi-tier data structure is employed for managing the web crawl order of billions of documents on the Internet. The search engine may move a document from one tier to another if its web crawl interval is changed significantly.
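
The abstract does not state the update rule or the tier boundaries, so the following Python sketch is only an illustration of the general idea: revise a document's crawl interval after each visit, then reassign its tier when the interval crosses a boundary. The multiplicative shrink/grow factors, the hour-based tier limits, and every name here (`DocRecord`, `tier_for`, `update_after_crawl`) are assumptions made for this example and are not taken from the patent.

```python
from dataclasses import dataclass

# Hypothetical upper limits (in hours) for each tier; the abstract gives no values.
# Tier 0 holds documents crawled at least daily, tier 1 at least weekly, tier 2 the rest.
TIER_LIMITS_HOURS = [24.0, 24.0 * 7, float("inf")]


@dataclass
class DocRecord:
    url: str
    interval_hours: float  # current web crawl interval
    tier: int = 0          # index into TIER_LIMITS_HOURS


def tier_for(interval_hours: float) -> int:
    """Return the index of the first tier whose upper limit covers the interval."""
    for index, limit in enumerate(TIER_LIMITS_HOURS):
        if interval_hours <= limit:
            return index
    return len(TIER_LIMITS_HOURS) - 1


def update_after_crawl(
    doc: DocRecord,
    content_changed: bool,
    shrink: float = 0.5,        # assumed factor when the content had changed
    grow: float = 1.5,          # assumed factor when the content was unchanged
    min_hours: float = 1.0,
    max_hours: float = 24.0 * 90,
) -> DocRecord:
    """Revise the crawl interval after one visit and reassign the tier.

    Assumed rule: crawl sooner if the document changed since the last visit
    (stale content was at risk of being served), crawl later otherwise.
    """
    factor = shrink if content_changed else grow
    proposed = doc.interval_hours * factor
    # Clamp so the interval stays within the assumed bounds.
    doc.interval_hours = max(min_hours, min(max_hours, proposed))
    # Moving between tiers happens only when the interval crosses a tier limit.
    doc.tier = tier_for(doc.interval_hours)
    return doc


if __name__ == "__main__":
    doc = DocRecord(url="http://example.com/", interval_hours=20.0)
    update_after_crawl(doc, content_changed=False)
    print(doc.interval_hours, doc.tier)  # 30.0 1
    update_after_crawl(doc, content_changed=True)
    print(doc.interval_hours, doc.tier)  # 15.0 0
```

Under these assumed parameters, a document that starts at a 20-hour interval moves from tier 0 to tier 1 after one unchanged visit and back to tier 0 once a change is observed. A real system would presumably base the revision on an estimate of the probability of serving stale content rather than on fixed factors.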