Invention Grant
US08868541B2 Scheduling resource crawls (Active)

Scheduling resource crawls
Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for scheduling resource crawls. In one aspect, a framework is provided for scheduling resource crawls such that a crawl scheduler determines the health of a document, i.e., whether it can be crawled, the popularity of the document, and the frequency of “interesting,” i.e., substantive, content changes, and based on this information, estimates an appropriate crawl interval for each web resource to improve crawl resource utilization.
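The abstract names three inputs to the crawl scheduler (document health, popularity, and the frequency of substantive changes) but gives no formula for combining them. The following is a minimal illustrative sketch only, not the patented method: the names DocumentStats and estimate_crawl_interval_hours, the interval bounds, and the weighting rule are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical bounds; the abstract does not specify any values.
MIN_INTERVAL_HOURS = 1.0
MAX_INTERVAL_HOURS = 24.0 * 30


@dataclass
class DocumentStats:
    """Hypothetical per-document signals named in the abstract."""
    healthy: bool                       # whether the document can be crawled
    popularity: float                   # assumed normalized to [0, 1]
    interesting_changes_per_day: float  # rate of substantive content changes


def estimate_crawl_interval_hours(stats: DocumentStats) -> float:
    """Return an assumed crawl interval in hours for one document."""
    # Unhealthy documents are retried rarely so they do not waste crawl capacity.
    if not stats.healthy:
        return MAX_INTERVAL_HOURS

    # Base interval: roughly one crawl per expected substantive change.
    if stats.interesting_changes_per_day <= 0:
        base = MAX_INTERVAL_HOURS
    else:
        base = 24.0 / stats.interesting_changes_per_day

    # Assumed rule: a fully popular document is crawled twice as often.
    popularity = min(max(stats.popularity, 0.0), 1.0)
    scaled = base / (1.0 + popularity)

    # Clamp to the assumed bounds.
    return min(max(scaled, MIN_INTERVAL_HOURS), MAX_INTERVAL_HOURS)
```

Under these assumptions, a healthy document with popularity 1.0 that changes substantively twice a day is assigned a 6-hour interval, while an uncrawlable document falls back to the maximum interval.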