Invention Grant
US08972376B1 Optimized web domains classification based on progressive crawling with clustering (In force)

Optimized web domains classification based on progressive crawling with clustering
Abstract:
Techniques for optimized web domains classification based on progressive crawling with clustering are disclosed. In some embodiments, optimized web domains classification based on progressive crawling with clustering includes crawling a domain (e.g., a web site domain) to collect data for a subset of pages (e.g., web pages) of a corpus of content associated with the domain; classifying each of the crawled pages into one or more category clusters, in which the category clusters represent a content categorization of the corpus of content associated with the domain (e.g., a URL content categorization for the domain, host of that domain, and/or directory of that domain); and determining which of the one or more category clusters to publish for the domain.
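
The three steps named in the abstract (crawl a subset of pages, classify each page into category clusters, decide which clusters to publish) can be illustrated with a minimal Python sketch. It is not the patented method. The keyword classifier, the 30% share threshold, the stop-when-stable rule, and all function names and sample URLs are assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import defaultdict

# Hypothetical keyword lists standing in for a real page classifier.
CATEGORY_KEYWORDS = {
    "sports": {"match", "league", "score"},
    "news": {"breaking", "report", "election"},
    "shopping": {"cart", "price", "checkout"},
}


def classify_page(text):
    """Return every category whose keywords appear in the page text."""
    words = set(text.lower().split())
    return {cat for cat, kws in CATEGORY_KEYWORDS.items() if words & kws}


def cluster_pages(pages):
    """Group crawled page URLs into category clusters."""
    clusters = defaultdict(list)
    for url, text in pages.items():
        for cat in classify_page(text):
            clusters[cat].append(url)
    return dict(clusters)


def categories_to_publish(clusters, n_pages, min_share=0.3):
    """Keep clusters that cover at least min_share of the crawled pages."""
    if n_pages == 0:
        return []
    return sorted(
        cat for cat, urls in clusters.items() if len(urls) / n_pages >= min_share
    )


def progressive_classify(fetch_batch, max_batches=5, min_share=0.3):
    """Crawl in batches; stop once the published set stops changing.

    The stopping rule is an assumption, chosen only to show what
    'progressive' crawling of a subset of a domain could look like.
    """
    crawled = {}
    previous = None
    for i in range(max_batches):
        batch = fetch_batch(i)
        if not batch:
            break
        crawled.update(batch)
        current = categories_to_publish(
            cluster_pages(crawled), len(crawled), min_share
        )
        if current == previous:
            break
        previous = current
    return previous or []


# Made-up pages for a placeholder domain; no network access is performed.
SAMPLE_BATCHES = [
    {
        "example.com/a": "league match score tonight",
        "example.com/b": "breaking report on the league",
    },
    {
        "example.com/c": "final score of the match",
        "example.com/d": "add to cart at a low price",
    },
    {"example.com/e": "league table and score"},
]


def fetch_sample_batch(i):
    """Return the i-th batch of sample pages, or an empty dict when exhausted."""
    return SAMPLE_BATCHES[i] if i < len(SAMPLE_BATCHES) else {}


published = progressive_classify(fetch_sample_batch)
```

With this sample data the first batch would publish both "news" and "sports", but as more pages are crawled only the "sports" cluster stays above the threshold, so `published` ends up as `["sports"]`. A page may belong to several clusters, which matches the abstract's "one or more category clusters".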