Invention Grant
US08214359B1 Detecting query-specific duplicate documents (Active)

Detecting query-specific duplicate documents
Abstract:
An improved duplicate detection technique that uses query-relevant information to limit the portion(s) of documents to be compared for similarity is described. Before comparing two documents for similarity, the content of these documents may be condensed based on the query. In one embodiment, query-relevant information or text (also referred to as “snippets”) is extracted from the documents and only the extracted snippets, rather than the entire documents, are compared for purposes of determining similarity.
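
The snippet-based comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify how snippets are extracted or how similarity is measured, so the fixed token window around query terms, the Jaccard word-set similarity, the 0.8 threshold, and all function names below are assumptions chosen for illustration only.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def extract_snippet(document, query, window=3):
    """Return the tokens lying within `window` positions of any query term.

    Assumption: a fixed-size token window stands in for whatever
    query-relevant extraction a real system would use.
    """
    tokens = tokenize(document)
    query_terms = set(tokenize(query))
    keep = set()
    for i, token in enumerate(tokens):
        if token in query_terms:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            keep.update(range(start, end))
    return [tokens[i] for i in sorted(keep)]


def jaccard(a, b):
    """Jaccard similarity of two token lists, treated as sets.

    Assumption: two empty snippets score 0.0, so documents with no
    query-relevant text are never reported as duplicates.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def is_query_specific_duplicate(doc_a, doc_b, query, window=3, threshold=0.8):
    """Compare only the query-relevant snippets, not the whole documents."""
    snippet_a = extract_snippet(doc_a, query, window)
    snippet_b = extract_snippet(doc_b, query, window)
    return jaccard(snippet_a, snippet_b) >= threshold
```

Under these assumptions, two pages that share the passage matching the query but differ in surrounding material (headers, footers) would be flagged as duplicates for that query, while the same two pages may be treated as distinct for a query that matches only their differing parts.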