Invention Grant
US08543576B1 Classification of clustered documents based on similarity scores 有权
基于相似度分数的聚类文档分类

Classification of clustered documents based on similarity scores
Abstract:
Among other disclosed subject matter, a computer-implemented method that includes receiving a set of clusters of documents and calculating a similarity score for each cluster wherein the similarity score is based at least in part on features included in the documents in the cluster and indicates a measure of similarity of the documents in the cluster. For each cluster associated with a respective similarity score greater than a first threshold, identifying the cluster as satisfying a quality assurance requirement. For each cluster associated with a respective similarity score less than a second threshold, identifying the cluster as failing the quality assurance requirement. For each cluster associated with a similarity score less than or equal to the first threshold value and greater than or equal to the second threshold value, reviewing at least a subset of documents in the cluster to determine whether the cluster satisfies the quality assurance requirement.
Information query
Patent Agency Ranking
0/0