Invention Grant
- Patent Title: Automatic incremental labeling of document clusters
- Application No.: US13530764
- Application Date: 2012-06-22
- Publication No.: US09002848B1
- Publication Date: 2015-04-07
- Inventor: Jun Peng, Aner Ben-Artzi, Kirill Buryak, Glenn M. Lewis
- Applicant: Jun Peng, Aner Ben-Artzi, Kirill Buryak, Glenn M. Lewis
- Applicant Address: US CA Mountain View
- Assignee: Google Inc.
- Current Assignee: Google Inc.
- Current Assignee Address: US CA Mountain View
- Agency: Armstrong Teasdale LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
Methods and systems for use in labeling documents within a cluster are provided. One example method includes assembling a set of documents including a first plurality of previously clustered documents and a second plurality of documents. Each of the first plurality of previously clustered documents has at least one label identifying a topic to which content of the document relates. The method includes partitioning documents from the set of documents into multiple clusters, determining if a dominant topic exists within one of the multiple clusters, determining a metric value for one of the multiple clusters based on the number of documents within the one of the multiple clusters having a label identifying the determined dominant topic, and labeling at least documents from the second plurality of documents within the one of the multiple clusters with the label identifying the dominant topic when the metric value exceeds a predetermined threshold.
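
The labeling step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the partitioning into clusters has already been done by some clustering algorithm, that documents from the second plurality are the ones carrying no label yet, and that the metric is the fraction of documents in the cluster that carry the dominant topic's label. The names `Document`, `dominant_topic`, `label_cluster`, and the default threshold of 0.5 are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Document:
    doc_id: str
    # Previously clustered documents carry one or more topic labels;
    # new documents are assumed here to start with an empty set.
    labels: Set[str] = field(default_factory=set)


def dominant_topic(cluster: List[Document]) -> Optional[Tuple[str, int]]:
    """Return (topic, number of documents carrying it) for the most
    frequent label in the cluster, or None if no document is labeled."""
    counts = Counter(label for doc in cluster for label in doc.labels)
    if not counts:
        return None
    return counts.most_common(1)[0]


def label_cluster(cluster: List[Document], threshold: float = 0.5) -> Optional[str]:
    """Propagate the dominant topic to unlabeled documents in the cluster
    when the metric exceeds the threshold. Returns the topic applied, or None."""
    found = dominant_topic(cluster)
    if found is None:
        return None
    topic, count = found
    # Assumed metric: share of the cluster's documents labeled with the topic.
    metric = count / len(cluster)
    if metric <= threshold:
        return None
    for doc in cluster:
        if not doc.labels:  # assumed to be a document from the second plurality
            doc.labels.add(topic)
    return topic


# Example: two of five documents are labeled "billing", so the metric is 0.4.
cluster = [
    Document("a", {"billing"}),
    Document("b", {"billing"}),
    Document("c", {"login"}),
    Document("d"),
    Document("e"),
]
applied = label_cluster(cluster, threshold=0.3)
```

With a threshold of 0.3 the two unlabeled documents receive the "billing" label. With the default threshold of 0.5 the same cluster would be left unchanged, because the metric of 0.4 does not exceed it.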