Invention Grant
- Patent Title: Large scale unsupervised hierarchical document categorization using ontological guidance
- Patent Title (Chinese, translated): Large-scale unsupervised hierarchical document classification using ontological guidance
- Application No.: US13022766
- Application Date: 2011-02-08
- Publication No.: US08484245B2
- Publication Date: 2013-07-09
- Inventor: Viet Ha-Thuc, Jean-Michel Renders
- Applicant: Viet Ha-Thuc, Jean-Michel Renders
- Applicant Address: US CT Norwalk
- Assignee: Xerox Corporation
- Current Assignee: Xerox Corporation
- Current Assignee Address: US CT Norwalk
- Agency: Fay Sharpe LLP
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
A classification method includes constructing queries from category descriptors representing categories of a taxonomy of hierarchically organized categories. The query constructed for a category c includes a query component based on descriptors of the category c and at least one query component based on descriptors of an ancestor or descendant category of the category c. A documents database is queried using the constructed queries to retrieve pseudo-relevant documents. Language models for the categories of the taxonomy are extracted from the pseudo-relevant documents by inferring a hierarchical topic model representing the taxonomy. An input document is classified by optimizing mixture weights of a weighted combination of categories of the hierarchical topic model respective to the input document.
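
The final classification step in the abstract, optimizing mixture weights over category language models for one input document, can be illustrated with a minimal sketch. It assumes unigram language models and uses expectation-maximization (EM), one standard way to fit mixture weights; the abstract does not state which optimizer the patent uses. The function name `estimate_mixture_weights`, the two toy categories, and their word probabilities are hypothetical, and the earlier steps (query construction, retrieval of pseudo-relevant documents, hierarchical topic model inference) are not reproduced here.

```python
from collections import Counter


def estimate_mixture_weights(doc_tokens, category_models, iterations=50, floor=1e-9):
    """Fit mixture weights over fixed unigram language models with EM.

    category_models maps a category name to a dict of word probabilities.
    Words missing from a model get the small probability `floor`
    (an assumption of this sketch, standing in for proper smoothing).
    """
    counts = Counter(doc_tokens)
    names = list(category_models)
    # Start from uniform weights over all categories.
    weights = {c: 1.0 / len(names) for c in names}

    for _ in range(iterations):
        expected = {c: 0.0 for c in names}
        for word, n in counts.items():
            # E-step: share of this word attributed to each category.
            joint = {
                c: weights[c] * category_models[c].get(word, floor)
                for c in names
            }
            total = sum(joint.values())
            for c in names:
                expected[c] += n * joint[c] / total
        # M-step: renormalize expected counts into new weights.
        norm = sum(expected.values())
        weights = {c: expected[c] / norm for c in names}
    return weights


# Toy language models (hypothetical values, for illustration only).
models = {
    "science": {"cell": 0.5, "energy": 0.4, "match": 0.1},
    "sports": {"match": 0.6, "goal": 0.3, "energy": 0.1},
}
weights = estimate_mixture_weights(["cell", "energy", "cell", "energy"], models)
best = max(weights, key=weights.get)
```

In this sketch the document is assigned to the category with the largest fitted weight (`best`). In a hierarchical setting the candidate set would also include ancestor categories, so that general vocabulary can be absorbed by higher-level nodes.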
Public/Granted literature
- US20120203752A1: LARGE SCALE UNSUPERVISED HIERARCHICAL DOCUMENT CATEGORIZATION USING ONTOLOGICAL GUIDANCE (Public/Granted day: 2012-08-09)