Invention Grant
- Patent Title: Generating training documents
- Patent Title (中): 生成培训文件
- Application No.: US13709773
- Application Date: 2012-12-10
- Publication No.: US09165258B2
- Publication Date: 2015-10-20
- Inventor: Vinay Deolalikar, Hernan Laffitte
- Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
- Applicant Address: US TX Houston
- Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee Address: US TX Houston
- Agency: Trop, Pruner & Hu, P.C.
- Main IPC: G06N99/00
- IPC: G06N99/00; G06K9/46; G06K9/00

Abstract:
A method of generating training documents for training a classifying device comprises, with a processor, sampling from a distribution of words in a number of original documents, and creating a number of pseudo-documents from the distribution of words, the pseudo-documents comprising a similar distribution of words as the original documents. A device for classifying textual documents comprises a processor; and a memory communicatively coupled to the processor, the memory comprising a sampling module to, when executed by the processor, determine the distribution of words in a number of original documents, a pseudo-document creation module to, when executed by the processor, create a number of pseudo-documents from the distribution of words, the pseudo-documents comprising a similar distribution of words as the original documents, and a training module to, when executed by the processor, train the device to classify textual documents based on the pseudo-documents.
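The abstract does not say how the word distribution is estimated or sampled. As a minimal sketch only, the Python below assumes a unigram model: words are split on whitespace, their relative frequencies across the original documents form the distribution, and each pseudo-document is built from independent draws from that distribution. The function names `word_distribution` and `make_pseudo_documents`, the fixed document length, and the sample texts are hypothetical and are not taken from the patent.

```python
import random
from collections import Counter


def word_distribution(documents):
    """Estimate relative word frequencies across all original documents.

    Assumption: lowercase, whitespace-delimited tokens (a unigram model).
    """
    counts = Counter(word for doc in documents for word in doc.lower().split())
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def make_pseudo_documents(distribution, num_docs, doc_length, seed=None):
    """Create pseudo-documents by sampling words from the distribution.

    Each word is drawn independently, weighted by its relative frequency,
    so the pseudo-documents follow a word distribution similar to the originals.
    """
    rng = random.Random(seed)
    words = list(distribution)
    weights = [distribution[w] for w in words]
    return [
        " ".join(rng.choices(words, weights=weights, k=doc_length))
        for _ in range(num_docs)
    ]


# Illustrative input only; these texts are made up for the example.
originals = [
    "the printer driver failed to install",
    "install the latest printer firmware",
    "the driver update fixed the printer",
]

dist = word_distribution(originals)
pseudo = make_pseudo_documents(dist, num_docs=3, doc_length=8, seed=0)

for doc in pseudo:
    print(doc)
```

The resulting pseudo-documents could then be passed to any text classifier as additional training data, which corresponds to the training module described in the abstract. A real system would likely estimate one distribution per class and might vary the document length.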
Public/Granted literature
- US20140164297A1 GENERATING TRAINING DOCUMENTS (Public/Granted day: 2014-06-12)