Invention Grant
- Patent Title: Method and vector analysis for a document
- Patent Title (Chinese): Vector analysis method for a document
- Application No.: US12424801
- Application Date: 2009-04-16
- Publication No.: US08171026B2
- Publication Date: 2012-05-01
- Inventor: Takahiko Kawatani
- Applicant: Takahiko Kawatani
- Applicant Address: US TX Houston
- Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee: Hewlett-Packard Development Company, L.P.
- Current Assignee Address: US TX Houston
- Priority: JP2000-353475 (2000-11-20)
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
The invention provides a document representation method and a document analysis method, including extraction of important sentences from a given document and/or determination of similarity between two documents. The inventive method detects terms that occur in the input document, segments the input document into document segments, each segment being an appropriately sized chunk, and generates document segment vectors, each vector having as its elements values determined by the occurrence frequencies of the terms occurring in the corresponding document segment. The method further calculates eigenvalues and eigenvectors of a square sum matrix in which a rank of the respective document segment vector is represented by R, and selects from the eigenvectors a plurality (L) of eigenvectors to be used for determining importance. The method then calculates, for each document segment vector, a weighted sum of its squared projections onto the selected eigenvectors, and selects the document segments of significant importance based on that weighted sum.
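The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration: each sentence is treated as one segment, lowercase alphabetic tokens are treated as terms, the L eigenvectors with the largest eigenvalues are the ones selected, and the weights default to uniform because the abstract does not state them. The function names `segment_vectors`, `segment_importance`, and `top_segments` are hypothetical.

```python
import re
from collections import Counter

import numpy as np


def segment_vectors(text):
    """Split text into segments and build term-frequency vectors (one row per segment)."""
    # Assumption: one sentence per segment; terms are lowercase alphabetic tokens.
    segments = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    counts = [Counter(re.findall(r"[a-z]+", s.lower())) for s in segments]
    vocab = sorted(set().union(*counts))
    matrix = np.array([[c[t] for t in vocab] for c in counts], dtype=float)
    return segments, vocab, matrix


def segment_importance(matrix, num_vectors, weights=None):
    """Weighted sum of squared projections of each segment vector onto L eigenvectors."""
    # Square sum matrix: sum over segments of d_n d_n^T (terms x terms).
    square_sum = matrix.T @ matrix
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(square_sum)
    # Assumption: keep the L eigenvectors with the largest eigenvalues.
    order = np.argsort(eigenvalues)[::-1][:num_vectors]
    selected = eigenvectors[:, order]
    if weights is None:
        # Assumption: uniform weights, since the abstract does not specify them.
        weights = np.ones(selected.shape[1])
    projections = matrix @ selected  # shape: segments x L
    return (projections ** 2) @ np.asarray(weights, dtype=float)


def top_segments(text, num_vectors=1, count=1):
    """Return the `count` segments with the highest importance score."""
    segments, _, matrix = segment_vectors(text)
    scores = segment_importance(matrix, num_vectors)
    ranked = np.argsort(scores)[::-1][:count]
    return [segments[i] for i in ranked]
```

With uniform weights and all eigenvectors selected, each score reduces to the squared length of the segment vector, because the eigenvectors form an orthonormal basis. Restricting the sum to a few eigenvectors is what lets the score favor segments aligned with the dominant term patterns of the document.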
Public/Granted literature
- US20090216759A1 METHOD AND VECTOR ANALYSIS FOR A DOCUMENT, Public/Granted day: 2009-08-27