METHOD FOR RETRIEVING AND RANKING DOCUMENT FROM DATABASE, COMPUTER SYSTEM, AND RECORDING MEDIUM

    Publication No.: JP2002024268A

    Publication date: 2002-01-25

    Application No.: JP2000175848

    Application date: 2000-06-12

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To retrieve and/or rank documents including attribute data in a database to which documents are added. SOLUTION: This method includes a step where a document matrix D including numerical elements derived from the attribute data is generated from the documents, a step where a covariance matrix K is formed from the document matrix D, a step where the covariance matrix K is decomposed into eigenvalues and eigenvectors, a step where the dimensions of the eigenvector matrix V are reduced by keeping a specific number of the eigenvectors included in V, including the eigenvector corresponding to the largest eigenvalue, a step where the dimensions of the document matrix D are reduced by using the reduced matrix V, and a step where documents in the database are retrieved and/or ranked by calculating the inner product of the dimension-reduced document matrix D and a query vector.
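
The steps listed in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: the function name `reduce_and_rank`, the use of NumPy, the row-per-document layout of D, and the choice to project the query with the same reduced matrix V are all assumptions that the abstract does not confirm.

```python
import numpy as np

def reduce_and_rank(D, query, k):
    """Rank documents (rows of D) against a query using k eigenvectors of the covariance matrix."""
    D = np.asarray(D, dtype=float)
    query = np.asarray(query, dtype=float)

    # Covariance matrix K of the attributes (columns of D).
    centered = D - D.mean(axis=0)
    K = centered.T @ centered / D.shape[0]

    # Eigendecomposition of the symmetric matrix K.
    # np.linalg.eigh returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(K)

    # Keep the k eigenvectors with the largest eigenvalues (reduced V).
    V_k = eigvecs[:, ::-1][:, :k]

    # Reduce the dimensions of the documents and (by assumption) the query.
    D_k = D @ V_k
    q_k = query @ V_k

    # Inner product of each reduced document with the reduced query.
    scores = D_k @ q_k
    order = np.argsort(-scores, kind="stable")  # best match first
    return order, scores
```

When k equals the number of attributes, V is orthogonal and the scores reduce to the plain inner products of D with the query; smaller k trades exactness for a lower-dimensional representation.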

    INFORMATION RETRIEVAL SYSTEM AND METHOD, PROGRAM FOR PERFORMING INFORMATION RETRIEVAL, AND RECORDING MEDIUM FOR RECORDING PROGRAM FOR PERFORMING INFORMATION RETRIEVAL

    Publication No.: JP2003141160A

    Publication date: 2003-05-16

    Application No.: JP2001324437

    Application date: 2001-10-23

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide an information retrieval system, an information retrieval method, a program for performing the information retrieval, and a recording medium for recording the program. SOLUTION: This information retrieval system includes a means for producing and holding a specific matrix from a document-attribute matrix, a means for producing a document-attribute sub-matrix from documents added to the database during a predetermined period, a means for updating the specific matrix by using information relating to the document-attribute sub-matrix, performing singular value decomposition on the updated specific matrix, and reducing the dimension of all the document-attribute matrices held in the database, and a means for performing information retrieval for a query input by the user by using the dimension-reduced document-attribute matrix.

    INFORMATION VISUALIZING SYSTEM AND METHOD, PROGRAM FOR THE SAME, RECORDING MEDIUM FOR RECORDING PROGRAM, AND INFORMATION RETRIEVING SERVICE SYSTEM

    Publication No.: JP2003141165A

    Publication date: 2003-05-16

    Application No.: JP2001329613

    Application date: 2001-10-26

    Applicant: IBM

    Abstract: PROBLEM TO BE SOLVED: To provide an information visualizing system and method, a program for the same, a recording medium for recording the program, and an information retrieving service system. SOLUTION: This information visualizing system, which visualizes and displays predetermined information held as a digitized matrix in a database, includes a means for performing singular value decomposition on the matrix to produce singular vectors, a means for reducing the dimensions of the matrix on the basis of the singular vectors, a means for giving a selection vector for selecting specific information, and a means for reducing the dimension of the selection vector on the basis of the singular vectors and selecting a predetermined number of elements of the dimension-reduced selection vector in descending order, starting from the largest.
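
One possible reading of these means, sketched in Python with NumPy. The function name `pick_display_axes`, the row-per-item layout of the matrix, the ranking of elements by absolute value (the sign of a singular vector is arbitrary), and the use of the chosen elements as display axes are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def pick_display_axes(A, selection, k, n_axes):
    """Reduce A and a selection vector with k singular vectors, then pick n_axes axes."""
    A = np.asarray(A, dtype=float)
    selection = np.asarray(selection, dtype=float)

    # Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)

    # Keep the k leading right singular vectors as columns.
    V_k = Vt[:k].T

    # Reduce the dimensions of the matrix and of the selection vector.
    A_k = A @ V_k
    sel_k = selection @ V_k

    # Assumption: "largest" elements are ranked by absolute value.
    axes = np.argsort(-np.abs(sel_k), kind="stable")[:n_axes]

    # Coordinates of every item along the chosen axes, e.g. for a 2-D or 3-D plot.
    coords = A_k[:, axes]
    return axes, coords
```

With `n_axes` set to 2 or 3, `coords` can be passed to any plotting routine; the selection vector decides which reduced dimensions are shown.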

Method and system for retrieving, detecting and identifying main cluster and outlier cluster in large scale database, recording medium and server

    Patent type: Invention patent (granted)

    Publication No.: JP2003030222A

    Publication date: 2003-01-31

    Application No.: JP2001205183

    Application date: 2001-07-05

    Abstract: PROBLEM TO BE SOLVED: To provide a method and a system for detecting, retrieving and identifying a main cluster and an outlier cluster in a large scale database, and to provide a recording medium and a server.
    SOLUTION: This method includes a step for generating a document matrix from a preceding document by using at least one attribute, a step for generating, from the document matrix, a residual matrix scaled by a prescribed function, a step for performing singular value decomposition to obtain the basis vector corresponding to the largest singular value, a step for reconstructing the residual matrix, dynamically scaling the reconstructed residual matrix and obtaining another basis vector, a step for repeating the singular value decomposition step through the reconstruction step to generate a prescribed set of basis vectors, and a step for performing dimensionality reduction of the document matrix and detecting, retrieving and identifying documents in a database.
    COPYRIGHT: (C)2003,JPO
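
The loop described in this abstract can be sketched roughly as below. The abstract does not state the prescribed scaling function, so the row-norm power `norm ** q` is an assumed stand-in, and the names `rescaled_basis` and `q` are hypothetical. The sketch also assumes the residual stays non-zero for all k iterations.

```python
import numpy as np

def rescaled_basis(A, k, q=1.0):
    """Build k basis vectors by repeated SVD of a rescaled residual, then reduce A."""
    A = np.asarray(A, dtype=float)
    R = A.copy()  # residual matrix, initially the document matrix itself
    basis = []
    for _ in range(k):
        # Scale each residual row; assumed scaling function: row norm to the power q.
        norms = np.linalg.norm(R, axis=1)
        R_scaled = R * (norms ** q)[:, None]

        # Basis vector for the largest singular value of the scaled residual.
        _, _, Vt = np.linalg.svd(R_scaled, full_matrices=False)
        b = Vt[0]
        basis.append(b)

        # Reconstruct the residual by removing the component along b.
        R = R - np.outer(R @ b, b)

    B = np.array(basis).T  # attributes x k
    return B, A @ B        # basis vectors and the dimension-reduced documents
```

Rescaling before each decomposition gives documents that are still poorly represented more weight, which is one way small outlier clusters can earn their own basis vector instead of being absorbed by the main clusters.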