Abstract:
PROBLEM TO BE SOLVED: To recommend documents from a collection based on multi-modal user clusters. SOLUTION: An initial set of users is identified, the documents in the collection accessed by those users are identified, the contents of the accessed documents are estimated for each user, and the users are clustered into a plurality of user clusters by representing each user by the contents of the documents that user accessed. A new user is then identified, information on the documents accessed by the new user is collected, the contents of those documents are estimated, and the new user is assigned to a user cluster based on the similarity between the contents of the documents accessed by the new user and the contents of the documents accessed by the other users in the user clusters. COPYRIGHT: (C)2011,JPO&INPIT
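The assignment step described above can be illustrated with a minimal sketch. It assumes a bag-of-words profile per user and cosine similarity against a summed cluster profile; the abstract does not specify how contents are estimated or compared, so these choices and all names (`user_profile`, `assign_to_cluster`, the sample clusters) are hypothetical.

```python
from collections import Counter
from math import sqrt

def user_profile(accessed_docs):
    """Sum word counts over the texts of the documents a user accessed."""
    profile = Counter()
    for text in accessed_docs:
        profile.update(text.lower().split())
    return profile

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def assign_to_cluster(new_profile, clusters):
    """Return the id of the cluster whose summed member profile is most similar."""
    def summed(members):
        total = Counter()
        for member in members:
            total.update(member)
        return total
    return max(clusters, key=lambda cid: cosine(new_profile, summed(clusters[cid])))

# Hypothetical clusters built from an initial set of users.
clusters = {
    "sports": [user_profile(["football match report", "tennis match score"])],
    "finance": [user_profile(["stock market report", "bond market yield"])],
}
new_user = user_profile(["tennis score update"])
assigned = assign_to_cluster(new_user, clusters)
```

In this toy data the new user shares terms only with the "sports" cluster, so that cluster is chosen.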
Abstract:
PROBLEM TO BE SOLVED: To visualize a cluster of users represented by a document selected from a collection of documents. SOLUTION: A plurality of users selected from a user group is identified, the users sharing an interest determined through multi-mode collection use analysis. For each of these users and each document in the document collection, an access probability representing how frequently the user accesses that document is determined. For each document in the document collection, a set access probability is computed, corresponding to the probability that a user among the selected users accesses the document. A disk tree is displayed whose nodes each represent a document in the document collection, and each node whose set access probability is larger than a desired threshold value is highlighted. COPYRIGHT: (C)2010,JPO&INPIT
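A rough sketch of the probability and threshold steps follows. It assumes that a user's access probability is that user's normalized access count and that the set access probability is the mean over the selected users; the abstract states neither formula, and the disk tree rendering itself is not shown. All names and data are hypothetical.

```python
def access_probabilities(access_counts):
    """Normalize one user's per-document access counts into probabilities."""
    total = sum(access_counts.values())
    return {doc: n / total for doc, n in access_counts.items()} if total else {}

def set_access_probability(per_user_probs, documents):
    """Average each document's access probability over the selected users."""
    n_users = len(per_user_probs)
    return {d: sum(p.get(d, 0.0) for p in per_user_probs) / n_users for d in documents}

def nodes_to_highlight(set_probs, threshold):
    """Documents (disk tree nodes) whose set access probability exceeds the threshold."""
    return {d for d, p in set_probs.items() if p > threshold}

documents = ["index.html", "products.html", "contact.html"]
users = [
    access_probabilities({"index.html": 3, "products.html": 1}),
    access_probabilities({"index.html": 1, "products.html": 1, "contact.html": 2}),
]
set_probs = set_access_probability(users, documents)
highlighted = nodes_to_highlight(set_probs, threshold=0.3)
```

A visualization layer would then draw the disk tree and mark the nodes in `highlighted`.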
Abstract:
PROBLEM TO BE SOLVED: To recommend a preferable document to a user by clustering a second object using a result obtained by separating and searching a feature corresponding to a plurality of first objects in a collection. SOLUTION: A training set of users is identified from the collection of data (2510), and all types of information available about those users are collected. The users are then clustered with the multi-mode information according to a selected multi-mode clustering (2512). If no new user exists (2514), the processing ends. Otherwise, when a new user is identified (2518), browsing information is collected from the new user (2520) and the new user is assigned to the nearest existing cluster (2522). The most popular page in that cluster is then identified (2524) and recommended to the new user (2526).
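Steps 2520 to 2526 can be sketched as below. This is only an illustration: it assumes each cluster is summarized by page visit counts and that "nearest" means highest Jaccard overlap between the new user's visited pages and the cluster's pages, whereas the abstract clusters on multi-mode information and names no distance. Function names and data are hypothetical.

```python
from collections import Counter

def nearest_cluster(visited, clusters):
    """Pick the cluster whose page set overlaps most with the new user's pages."""
    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0
    return max(clusters, key=lambda cid: jaccard(visited, set(clusters[cid])))

def recommend(visited, clusters):
    """Assign the new user to the nearest cluster and return its most popular page."""
    cid = nearest_cluster(visited, clusters)
    page, _count = clusters[cid].most_common(1)[0]
    return cid, page

# Hypothetical clusters: page -> number of visits by cluster members.
clusters = {
    "A": Counter({"/news": 10, "/sports": 7, "/weather": 2}),
    "B": Counter({"/docs": 12, "/api": 9, "/download": 4}),
}
cluster_id, page = recommend({"/api", "/download"}, clusters)
```

Here the new user's pages overlap only with cluster "B", whose most visited page is "/docs".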
Abstract:
PROBLEM TO BE SOLVED: To select a set of initial cluster centers in wavefront clustering of objects in a collection. SOLUTION: This method selects the set of initial cluster centers for wavefront clustering of the objects in the collection, wherein each object is represented by a set of vectors with multi-modal features. A first number of first objects is selected from the objects in the collection, the vector centroid of the first objects is calculated using the set of multi-modal feature vectors associated with each object, and a second number of second objects is selected from the objects in the collection. The second number of initial cluster centers is identified between the centroid and the second objects. Wavefront clustering is then performed on the objects in the collection using the second number of initial cluster centers. COPYRIGHT: (C)2011,JPO&INPIT
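The center selection can be sketched as follows. The abstract only says the centers lie between the centroid and the second objects, so the interpolation factor `alpha` (0.5 gives the midpoint), the random sampling, and the dictionary layout of the multi-modal vectors are assumptions made for illustration. The wavefront clustering step itself is not shown.

```python
import random

def centroid(objects):
    """Per-modality mean of objects given as {modality: [floats]}."""
    n = len(objects)
    first = objects[0]
    return {m: [sum(o[m][i] for o in objects) / n for i in range(len(first[m]))]
            for m in first}

def initial_centers(objects, n_sample, k, alpha=0.5, seed=0):
    """Place k centers on the segments from the sample centroid to k sampled objects."""
    rng = random.Random(seed)
    c = centroid(rng.sample(objects, n_sample))   # first number of first objects
    targets = rng.sample(objects, k)              # second number of second objects
    return [{m: [cv + alpha * (tv - cv) for cv, tv in zip(c[m], t[m])] for m in c}
            for t in targets]

# Hypothetical objects with a text vector and an image vector each.
objects = [
    {"text": [0, 0], "image": [0]},
    {"text": [4, 0], "image": [2]},
    {"text": [0, 4], "image": [4]},
    {"text": [4, 4], "image": [2]},
]
centers = initial_centers(objects, n_sample=4, k=2)
```

Starting the centers partway between the centroid and sampled objects keeps them inside the data while still spreading them in different directions.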
Abstract:
PROBLEM TO BE SOLVED: To calculate a degree of similarity between two objects in a collection of objects. SOLUTION: Each of two objects in the collection is associated with a first feature vector and a second feature vector having two or more dimensions. The first feature vector expresses a first feature of the object, and the second feature vector expresses a second feature of the object. The first feature is a text feature, and the second feature is an image feature. The first feature vector of the first object and the first feature vector of the second object are identified, and a first distance metric between these first feature vectors is calculated. The second feature vector of the first object and the second feature vector of the second object are identified, and a second distance metric between these second feature vectors is calculated. The sum of the first distance metric and the second distance metric is calculated. COPYRIGHT: (C)2011,JPO&INPIT
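The summed distance can be illustrated with a short sketch. The abstract does not name the distance metric, so cosine distance is used here for both modalities purely as an assumption; the vectors and names are hypothetical.

```python
from math import sqrt

def cosine_distance(u, v):
    """1 minus cosine similarity; defined as 1.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm if norm else 1.0

def multimodal_distance(obj_a, obj_b, metric=cosine_distance):
    """Sum of the per-modality distances between two objects."""
    d_text = metric(obj_a["text"], obj_b["text"])     # first distance metric
    d_image = metric(obj_a["image"], obj_b["image"])  # second distance metric
    return d_text + d_image

page_a = {"text": [1.0, 0.0], "image": [0.0, 2.0]}
page_b = {"text": [0.0, 3.0], "image": [0.0, 5.0]}
total = multimodal_distance(page_a, page_b)
```

In this example the text vectors are orthogonal (distance 1) and the image vectors are parallel (distance 0), so the total is 1.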
Abstract:
PROBLEM TO BE SOLVED: To represent digital documents numerically in a vector space. SOLUTION: A first digital document to be processed is identified from a plurality of digital documents, and a first characteristic corresponding to the first digital document is extracted from the plurality of digital documents, the first characteristic comprising text that surrounds images included in the digital document and is not anchor text. The first characteristic is converted into a first vector, and the first vector is associated with the first digital document. COPYRIGHT: (C)2010,JPO&INPIT
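One way to picture this extraction is sketched below. It assumes HTML documents, treats "text that surrounds images" as a fixed window of words on either side of each `<img>` tag, skips any text inside `<a>` elements, and converts the result to term counts over a given vocabulary. The window size, the vocabulary, and all names are hypothetical choices, not taken from the abstract.

```python
from collections import Counter
from html.parser import HTMLParser

class ImageContextParser(HTMLParser):
    """Collects words outside <a> elements and records where <img> tags occur."""
    def __init__(self):
        super().__init__()
        self.words = []
        self.image_positions = []
        self.anchor_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchor_depth += 1
        elif tag == "img":
            self.image_positions.append(len(self.words))

    def handle_endtag(self, tag):
        if tag == "a" and self.anchor_depth:
            self.anchor_depth -= 1

    def handle_data(self, data):
        if self.anchor_depth == 0:  # ignore anchor text
            self.words.extend(data.lower().split())

def image_context_vector(html, vocabulary, window=3):
    """Count vocabulary terms within `window` words on either side of each image."""
    parser = ImageContextParser()
    parser.feed(html)
    parser.close()
    counts = Counter()
    for pos in parser.image_positions:
        counts.update(parser.words[max(0, pos - window):pos + window])
    return [counts[term] for term in vocabulary]

html = '<p>Red sports car <a href="/buy">buy now</a> <img src="car.png"> parked outside</p>'
vector = image_context_vector(html, vocabulary=["car", "buy", "parked"])
```

The anchor text "buy now" is excluded, so the "buy" component of the vector stays at zero while "car" and "parked" are counted.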