Similarity determination apparatus, similarity determination method, and computer-readable recording medium
Abstract:
A determination apparatus has a feature extraction unit and a similarity determination unit. The feature extraction unit counts the number of appearances of each keyword included in a piece of document information. When the number of types of keyword arrangements included in a certain range of the piece of document information is equal to or greater than a certain number, the unit deletes any arrangement that includes a keyword whose number of appearances is less than a threshold. It extracts a plurality of keyword arrangements from the piece of document information as features. The similarity determination unit determines the similarity between different pieces of document information by comparing the features extracted from each of them.
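The two units described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. Several details are assumptions made only for illustration, because the abstract does not specify them: a "keyword arrangement" is taken to be a run of n consecutive keywords, the "certain range" is taken to be the whole keyword list, and similarity is measured with the Jaccard index. The names extract_features, similarity, n, min_count, and type_limit are hypothetical.

```python
from collections import Counter


def extract_features(keywords, n=2, min_count=2, type_limit=3):
    """Return the set of keyword arrangements kept as features.

    Assumption: an arrangement is a tuple of n consecutive keywords, and the
    "certain range" is the whole keyword list.
    """
    # Count the number of appearances of each keyword.
    counts = Counter(keywords)

    # Collect the distinct arrangements (types) found in the range.
    arrangements = {
        tuple(keywords[i:i + n]) for i in range(len(keywords) - n + 1)
    }

    # Prune only when the number of arrangement types reaches the limit:
    # drop any arrangement containing a keyword seen fewer than min_count times.
    if len(arrangements) >= type_limit:
        arrangements = {
            a for a in arrangements if all(counts[k] >= min_count for k in a)
        }
    return arrangements


def similarity(features_a, features_b):
    """Jaccard index of two feature sets (an assumed similarity measure)."""
    union = features_a | features_b
    if not union:
        return 0.0
    return len(features_a & features_b) / len(union)


doc_a = ["cat", "dog", "cat", "dog", "bird"]
doc_b = ["cat", "dog", "fish"]
features_a = extract_features(doc_a)
features_b = extract_features(doc_b)
score = similarity(features_a, features_b)
```

In this example doc_a yields three arrangement types, which reaches type_limit, so the arrangement containing the once-seen keyword "bird" is deleted. doc_b yields only two types, so nothing is pruned. The two documents then share one arrangement out of three distinct ones.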