Method for determining output data for a plurality of text documents
Abstract:
Provided is a method for determining output data for a plurality of text documents, including the steps of: providing a feature matrix as input data, wherein the feature matrix includes information about frequencies of a plurality of features within the plurality of text documents; clustering the feature matrix using a clustering algorithm into at least one clustering matrix, wherein the at least one clustering matrix includes information about the cluster membership of each document of the plurality of documents or each feature of the plurality of features; assigning at least one score to each feature of the plurality of features based on the at least one clustering matrix; ranking the plurality of features based on their assigned scores; and outputting the ranked features as output data. A corresponding computer program product and system are also provided.
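The claimed steps can be illustrated with a minimal Python sketch. The abstract does not name a clustering algorithm or a scoring rule, so everything specific here is an assumption made for illustration: the toy k-means routine, the score (the largest share of a feature's total frequency that falls within a single cluster), and all function names are hypothetical and are not taken from the patent.

```python
# Illustrative sketch only: the clustering algorithm, the scoring rule and all
# names below are assumptions, not details disclosed in the abstract.

def build_feature_matrix(docs, vocab):
    """Rows are documents, columns are features (raw term counts)."""
    return [[doc.lower().split().count(term) for term in vocab] for doc in docs]

def kmeans_membership(X, k, iters=10):
    """Toy k-means; returns a one-hot documents-by-clusters membership matrix."""
    # Deterministic initialisation (assumption): the first k rows are the centroids.
    centroids = [[float(v) for v in row] for row in X[:k]]
    labels = [0] * len(X)
    for _ in range(iters):
        # Assign each document to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
            for row in X
        ]
        # Recompute each centroid as the mean of its member documents.
        for c in range(k):
            members = [row for row, label in zip(X, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return [[1 if label == c else 0 for c in range(k)] for label in labels]

def score_features(X, M):
    """Score each feature by how concentrated its frequency is in one cluster."""
    k = len(M[0])
    scores = []
    for j in range(len(X[0])):
        per_cluster = [sum(X[i][j] * M[i][c] for i in range(len(X))) for c in range(k)]
        total = sum(per_cluster)
        scores.append(max(per_cluster) / total if total else 0.0)
    return scores

def rank_features(vocab, scores):
    """Sort features by descending score, breaking ties alphabetically."""
    return sorted(zip(vocab, scores), key=lambda pair: (-pair[1], pair[0]))

docs = [
    "cat dog cat the",
    "stock market the",
    "dog cat the",
    "market stock stock the",
]
vocab = ["cat", "dog", "market", "stock", "the"]

X = build_feature_matrix(docs, vocab)   # feature matrix (input data)
M = kmeans_membership(X, k=2)           # clustering matrix (document membership)
ranked = rank_features(vocab, score_features(X, M))  # ranked features (output data)

for feature, score in ranked:
    print(f"{feature}: {score:.2f}")
```

In this toy corpus the word "the" occurs equally in both clusters and therefore ranks last, while topic-specific words such as "cat" or "stock" are concentrated in one cluster and rank first. Other scoring rules based on the clustering matrix would fit the abstract equally well.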