Ranking explanatory variables in multivariate analysis
Abstract:
A computer-implemented method, a computer program product, and a computer system for ranking explanatory variables in multivariate analysis. The computer system extracts words from documents related to categories, creates a histogram of the words in each category, and selects the top words in each histogram, where the top words are used as representative words of that category. The computer system generates respective feature vectors of explanatory variable candidates and a feature vector of an objective variable, where the feature vector of a given variable includes one element per category and the value of each element indicates whether the name of that variable is included in the top words of the corresponding category. The computer system calculates the cosine similarity between each of the respective feature vectors of the explanatory variable candidates and the feature vector of the objective variable. The computer system ranks the explanatory variable candidates based on the cosine similarity.
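The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the regular-expression tokenizer, the cutoff `k` for the top words, and the binary 0/1 encoding of the vector elements are all assumptions made for illustration, and the sample documents are invented.

```python
import math
import re
from collections import Counter


def top_words(documents, k):
    """Build a word histogram for one category and return its k most frequent words."""
    counts = Counter()
    for text in documents:
        # Assumed tokenizer: lowercase runs of letters, digits, and underscores.
        counts.update(re.findall(r"[a-z0-9_]+", text.lower()))
    return {word for word, _ in counts.most_common(k)}


def feature_vector(name, tops_by_category, categories):
    """One element per category: 1.0 if the variable name is a top word there, else 0.0."""
    return [1.0 if name.lower() in tops_by_category[c] else 0.0 for c in categories]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; defined here as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(docs_by_category, candidates, objective, k=3):
    """Return (candidate, similarity) pairs sorted from most to least similar to the objective."""
    categories = sorted(docs_by_category)
    tops = {c: top_words(docs_by_category[c], k) for c in categories}
    target = feature_vector(objective, tops, categories)
    scored = [
        (name, cosine_similarity(feature_vector(name, tops, categories), target))
        for name in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented sample data: three categories, each with a few short documents.
docs_by_category = {
    "weather": [
        "rainfall lowers yield",
        "yield follows rainfall temperature",
        "rainfall yield temperature",
    ],
    "soil": [
        "nitrogen raises yield",
        "yield needs nitrogen rainfall",
        "rainfall leaches nitrogen yield",
    ],
    "finance": [
        "interest rate rises",
        "interest rate rises again",
    ],
}

ranking = rank_candidates(
    docs_by_category,
    candidates=["rainfall", "nitrogen", "interest"],
    objective="yield",
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With this sample data, "rainfall" shares both of its categories with the objective variable "yield" and ranks first, "nitrogen" shares one and ranks second, and "interest" shares none and ranks last.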