Library screening for cancer probability
Abstract:
A method, system, and computer program product are provided for generating a predictive model. A processor(s) obtains a raw data set (peptide libraries) from patients designated either as diagnosed or pre-diagnosed with a condition or as not diagnosed with the condition. The processor(s) segments the raw data set into a pre-defined number of groups and separates out a holdout group. The processor(s) performs a principal component analysis on the remaining groups to identify, based on the frequency of features in those groups, the features they have in common (principal components), and weights the common features by frequency of occurrence. The processor(s) determines the smallest number of principal components that yields a pre-defined level of validation accuracy. The processor(s) generates a predictive model by using that smallest number for a best fit in a logistic regression model. The predictive model provides binary outcomes.
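The control flow of the abstract (segment into groups, set one aside as a holdout, then search for the smallest component count that reaches a target accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function names `split_into_groups`, `separate_holdout`, and `smallest_component_count` are hypothetical, and the `score` callable stands in for the step that would fit a principal component analysis with `k` components on the remaining groups, fit a logistic regression on the projected data, and return a validation accuracy. The accuracy values in the demo are made up for illustration only.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def split_into_groups(n_samples: int, n_groups: int, seed: int = 0) -> List[List[int]]:
    """Shuffle sample indices and deal them into n_groups roughly equal groups."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by n_groups gives groups whose sizes differ by at most one.
    return [indices[g::n_groups] for g in range(n_groups)]


def separate_holdout(
    groups: Sequence[List[int]], holdout_index: int = -1
) -> Tuple[List[List[int]], List[int]]:
    """Return (remaining_groups, holdout_group) without mutating the input."""
    remaining = list(groups)
    holdout = remaining.pop(holdout_index)
    return remaining, holdout


def smallest_component_count(
    score: Callable[[int], float], max_components: int, target_accuracy: float
) -> Optional[int]:
    """Return the smallest k in 1..max_components with score(k) >= target, else None.

    `score` is a hypothetical stand-in: in a real pipeline it would fit PCA with
    k components and a logistic regression, then report validation accuracy.
    """
    for k in range(1, max_components + 1):
        if score(k) >= target_accuracy:
            return k
    return None


# Toy demonstration with made-up accuracies (assumption, not measured data).
groups = split_into_groups(n_samples=10, n_groups=5)
remaining_groups, holdout_group = separate_holdout(groups)
toy_accuracy = {1: 0.62, 2: 0.71, 3: 0.84, 4: 0.90}
best_k = smallest_component_count(toy_accuracy.__getitem__, 4, target_accuracy=0.80)
print(len(remaining_groups), len(holdout_group), best_k)
```

Passing the scorer in as a callable keeps the search logic independent of any particular PCA or regression library; the holdout group is never seen by the scorer and is reserved for a final check of the chosen model.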