Machine learning systems and methods to diagnose rare diseases
Abstract:
A machine-learned model to diagnose patients with a rare disease based on medical data or records, and methods of training such a model, are disclosed. A computer-implemented method is disclosed for generating a training dataset for training a machine-learning model to identify individuals with a rare disease. The method comprises: receiving an initial dataset comprising medical data relating to a plurality of individuals with the rare disease; identifying a plurality of clusters of individuals in the initial dataset; identifying one or more of the clusters as being least representative of the rare disease; removing one or more of the individuals from the one or more clusters identified as being least representative, based on the medical data of said one or more individuals, to generate a pruned dataset; and combining the pruned dataset with a control dataset comprising a plurality of individuals without the rare disease to generate the training dataset.
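The steps recited in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the abstract does not say which clustering algorithm is used, how a cluster is judged "least representative", or which individuals are removed. The sketch assumes a naive k-means, treats the cluster whose centre lies nearest the control centroid as least representative, and drops an individual from such a cluster when that individual lies closer to the control centroid than to the centroid of all rare-disease cases. The names `kmeans`, `build_training_set`, `k` and `n_worst` are hypothetical, and each individual is assumed to be already encoded as a numeric feature vector.

```python
import math
import random


def centroid(rows):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def kmeans(rows, k, iters=20, seed=0):
    """Naive k-means (assumed here; the abstract names no clustering algorithm)."""
    rng = random.Random(seed)
    centers = rng.sample(rows, k)
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign every individual to its nearest centre.
        labels = [min(range(k), key=lambda j: math.dist(r, centers[j]))
                  for r in rows]
        # Move each centre to the mean of its members (empty clusters stay put).
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels, centers


def build_training_set(cases, controls, k=2, n_worst=1):
    """Cluster the cases, prune the least representative clusters, add controls."""
    labels, centers = kmeans(cases, k)
    control_center = centroid(controls)
    case_center = centroid(cases)

    # Assumption: a cluster is "least representative" when its centre is
    # closest to the controls, i.e. it looks least like a distinct disease group.
    ranked = sorted(range(k),
                    key=lambda j: math.dist(centers[j], control_center))
    worst = set(ranked[:n_worst])

    # Assumption: within those clusters, remove only individuals whose data
    # sit nearer the control centroid than the overall case centroid.
    pruned = []
    for row, lab in zip(cases, labels):
        looks_like_control = (math.dist(row, control_center)
                              < math.dist(row, case_center))
        if lab in worst and looks_like_control:
            continue
        pruned.append(row)

    # Label 1 = rare disease (pruned cases), label 0 = control.
    return [(row, 1) for row in pruned] + [(row, 0) for row in controls]
```

Calling `build_training_set(cases, controls)` with two lists of equal-length numeric vectors returns a list of `(features, label)` pairs that could be passed to any classifier. Swapping in a different clustering method or a different representativeness criterion only requires changing `kmeans` or the ranking key.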