- Patent Title: Training machine learning models to exclude ambiguous data samples
- Application No.: US16983161
- Application Date: 2020-08-03
- Publication No.: US11556742B2
- Publication Date: 2023-01-17
- Inventor: Dana Levanony, Tal Tlusty Shapiro
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Stephanie L. Carusillo
- Main IPC: G06K9/62
- IPC: G06K9/62 ; G06N20/00 ; G16H30/20

Abstract:
Techniques are described herein for training machine learning models to classify medical imaging data sets more accurately by trimming ambiguous samples from the training data. In some embodiments, a machine learning model is trained using a data set in which a subset of the samples carries a conflict between a first label based on an expert opinion and a second label based on a ground truth established by a medical examination. During some epochs of training, the loss value of each data sample in the epoch is compared against a loss threshold; samples whose loss values exceed the threshold and that also belong to the conflicting subset are trimmed from the data set for subsequent epochs of training. The loss threshold for the next epoch is then updated based on the loss values of the trimmed data set.
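The per-epoch trimming step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `trim_ambiguous`, the parameter `k`, and the threshold update rule (mean plus `k` standard deviations of the retained losses) are assumptions made for illustration, since the abstract says only that the threshold is updated "based on loss values of the trimmed data set".

```python
from statistics import mean, pstdev


def trim_ambiguous(losses, conflicted, threshold, k=1.0):
    """Trim high-loss samples that also carry conflicting labels.

    losses:     dict mapping sample id -> loss value for the current epoch
    conflicted: set of sample ids whose expert label conflicts with the
                ground-truth label from a medical examination
    threshold:  loss threshold for the current epoch
    k:          hypothetical scaling factor for the assumed update rule
    Returns (kept_ids, next_threshold).
    """
    # A sample is trimmed only if BOTH conditions hold:
    # its loss exceeds the threshold AND it is in the conflicted subset.
    kept = [
        sid for sid, loss in losses.items()
        if not (loss > threshold and sid in conflicted)
    ]

    kept_losses = [losses[sid] for sid in kept]
    if not kept_losses:
        # Nothing left to compute statistics from; keep the old threshold.
        return kept, threshold

    # ASSUMED update rule (not specified in the abstract):
    # mean + k * population standard deviation of the retained losses.
    next_threshold = mean(kept_losses) + k * pstdev(kept_losses)
    return kept, next_threshold


# Example epoch: "c" is conflicted with high loss (trimmed), "d" has high
# loss but no label conflict (kept), "b" is conflicted with low loss (kept).
epoch_losses = {"a": 0.2, "b": 0.4, "c": 2.5, "d": 3.0}
kept_ids, new_threshold = trim_ambiguous(epoch_losses, {"b", "c"}, threshold=1.0)
print(kept_ids)  # ['a', 'b', 'd']
```

In a real training loop, the returned ids would define the data set for the following epoch, and the returned threshold would replace the current one. Both the loss function and the choice of which epochs to trim on are left open by the abstract.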
Public/Granted literature
- US20220036128A1 TRAINING MACHINE LEARNING MODELS TO EXCLUDE AMBIGUOUS DATA SAMPLES Public/Granted day: 2022-02-03