Invention Grant
- Patent Title: Data classification
- Application No.: US16876660
- Application Date: 2020-05-18
- Publication No.: US11748382B2
- Publication Date: 2023-09-05
- Inventor: Yannick Saillet, Namit Kabra, Mike W. Grasselt, Krishna Kishore Bonagiri
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Grant Johnson
- Priority: EP 188266 2019.07.25
- Main IPC: G06F16/28
- IPC: G06F16/28; G06F16/2457; G06F16/22; G06N20/00; G06F16/248; G06F18/214; G06N7/01

Abstract:
A method provides for classifying data fields of a dataset. A classifier configured for determining confidence values for a plurality of data classes for the data fields may be applied. Using the confidence values, data class candidates may be identified. Data fields may be determined for which a plurality of data class candidates is identifiable. Using previous user-selected data class assignments, a probability may be determined for the data class candidates that the respective data class candidate is a data class to which the respective data field is to be assigned. The data fields may be classified using the probabilities to select for the data fields a data class from the data class candidates. The dataset may be provided with metadata identifying for the data fields the data classes to which the respective data fields are assigned.
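The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold value, the function names, the representation of previous user selections as a flat list of chosen class names, and the add-one smoothing used to turn those selections into probabilities are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical cutoff; the abstract does not state how candidates are chosen.
CANDIDATE_THRESHOLD = 0.5


def candidates(confidences, threshold=CANDIDATE_THRESHOLD):
    """Return the data classes whose confidence value reaches the threshold."""
    return [cls for cls, conf in confidences.items() if conf >= threshold]


def candidate_probabilities(cands, past_choices):
    """Estimate, for each candidate, the probability that it is the right class.

    Assumption: the probability is the add-one-smoothed share of previous
    user selections that picked this class, normalised over the candidates.
    """
    counts = Counter(past_choices)
    total = sum(counts[c] + 1 for c in cands)
    return {c: (counts[c] + 1) / total for c in cands}


def classify(field_confidences, past_choices):
    """Map each data field to one data class (or None), as dataset metadata."""
    metadata = {}
    for field, confidences in field_confidences.items():
        cands = candidates(confidences)
        if not cands:
            metadata[field] = None          # no class is confident enough
        elif len(cands) == 1:
            metadata[field] = cands[0]      # unambiguous field
        else:
            # Ambiguous field: break the tie using previous user selections.
            probs = candidate_probabilities(cands, past_choices)
            metadata[field] = max(probs, key=probs.get)
    return metadata


# Classifier output per data field (made-up confidence values).
field_confidences = {
    "col_a": {"phone_number": 0.8, "postal_code": 0.7, "email": 0.1},
    "col_b": {"email": 0.95, "phone_number": 0.05},
}
# Previous user-selected data class assignments (made-up history).
past_choices = ["postal_code", "postal_code", "phone_number"]

metadata = classify(field_confidences, past_choices)
```

In this example `col_a` has two candidates above the threshold, so the history decides between them, while `col_b` has a single candidate and is assigned directly. A real system would likely condition the probability on more context than a global count of past selections.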
Public/Granted literature
- US20210026872A1 DATA CLASSIFICATION Public/Granted day: 2021-01-28