Invention Grant
- Patent Title: Semantic data type classification in rectangular datasets
- Application No.: US17184122
- Application Date: 2021-02-24
- Publication No.: US11556514B2
- Publication Date: 2023-01-17
- Inventor: Roger C. Raphael, Mu Qiao, Scott Schumacher, Angineh Aghakiant
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Peter K. Suchecki
- Main IPC: G06F16/22
- IPC: G06F16/22; G06F16/25; G06F16/21; G06N20/00; G06F40/30; G06K9/62

Abstract:
Provided is a method, computer program product, and system for automatically predicting unknown semantic data types in a rectangular dataset using a holistic knowledge of said dataset. A processor may receive one or more rectangular datasets, the one or more rectangular datasets comprising a plurality of columns having a set of known semantic data types. The processor may extract a set of features from the plurality of columns, where the set of features is used to determine a relationship among each column of the plurality of columns. The processor may construct a set of training data based on the extracted set of features. Using the training data, the processor may train a machine learning model to predict a semantic data type of a target column in a rectangular dataset having an unknown semantic data type.
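
The pipeline described in the abstract (extract per-column features, relate each column to the others in its table, build training data, train a model, predict the type of an unlabeled column) can be illustrated with a minimal sketch. The abstract does not state which features or which model are used, so everything below is an assumption made for illustration: the four character-level statistics, the "context" vector (the mean of the other columns' features, standing in for the relationship among columns), the nearest-centroid classifier, and all function names and labels such as `person_name` are hypothetical. Columns are assumed to be non-empty.

```python
from collections import defaultdict
from statistics import mean


def column_features(values):
    """Per-column statistics (hypothetical feature set, not from the patent)."""
    strs = [str(v) for v in values]
    chars = "".join(strs)
    n_chars = max(len(chars), 1)
    return [
        sum(c.isdigit() for c in chars) / n_chars,  # share of digit characters
        sum(c.isalpha() for c in chars) / n_chars,  # share of letters
        mean(len(s) for s in strs) / 20.0,          # roughly scaled mean length
        len(set(strs)) / len(strs),                 # uniqueness ratio
    ]


def table_features(table):
    """Map column name -> own features + mean features of the other columns."""
    own = {name: column_features(vals) for name, vals in table.items()}
    out = {}
    for name, feats in own.items():
        others = [f for other, f in own.items() if other != name]
        if others:
            context = [mean(col) for col in zip(*others)]
        else:
            context = [0.0] * len(feats)
        out[name] = feats + context
    return out


def train(tables_with_labels):
    """Nearest-centroid model: one averaged feature vector per known type."""
    grouped = defaultdict(list)
    for table, labels in tables_with_labels:
        feats = table_features(table)
        for name, label in labels.items():
            grouped[label].append(feats[name])
    return {
        label: [mean(col) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def predict(model, table, target):
    """Return the semantic type whose centroid is closest to the target column."""
    vec = table_features(table)[target]

    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(vec, model[label]))

    return min(model, key=dist)


# Toy data: one labeled table for training, one unlabeled table to classify.
train_table = {
    "name": ["Alice Smith", "Bob Jones", "Carol White"],
    "phone": ["555-0101", "555-0102", "555-0103"],
    "zip": ["10504", "10001", "94105"],
}
labels = {"name": "person_name", "phone": "phone_number", "zip": "postal_code"}
model = train([(train_table, labels)])

new_table = {
    "c1": ["Dave Brown", "Eve Black", "Frank Green"],
    "c2": ["555-0199", "555-0188", "555-0177"],
    "c3": ["60601", "73301", "02139"],
}
predictions = {col: predict(model, new_table, col) for col in new_table}
```

Because each feature vector includes a summary of the sibling columns, the same raw column could be classified differently depending on the table it appears in, which is one simple way to read the abstract's "holistic knowledge of said dataset". A real system would likely use richer features and a stronger learner.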
Public/Granted literature
- US20220269663A1 SEMANTIC DATA TYPE CLASSIFICATION IN RECTANGULAR DATASETS Public/Granted day: 2022-08-25