Invention Grant
- Patent Title: Identifying source datasets that fit a transfer learning process for a target domain
- Application No.: US16934492
- Application Date: 2020-07-21
- Publication No.: US11308077B2
- Publication Date: 2022-04-19
- Inventor: Bar Haim, Andrey Finkelshtein, Eitan Menahem, Noga Agmon
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Barry D. Blount
- Main IPC: G06F16/00
- IPC: G06F16/00; G06F16/23; G06F16/22; G06K9/62; G06N20/00

Abstract:
A method for quantifying a similarity between a target dataset and multiple source datasets and identifying one or more source datasets that are most similar to the target dataset is provided. The method includes receiving, at a computing system, source datasets relating to a source domain and a target dataset relating to a target domain of interest. Each dataset is arranged in a tabular format including columns and rows, and the source datasets and the target dataset include a same feature space. The method also includes pre-processing, via a processor of the computing system, each source-target dataset pair to remove non-intersecting columns. The method further includes calculating at least two of a dataset similarity score, a row similarity score, and a column similarity score for each source-target dataset pair, and summarizing the calculated similarity scores to identify one or more source datasets that are most similar to the target dataset.
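The flow described in the abstract (drop non-intersecting columns, compute at least two similarity scores per source-target pair, then summarize them to rank the sources) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not state how the scores are calculated, so the Jaccard-based column score, the exact-match row score, the averaging used as the summary, and all function names are hypothetical placeholders rather than the patented method. Tables are represented as plain dictionaries mapping column names to lists of values.

```python
from statistics import mean


def intersect_columns(source, target):
    """Pre-processing: keep only the columns present in both tables."""
    shared = [c for c in source if c in target]
    return {c: source[c] for c in shared}, {c: target[c] for c in shared}


def column_similarity(source, target):
    """Placeholder column score (assumption): mean Jaccard overlap of the
    value sets of each shared column."""
    scores = []
    for col in source:
        a, b = set(source[col]), set(target[col])
        union = a | b
        scores.append(len(a & b) / len(union) if union else 1.0)
    return mean(scores) if scores else 0.0


def row_similarity(source, target):
    """Placeholder row score (assumption): fraction of target rows that
    also occur, value for value, in the source."""
    cols = list(source)
    if not cols:
        return 0.0
    src_rows = set(zip(*(source[c] for c in cols)))
    tgt_rows = list(zip(*(target[c] for c in cols)))
    if not tgt_rows:
        return 0.0
    return sum(row in src_rows for row in tgt_rows) / len(tgt_rows)


def rank_sources(sources, target):
    """Summarize the two scores per source-target pair (here: a plain
    average, which is an assumption) and rank sources, most similar first."""
    summary = {}
    for name, src in sources.items():
        s, t = intersect_columns(src, target)
        summary[name] = mean([column_similarity(s, t), row_similarity(s, t)])
    return sorted(summary.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    target = {"age": [30, 40], "city": ["NY", "LA"]}
    sources = {
        "A": {"age": [30, 40, 50], "city": ["NY", "LA", "SF"], "extra": [1, 2, 3]},
        "B": {"age": [70, 80], "city": ["TX", "FL"]},
    }
    for name, score in rank_sources(sources, target):
        print(name, round(score, 3))
```

In this toy input, source "A" shares most of its values with the target and ranks first, while "B" shares none. A real implementation would likely replace the placeholder scores with distribution-aware measures suited to the data types involved.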
Public/Granted literature
- US20220027339A1 IDENTIFYING SOURCE DATASETS THAT FIT A TRANSFER LEARNING PROCESS FOR A TARGET DOMAIN Public/Granted day: 2022-01-27