Invention Grant
- Patent Title: Optimizing insight generation in heterogeneous datasets
- Application No.: US16796996
- Application Date: 2020-02-21
- Publication No.: US11630849B2
- Publication Date: 2023-04-18
- Inventor: Sudheesh S. Kairali, Ankur Tagra
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Lieberman & Brandsdorfer, LLC
- Main IPC: G06F16/00
- IPC: G06F16/00; G06F16/28; G06F16/2457; G06F16/26

Abstract:
Embodiments relate to a system, computer program product, and method to merge two or more heterogeneous datasets. Seed attributes are identified for each dataset that is the subject of the merge. The seed attributes are derived from candidate attributes of the respective datasets. A correlation is assessed to create a set of mergeable attributes and a set of non-mergeable attributes. A cohesiveness characteristic is leveraged to iteratively identify one or more attributes from the set of non-mergeable attributes, and to amend the set of mergeable attributes with the one or more attributes so identified. A merged dataset, based on the amended set of mergeable attributes and representing non-trivial similarities between the first and second datasets, is formed as output.
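
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented method: the abstract does not define how seed attributes are derived, how correlation is measured, or what the cohesiveness characteristic is. The sketch assumes that every attribute is a candidate, that correlation is the Jaccard overlap of attribute values, and that cohesiveness is the fraction of rows, aligned on the current mergeable attributes, whose candidate values agree. All function names, thresholds, and sample data are hypothetical.

```python
from itertools import product


def jaccard(values_a, values_b):
    """Value overlap of two attributes (assumed stand-in for 'correlation')."""
    set_a, set_b = set(values_a), set(values_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def split_by_correlation(first, second, threshold):
    """Partition attribute pairs into mergeable and non-mergeable sets."""
    mergeable, non_mergeable = [], []
    for a, b in product(first[0].keys(), second[0].keys()):
        score = jaccard([r[a] for r in first], [r[b] for r in second])
        (mergeable if score >= threshold else non_mergeable).append((a, b))
    return mergeable, non_mergeable


def aligned_rows(first, second, mergeable):
    """Row pairs that agree on every mergeable attribute pair."""
    if not mergeable:
        return []
    return [(ra, rb) for ra in first for rb in second
            if all(ra[a] == rb[b] for a, b in mergeable)]


def cohesiveness(first, second, mergeable, pair):
    """Assumed metric: share of aligned rows where the candidate pair agrees."""
    a, b = pair
    rows = aligned_rows(first, second, mergeable)
    if not rows:
        return 0.0
    return sum(ra[a] == rb[b] for ra, rb in rows) / len(rows)


def amend_mergeable(first, second, mergeable, non_mergeable, min_cohesion):
    """Iteratively promote cohesive pairs from the non-mergeable set."""
    mergeable, remaining = list(mergeable), list(non_mergeable)
    changed = True
    while changed:
        changed = False
        for pair in list(remaining):
            if cohesiveness(first, second, mergeable, pair) >= min_cohesion:
                mergeable.append(pair)
                remaining.remove(pair)
                changed = True
    return mergeable, remaining


def merge(first, second, mergeable):
    """Join aligned rows, keeping one copy of each merged attribute."""
    merged_away = {b for _, b in mergeable}
    return [{**ra, **{k: v for k, v in rb.items() if k not in merged_away}}
            for ra, rb in aligned_rows(first, second, mergeable)]


# Hypothetical sample data: two datasets with differently named attributes.
first = [
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": "Pune"},
    {"id": 3, "city": "Delhi"},
    {"id": 4, "city": "Agra"},
]
second = [
    {"cust": 1, "town": "Pune", "spend": 100},
    {"cust": 2, "town": "Pune", "spend": 200},
    {"cust": 3, "town": "Delhi", "spend": 300},
    {"cust": 5, "town": "Goa", "spend": 400},
]

initial, rejected = split_by_correlation(first, second, threshold=0.55)
amended, remaining = amend_mergeable(first, second, initial, rejected,
                                     min_cohesion=0.8)
merged = merge(first, second, amended)
print(initial, amended, merged, sep="\n")
```

In this toy data, the pair (`city`, `town`) falls below the assumed correlation threshold, yet agrees on every row aligned by (`id`, `cust`), so the iterative step moves it into the mergeable set before the merged dataset is formed.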
Public/Granted literature
- US20210263951A1 Optimizing Insight Generation in Heterogeneous Datasets, Public/Granted day: 2021-08-26