Invention Grant
- Patent Title: Preparing high-quality data repositories sets utilizing heuristic data analysis
- Application No.: US15420474
- Application Date: 2017-01-31
- Publication No.: US11194772B2
- Publication Date: 2021-12-07
- Inventor: Neil E. Bartlett, Craig A. Statchuk
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agent: Stephen J. Walder, Jr.; Alexander G. Jochym
- Main IPC: G06F16/21
- IPC: G06F16/21; G06F16/25; G06F16/23

Abstract:
A mechanism is provided for preparing a high-quality data repository. Data and related metadata from a set of data sources are ingested, thereby forming a set of unprepared data. The set of unprepared data is transformed based on a set of functions into a set of transformed data. A set of semantic text descriptions that detail the transformation of the set of unprepared data to the set of transformed data is generated using a first set of semantic associations, a second set of semantic associations, and a set of semantic transformation associations. The set of transformed data is tested against one or more governance policies that track data lineage to ultimately show that prepared data is in compliance. Responsive to the set of transformed data adhering to the one or more governance policies, a high-quality data repository is automatically built using the transformed data.
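
The flow described in the abstract (ingest, transform, describe, test against policy, build) can be illustrated with a minimal Python sketch. This is not the patented implementation: every name here (`Transform`, `PreparedSet`, `prepare`, `complies`, `build_repository`) is hypothetical, and the hand-written description strings are a simplifying stand-in for the semantic text descriptions that the abstract says are generated from semantic associations.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Record = dict[str, object]


@dataclass
class Transform:
    """Hypothetical wrapper: one function plus a text description of it."""
    name: str
    description: str  # assumption: stands in for a generated semantic description
    apply: Callable[[Record], Record]


@dataclass
class PreparedSet:
    """Transformed records plus the lineage of steps that produced them."""
    records: list[Record]
    lineage: list[str] = field(default_factory=list)


Policy = Callable[[PreparedSet], bool]


def prepare(unprepared: list[Record], transforms: list[Transform]) -> PreparedSet:
    """Apply each transform in order and record a description of each step."""
    records = [dict(r) for r in unprepared]  # copy so the input is not mutated
    lineage: list[str] = []
    for t in transforms:
        records = [t.apply(r) for r in records]
        lineage.append(f"{t.name}: {t.description}")
    return PreparedSet(records, lineage)


def complies(prepared: PreparedSet, policies: list[Policy]) -> bool:
    """True only if every governance policy accepts the prepared set."""
    return all(policy(prepared) for policy in policies)


def build_repository(
    unprepared: list[Record],
    transforms: list[Transform],
    policies: list[Policy],
) -> Optional[PreparedSet]:
    """Return the prepared set if it passes all policies, otherwise None."""
    prepared = prepare(unprepared, transforms)
    return prepared if complies(prepared, policies) else None


# Illustrative transform and policies (all assumptions, not from the patent).
trim_name = Transform(
    name="trim_name",
    description="strip whitespace from 'name'",
    apply=lambda r: {**r, "name": str(r["name"]).strip()},
)
no_blank_names: Policy = lambda p: all(r["name"] for r in p.records)
has_lineage: Policy = lambda p: len(p.lineage) > 0

repo = build_repository([{"name": "  Ada "}], [trim_name], [no_blank_names, has_lineage])
rejected = build_repository([{"name": "   "}], [trim_name], [no_blank_names, has_lineage])
```

In this sketch the repository is built only when every policy passes, mirroring the "responsive to ... adhering" condition in the abstract; a record set whose name is blank after trimming is rejected rather than stored.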
Public/Granted literature
- US20170140016A1 Preparing High-Quality Data Repositories Sets Utilizing Heuristic Data Analysis (Public/Granted day: 2017-05-18)