Invention Grant
- Patent Title: Heterogeneous schema discovery for unstructured data
- Application No.: US17807884
- Application Date: 2022-06-21
- Publication No.: US11947561B2
- Publication Date: 2024-04-02
- Inventor: Peng Hui Jiang, Jun Su, Sheng Yan Sun, Hong Mei Zhang, Meng Wan
- Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Applicant Address: US NY Armonk
- Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
- Current Assignee Address: US NY Armonk
- Agent: Daniel G. DeLuca
- Main IPC: G06F16/25
- IPC: G06F16/25; G06F16/21; G06F16/242; G06F16/28

Abstract:
An embodiment for analyzing and tracking data flow to determine proper schemas for unstructured data. The embodiment may automatically use a sidecar to collect schema discovery rules during conversion of raw data to unstructured data. The embodiment may automatically generate multiple schemas for different tenants using the collected schema discovery rules. The embodiment may automatically use ETL to export unstructured data to SQL databases with the generated multiple schemas for the different tenants. The embodiment may automatically monitor usage data of the SQL databases and collect the usage data. The embodiment may automatically optimize schema discovery using the collected usage data. The embodiment may automatically discover schemas with hot usage and apply the discovered schemas with hot usage to other tenants for consumption and further monitoring.
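The flow described in the abstract can be illustrated with a minimal sketch. This is not the patented method: simple type inference stands in for the sidecar-collected schema discovery rules, an in-memory SQLite database stands in for the SQL targets, and every name here (infer_schema, export_tenant, hot_schemas, the tenant names, the sample records, and the hit threshold) is a hypothetical chosen for illustration.

```python
import sqlite3
from collections import Counter

# Assumption: Python value types map to SQL column types like this.
TYPE_NAMES = {int: "INTEGER", float: "REAL", str: "TEXT"}


def infer_schema(records):
    """Infer a column -> SQL type mapping from the union of record keys.

    Stand-in for the schema discovery rules; not the patent's rules.
    """
    schema = {}
    for rec in records:
        for key, value in rec.items():
            sql_type = TYPE_NAMES.get(type(value), "TEXT")
            # On a type conflict between records, fall back to TEXT.
            if schema.get(key, sql_type) != sql_type:
                sql_type = "TEXT"
            schema[key] = sql_type
    return schema


def export_tenant(conn, tenant, records, schema):
    """ETL step: create one table per tenant and load its records."""
    table = f"{tenant}_events"  # hypothetical naming convention
    names = list(schema)
    columns = ", ".join(f'"{n}" {schema[n]}' for n in names)
    quoted = ", ".join(f'"{n}"' for n in names)
    placeholders = ", ".join("?" for _ in names)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(
        f'INSERT INTO "{table}" ({quoted}) VALUES ({placeholders})',
        # Keys missing from a record are stored as NULL.
        [tuple(rec.get(n) for n in names) for rec in records],
    )


def hot_schemas(usage, min_hits):
    """Return tenants whose tables were queried at least min_hits times."""
    return [tenant for tenant, hits in usage.items() if hits >= min_hits]


# Hypothetical per-tenant unstructured records.
raw = {
    "tenant_a": [
        {"user": "u1", "latency_ms": 12},
        {"user": "u2", "latency_ms": 30, "region": "eu"},
    ],
    "tenant_b": [{"user": "u3", "note": "free text"}],
}

conn = sqlite3.connect(":memory:")
schemas = {tenant: infer_schema(recs) for tenant, recs in raw.items()}
for tenant, recs in raw.items():
    export_tenant(conn, tenant, recs, schemas[tenant])

# Assumption: usage monitoring yields a query count per tenant table.
usage = Counter({"tenant_a": 5, "tenant_b": 1})
hot = hot_schemas(usage, min_hits=3)

# Schemas with hot usage become candidates to offer to the other tenants.
candidates = {tenant: schemas[tenant] for tenant in hot}
row_count = conn.execute('SELECT COUNT(*) FROM "tenant_a_events"').fetchone()[0]
```

In this sketch, candidates holds the schema of each frequently queried tenant. A fuller version would apply those schemas to the remaining tenants and keep monitoring their usage, as the abstract describes.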
Public/Granted literature
- US20230409593A1 HETEROGENEOUS SCHEMA DISCOVERY FOR UNSTRUCTURED DATA Public/Granted day: 2023-12-21