Invention Grant
- Patent Title: Repairing data through domain knowledge
- Application No.: US16161695
- Application Date: 2018-10-16
- Publication No.: US10970271B2
- Publication Date: 2021-04-06
- Inventor: Kris Kuppuswamy Ganjam, Yeye He, Anja Gruenheid
- Applicant: Microsoft Technology Licensing, LLC
- Applicant Address: US WA Redmond
- Assignee: Microsoft Technology Licensing, LLC
- Current Assignee: Microsoft Technology Licensing, LLC
- Current Assignee Address: US WA Redmond
- Agency: Workman Nydegger
- Main IPC: G06F16/23
- IPC: G06F16/23; G06F16/215; G06F16/28; G06F16/35; G06F16/2457

Abstract:
Correcting data in a dataset. A set of data tokens from a tabular data store are grouped into a plurality of different clusters based on similarity of tokens. A reference cluster is selected from among the plurality of different clusters such that the plurality of clusters includes a reference cluster and one or more other clusters. One or more tokens in the one or more other clusters are transformed. The effect on the reference cluster of adding the transformed tokens to the reference cluster is determined. Using this information, a correction for a token in the dataset is identified. The data store is updated to correct the token using the identified correction.
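The steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. Everything in it is an assumption made for illustration: the character-shape similarity measure (`pattern`), the choice of the largest cluster as the reference, the `cohesion` score used to measure the effect on the reference cluster, and the hypothetical `TRANSFORMS` table of candidate transformations.

```python
from collections import Counter, defaultdict

def pattern(token):
    """Coarse token shape: digits -> 'd', letters -> 'a', other characters kept."""
    return "".join("d" if c.isdigit() else "a" if c.isalpha() else c for c in token)

def cluster(tokens):
    """Group tokens whose shapes are identical (a stand-in for token similarity)."""
    groups = defaultdict(list)
    for t in tokens:
        groups[pattern(t)].append(t)
    return list(groups.values())

def cohesion(tokens):
    """Fraction of tokens sharing the most common shape (1.0 means fully uniform)."""
    counts = Counter(pattern(t) for t in tokens)
    return counts.most_common(1)[0][1] / len(tokens)

# Hypothetical candidate transformations; a real system would draw on domain knowledge.
TRANSFORMS = {
    "slash_to_dash": lambda t: t.replace("/", "-"),
    "dot_to_dash": lambda t: t.replace(".", "-"),
    "strip": str.strip,
}

def suggest_corrections(tokens):
    clusters = cluster(tokens)
    reference = max(clusters, key=len)   # assumption: the largest cluster is the reference
    base = cohesion(reference)
    corrections = {}
    for other in clusters:
        if other is reference:
            continue
        for token in other:
            for fn in TRANSFORMS.values():
                candidate = fn(token)
                if candidate == token:
                    continue             # this transformation does not apply
                # Effect on the reference cluster of adding the transformed token:
                # accept only if the cluster stays as uniform as before.
                if cohesion(reference + [candidate]) >= base:
                    corrections[token] = candidate
                    break
    return corrections

column = ["2018-10-16", "2019-02-14", "2021-04-06", "2018/10/16", " 2020-01-01"]
corrections = suggest_corrections(column)
# Update the (in-memory) column with the identified corrections.
repaired = [corrections.get(t, t) for t in column]
```

In this toy column the three dash-separated dates form the reference cluster. The slash-separated date and the date with a leading space are each transformed until they fit it, and tokens that no transformation can fit are left unchanged.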
Public/Granted literature
- US20190050447A1 Repairing Data Through Domain Knowledge, Public/Granted day: 2019-02-14