Invention Grant
- Patent Title: Detecting correlation from data
- Application No.: US10864463
- Application Date: 2004-06-10
- Publication No.: US07647293B2
- Publication Date: 2010-01-12
- Inventor: Paul Geoffrey Brown , Peter Jay Haas , Ihab F. Ilyas , Volker G. Markl
- Applicant: Paul Geoffrey Brown , Peter Jay Haas , Ihab F. Ilyas , Volker G. Markl
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: IP Authority, LLC
- Agent: Ramraj Soundararajan
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
A system and method for discovering dependencies between relational database column pairs, and for applying those discoveries to query optimization, are provided. For each candidate column pair remaining after simultaneously generating column pairs, pruning pairs that do not satisfy specified heuristic constraints, and eliminating pairs with trivial instances of correlation, a random sample of data values is collected. Each candidate column pair is tested for the existence of a soft functional dependency (FD) and, if no dependency is found, is statistically tested for correlation using a robust chi-squared statistic. Column pairs for which either a soft FD or a statistical correlation exists are prioritized for recommendation to a query optimizer, based on any of: strength of dependency, degree of correlation, or adjustment factor. Statistics for recommended column pairs are tracked to improve selectivity estimates. Additionally, a dependency graph representing correlations and dependencies as edges and column pairs as nodes is provided.
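
The two-stage test described in the abstract (soft FD first, then a chi-squared correlation test on the sample) can be illustrated with a minimal Python sketch. This is not the patented procedure: the function names, the FD strength measure (distinct left values divided by distinct value pairs), the normalization of the chi-squared statistic, and both thresholds are assumptions chosen for illustration only.

```python
from collections import Counter

def fd_strength(pairs):
    """Distinct left values divided by distinct (left, right) pairs.
    Equals 1.0 when every left value maps to one right value in the sample.
    (Assumed measure, for illustration only.)"""
    distinct_left = {a for a, _ in pairs}
    return len(distinct_left) / len(set(pairs))

def chi_squared(pairs):
    """Pearson chi-squared statistic over the sample's contingency table."""
    n = len(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    joint = Counter(pairs)
    stat = 0.0
    for a, n_a in left.items():
        for b, n_b in right.items():
            expected = n_a * n_b / n          # count expected under independence
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

def normalized_chi_squared(pairs):
    """Chi-squared divided by n * (min(d1, d2) - 1), giving a value in [0, 1]."""
    d = min(len({a for a, _ in pairs}), len({b for _, b in pairs}))
    if d < 2:
        return 0.0                            # a constant column carries no signal
    return chi_squared(pairs) / (len(pairs) * (d - 1))

def classify(pairs, fd_threshold=0.95, corr_threshold=0.3):
    """Test for a soft FD first; fall back to the correlation test.
    Both thresholds are hypothetical defaults, not values from the patent."""
    if not pairs:
        raise ValueError("sample must contain at least one row")
    if fd_strength(pairs) >= fd_threshold:
        return "soft_fd"
    if normalized_chi_squared(pairs) >= corr_threshold:
        return "correlated"
    return "independent"

# Small hand-made samples of (column_1, column_2) values.
fd_sample = [("a", 1), ("a", 1), ("b", 2), ("b", 2), ("c", 3)]
corr_sample = [("a", 1)] * 8 + [("a", 2)] * 2 + [("b", 1)] * 2 + [("b", 2)] * 8
indep_sample = [(x, y) for x in "ab" for y in (1, 2)] * 5

print(classify(fd_sample), classify(corr_sample), classify(indep_sample))
```

In this sketch the FD check runs first because it is cheap and, when it succeeds, makes the correlation test unnecessary, which mirrors the ordering the abstract describes.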
Public/Granted literature
- US20050278357A1 Detecting correlation from data Public/Granted day: 2005-12-15