Invention Grant
- Patent Title: Accelerating data profiling process
- Patent Title (中): 加速数据剖析过程
- Application No.: US13645730
- Application Date: 2012-10-05
- Publication No.: US08719271B2
- Publication Date: 2014-05-06
- Inventor: Sebastian Nelke, Martin Oberhofer, Yannick Saillet, Jens Seifert
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Law Office of Jim Boice
- Priority: EP11184173 20111006
- Main IPC: G06F17/30
- IPC: G06F17/30

Abstract:
A data profile request is handled by utilizing data in a distributed file system. Tabular data is extracted from a data source and stored in a distributed file system. Each table in the tabular data is split by columns, each of which is stored in a separate file on a set of physical nodes of the distributed file system. In response to a data profiling request, a master node determines, based on the profiling request, which groups of files need to be on the same physical node in order to perform the profiling analysis. The master node creates jobs using physical nodes that contain the requisite files needed for each job.
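The flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the patented implementation: the function names (`split_table_by_columns`, `place_files`, `plan_jobs`), the round-robin placement policy, and the rule of picking the node that already holds most of a group's files are all assumptions, and the column "files" are in-memory lists rather than files in a real distributed file system.

```python
from collections import Counter
from itertools import cycle


def split_table_by_columns(table_name, header, rows):
    """Split one table into per-column 'files' keyed by (table, column)."""
    return {
        (table_name, col): [row[i] for row in rows]
        for i, col in enumerate(header)
    }


def place_files(file_keys, nodes):
    """Assign each column file to a node round-robin (assumed policy)."""
    return dict(zip(file_keys, cycle(nodes)))


def plan_jobs(file_groups, placement):
    """For each group of files a job needs, pick the node that already holds
    most of them and list the files that would still have to be copied there.
    (Assumed heuristic; ties go to the node seen first in the group.)"""
    jobs = []
    for group in file_groups:
        counts = Counter(placement[f] for f in group)
        node = counts.most_common(1)[0][0]
        missing = [f for f in group if placement[f] != node]
        jobs.append({"node": node, "files": list(group), "to_copy": missing})
    return jobs


# Hypothetical example data: one small table and two nodes.
header = ["id", "city", "zip"]
rows = [(1, "Berlin", "10115"), (2, "Munich", "80331")]
files = split_table_by_columns("customers", header, rows)
placement = place_files(list(files), ["node-a", "node-b"])

# Groups of column files that a profiling request would need together:
# a single-column analysis and a cross-column analysis.
groups = [
    [("customers", "id")],
    [("customers", "city"), ("customers", "zip")],
]
jobs = plan_jobs(groups, placement)
for job in jobs:
    print(job)
```

In this sketch the single-column job runs where its file already sits, while the cross-column job reports which file would have to be co-located first. A real master node would presumably weigh file sizes and node load as well, which the sketch ignores.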
Public/Granted literature
- US20130091094A1 ACCELERATING DATA PROFILING PROCESS, Public/Granted day: 2013-04-11