Analyzing large-scale data processing jobs
Abstract:
Methods, systems, and apparatus for data analysis in a distributed computing system include accessing data stored at a first processing zone associated with a distributed data processing job, detecting information identifying a particular child job associated with the distributed data processing job, comparing the identifying information to data stored at a second processing zone, and identifying an additional child job as associated with the distributed data processing job based on a result of the comparison. The methods, systems, and apparatus further include correlating particular output data associated with the particular child job with additional output data associated with the additional child job for the distributed data processing job, determining performance data for the distributed data processing job based on the output data associated with each of the particular child job and the additional child job, and providing the performance data for the distributed data processing job for display.
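The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the record layout, the `lineage_key` field used as the "identifying information", and the function names `correlate_child_jobs` and `performance_summary` are all hypothetical, chosen only to show how child jobs logged in two processing zones might be matched and their output combined into performance data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildJobRecord:
    """Hypothetical log record for one child job (all fields are assumptions)."""
    child_id: str
    lineage_key: str   # assumed identifying information shared by related child jobs
    zone: str          # processing zone where the record is stored
    runtime_s: float   # assumed output metric: wall-clock runtime in seconds
    bytes_out: int     # assumed output metric: bytes written


def correlate_child_jobs(first_zone_log, second_zone_records, parent_job_id):
    """Return child jobs from both zones that belong to one distributed job.

    first_zone_log: dict mapping a parent job ID to its child records in zone one.
    second_zone_records: flat list of child records stored in zone two.
    """
    # Access first-zone data for the job and detect its child jobs.
    particular = list(first_zone_log.get(parent_job_id, []))
    # Extract the identifying information carried by those child jobs.
    keys = {record.lineage_key for record in particular}
    # Compare that information to second-zone data to find additional child jobs.
    additional = [r for r in second_zone_records if r.lineage_key in keys]
    return particular + additional


def performance_summary(children):
    """Aggregate the correlated output data into job-level performance data."""
    return {
        "child_jobs": len(children),
        "total_runtime_s": sum(r.runtime_s for r in children),
        "longest_child_s": max((r.runtime_s for r in children), default=0.0),
        "total_bytes_out": sum(r.bytes_out for r in children),
        "zones": sorted({r.zone for r in children}),
    }
```

The dictionary returned by `performance_summary` stands in for the performance data that would be provided for display. A real system would presumably read from zone-local log storage rather than in-memory collections.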