Systems and methods for Spark lineage data capture
Abstract:
Systems and methods for Spark lineage data capture are disclosed. In one embodiment, in an information processing apparatus comprising at least one computer processor, a method for lineage data capture may include: (1) receiving, at a lineage engine and from a listener service, a decisive logical plan for a job; (2) extracting, using a plan parser, lineage data from the decisive logical plan; (3) producing, by a job lineage builder, job lineage data and job attribute data from the lineage data; (4) extracting, by the job lineage builder and from the job lineage data and the job attribute data, attribute information, transformation information, and estimate information for the job; and (5) storing, in a database, the attribute information, the transformation information, and the estimate information.
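The five steps of the abstract can be illustrated with a minimal sketch. It is not the disclosed implementation: it uses no Spark API, and every name in it (`PlanNode`, `parse_plan`, `build_job_lineage`, `store`, `on_job_plan`, the table names) is hypothetical. The logical plan is modeled as a small tree of operator nodes, "estimate information" is assumed to mean per-operator row-count estimates, and an in-memory SQLite database stands in for the lineage store.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanNode:
    """Hypothetical stand-in for one operator in a logical plan tree."""
    op: str                                   # e.g. "Relation", "Filter", "Project"
    detail: str = ""                          # table name, predicate, or expression
    output: List[str] = field(default_factory=list)   # attribute names produced
    children: List["PlanNode"] = field(default_factory=list)
    rows: Optional[int] = None                # assumed: estimated row count


def parse_plan(node):
    """Step 2 (plan parser): flatten the tree into lineage records, leaves first."""
    records = []
    for child in node.children:
        records.extend(parse_plan(child))
    records.append({"op": node.op, "detail": node.detail,
                    "output": list(node.output), "rows": node.rows})
    return records


def build_job_lineage(job_id, records):
    """Steps 3-4 (job lineage builder): derive the three categories to store."""
    root = records[-1]  # post-order traversal puts the root operator last
    return {
        "attributes": [(job_id, name) for name in root["output"]],
        "transformations": [(job_id, r["op"], r["detail"])
                            for r in records if r["op"] != "Relation"],
        "estimates": [(job_id, r["op"], r["rows"])
                      for r in records if r["rows"] is not None],
    }


def store(conn, lineage):
    """Step 5: persist attribute, transformation, and estimate information."""
    conn.execute("CREATE TABLE IF NOT EXISTS attribute (job_id TEXT, name TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS transformation "
                 "(job_id TEXT, op TEXT, detail TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS estimate "
                 "(job_id TEXT, op TEXT, row_count INTEGER)")
    conn.executemany("INSERT INTO attribute VALUES (?, ?)", lineage["attributes"])
    conn.executemany("INSERT INTO transformation VALUES (?, ?, ?)",
                     lineage["transformations"])
    conn.executemany("INSERT INTO estimate VALUES (?, ?, ?)", lineage["estimates"])
    conn.commit()


def on_job_plan(conn, job_id, plan):
    """Step 1: entry point a listener service would call with a job's plan."""
    lineage = build_job_lineage(job_id, parse_plan(plan))
    store(conn, lineage)
    return lineage


# Hypothetical plan for: SELECT id, total FROM orders WHERE amount > 0
example_plan = PlanNode(
    op="Project", detail="id, total", output=["id", "total"],
    children=[PlanNode(
        op="Filter", detail="amount > 0", rows=50,
        children=[PlanNode(op="Relation", detail="orders",
                           output=["id", "amount", "total"], rows=100)])])
```

Calling `on_job_plan(sqlite3.connect(":memory:"), "job-1", example_plan)` would populate the three tables for that job. In a real deployment the plan would come from the engine's own listener mechanism rather than a hand-built tree, and the parser would have to handle the engine's actual operator types.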