Streamlining data processing optimizations for machine learning workloads
Abstract:
Techniques for refinement of data pipelines are provided. An original file of serialized objects is received, and an original pipeline comprising a plurality of transformations is identified based on the original file. A first computing cost is determined for a first transformation of the plurality of transformations. The first transformation is modified using a predefined optimization, and a second cost of the modified first transformation is determined. Upon determining that the second cost is lower than the first cost, the first transformation is replaced, in the original pipeline, with the optimized first transformation.
Information query
Patent Agency Ranking
0/0