Systems and methods for generating distributed software packages using non-distributed source code
Abstract:
Systems and methods are provided for transcompiling the non-distributed source code of a non-distributed software program into a distributed software package for implementation on a distributed computing system. A transcompiler can identify loops within non-distributed source code written in a data-driven language. The transcompiler can generate a MapReduce job for each loop, using mapper keys based on the grouping indicators associated with that loop. The MapReduce jobs can be linked together based on the input-output connections between the loops in the non-distributed source code. The transcompiler can then generate a distributed software package that includes the generated MapReduce jobs and implements the same functionality as the non-distributed source code on the distributed computing system, thereby improving the speed of execution over very large datasets. The distributed software package can be optimized using machine learning search algorithms. The distributed software package can also be optimized based on execution usage statistics.
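The loop-to-job mapping described in the abstract can be illustrated with a minimal in-memory sketch. It is not the claimed transcompiler: it only shows, under stated assumptions, how two grouped loops might correspond to two MapReduce jobs whose mapper keys come from the loops' grouping fields, and how the jobs are linked by feeding the output of the first into the second. All names here (`run_job`, `store_totals`, `best_store`, `run_pipeline`, `SALES`, and the record fields) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def run_job(records, mapper_key, reducer):
    """Simulate one MapReduce job in memory (no real cluster is involved)."""
    groups = defaultdict(list)
    for record in records:
        # Map phase: the mapper key plays the role of the loop's grouping indicator.
        groups[mapper_key(record)].append(record)
    # Reduce phase: one reducer call per key, in sorted key order.
    return [reducer(key, values)
            for key, values in sorted(groups.items(), key=lambda kv: kv[0])]


def store_totals(key, values):
    """Reducer for the first loop: total sales per (region, store)."""
    region, store = key
    return {"region": region, "store": store,
            "total": sum(v["amount"] for v in values)}


def best_store(key, values):
    """Reducer for the second loop: highest store total per region."""
    return {"region": key, "best_total": max(v["total"] for v in values)}


def run_pipeline(records):
    # Job 1 corresponds to a loop grouped by region and store.
    totals = run_job(records, lambda r: (r["region"], r["store"]), store_totals)
    # Job 2 consumes job 1's output, mirroring the input-output
    # connection between the two loops; it is grouped by region only.
    return run_job(totals, lambda r: r["region"], best_store)


# Hypothetical sample data for the sketch.
SALES = [
    {"region": "east", "store": "a", "amount": 10},
    {"region": "east", "store": "a", "amount": 5},
    {"region": "east", "store": "b", "amount": 7},
    {"region": "west", "store": "c", "amount": 3},
]

if __name__ == "__main__":
    print(run_pipeline(SALES))
```

On a real distributed system the grouping step would be performed by the framework's shuffle rather than by a local dictionary; the sketch keeps everything in one process so that the correspondence between loops, mapper keys, and linked jobs is easy to follow.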