Parallel scoring of an ensemble model
Abstract:
Methods and systems for parallel scoring of an ensemble model are provided. Aspects include loading data into a first distributed data structure having a plurality of partitions, each partition holding the loaded data as a set of pairs, each pair formed of a record to be scored and a partial score for that record. A component model of the ensemble model is selected, and the records are processed in parallel across the partitions, which includes updating the partial score for each record. In response to the partial score for a record not meeting an accuracy threshold, the method retains the record in the first distributed data structure to be scored by a subsequent component model. In response to the partial score for a record meeting the accuracy threshold, the method moves the record and its updated partial score to an output result data structure to provide a final score.
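The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the "distributed data structure" is stood in for by an in-memory list of partitions processed by a thread pool, and the names `score_partition`, `score_ensemble`, and `num_partitions` are hypothetical. The sketch further assumes that each component model is a callable taking a record and its current partial score and returning the updated partial score, that "meeting the accuracy threshold" is a simple `>=` comparison on that score, and that records still retained after the last component model are emitted with their latest partial score. The abstract specifies none of these details.

```python
from concurrent.futures import ThreadPoolExecutor


def score_partition(partition, model, threshold):
    """Apply one component model to every (record, partial_score) pair."""
    retained, finished = [], []
    for record, partial in partition:
        updated = model(record, partial)  # update the partial score
        if updated >= threshold:          # assumed form of the threshold test
            finished.append((record, updated))
        else:
            retained.append((record, updated))
    return retained, finished


def score_ensemble(records, models, threshold, num_partitions=4):
    """Score records with a sequence of component models, partition by partition."""
    # Load the records into partitions as (record, partial_score) pairs.
    # Starting every partial score at 0.0 is an assumption of this sketch.
    partitions = [
        [(record, 0.0) for record in records[i::num_partitions]]
        for i in range(num_partitions)
    ]
    output = []  # stands in for the output result data structure

    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        for model in models:  # select the next component model
            # Process all partitions in parallel with the selected model.
            results = list(
                pool.map(
                    lambda part: score_partition(part, model, threshold),
                    partitions,
                )
            )
            # Retained pairs stay in their partition for the next model;
            # pairs that met the threshold move to the output.
            partitions = [retained for retained, _ in results]
            for _, finished in results:
                output.extend(finished)
            if not any(partitions):  # nothing left to score
                break

    # Assumption: leftover records are emitted with their latest partial score.
    for partition in partitions:
        output.extend(partition)
    return output
```

Under these assumptions, a record whose partial score meets the threshold after the first component model is never passed to later models, which is where the approach would save work compared with evaluating every model on every record.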