enriczhang / binsembler
This project is forked from umeshnmenon/binsembler.
Binsembler - A Binwise Ensembler.

Ensemble techniques combine the perspectives of several models by aggregating their predictions, which tends to improve overall prediction accuracy. Popular choices are a majority vote, a simple average of the predicted probabilities, or a weighted average of the predicted probabilities in which each model's weight is its F1 score, accuracy, or some other performance measure.

Here we propose a novel approach that also aggregates the predicted probabilities as a weighted average, but takes the weights from a performance statistic computed for the bin each probability falls in. The idea is to divide each model's predicted probabilities on a validation set into equal-sized bins (preferably deciles) and calculate the metrics in each bin. Pick any one metric and record its value for each bin in a mapping table; these values are the weights used in the weighted ensemble.

When a prediction is to be made on new data, first map each model's predicted probability to the appropriate bin, then look up that bin's metric value in the mapping table and multiply it by the predicted probability. Repeat the same for the second model. Finally, calculate the new predicted probability as the weighted average.
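The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the repository's actual code: the function names (`fit_bin_weights`, `lookup_weights`, `binsemble`) are hypothetical, the bins are taken to be equal-frequency quantile bins, the per-bin metric is assumed to be accuracy at a 0.5 threshold, and "weighted average" is read as the sum of weight times probability divided by the sum of weights. The fallback to a simple average when all weights are zero is also an assumption.

```python
import numpy as np


def fit_bin_weights(val_probs, val_labels, n_bins=10):
    """Build the bin -> metric mapping table for one model on a validation set.

    Assumptions: equal-frequency (quantile) bins, and accuracy at a 0.5
    threshold as the per-bin metric. Any other metric could be substituted.
    """
    val_probs = np.asarray(val_probs, dtype=float)
    val_labels = np.asarray(val_labels, dtype=int)
    # Quantile edges; only the inner edges are needed to assign bins.
    edges = np.quantile(val_probs, np.linspace(0.0, 1.0, n_bins + 1))
    inner = edges[1:-1]
    bin_ids = np.searchsorted(inner, val_probs, side="right")
    preds = (val_probs >= 0.5).astype(int)
    weights = np.zeros(n_bins)
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():  # empty bins keep weight 0
            weights[b] = np.mean(preds[mask] == val_labels[mask])
    return inner, weights


def lookup_weights(probs, inner, weights):
    """Map new probabilities to bins and return each bin's metric value."""
    bin_ids = np.searchsorted(inner, np.asarray(probs, dtype=float), side="right")
    return weights[bin_ids]


def binsemble(prob_list, tables):
    """Combine per-model probabilities using bin-wise weights.

    prob_list: one array of predicted probabilities per model.
    tables:    one (inner_edges, weights) pair per model, from fit_bin_weights.
    """
    prob_arrays = [np.asarray(p, dtype=float) for p in prob_list]
    num = np.zeros_like(prob_arrays[0])
    den = np.zeros_like(prob_arrays[0])
    for probs, (inner, weights) in zip(prob_arrays, tables):
        w = lookup_weights(probs, inner, weights)
        num += w * probs
        den += w
    # Assumption: fall back to a simple average where every weight is zero.
    simple = np.mean(np.vstack(prob_arrays), axis=0)
    safe_den = np.where(den > 0, den, 1.0)
    return np.where(den > 0, num / safe_den, simple)


# Toy usage with two models and two bins (made-up numbers for illustration).
table_a = fit_bin_weights([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], n_bins=2)
table_b = fit_bin_weights([0.3, 0.4, 0.6, 0.7], [1, 0, 1, 1], n_bins=2)
combined = binsemble([[0.2, 0.9], [0.4, 0.6]], [table_a, table_b])
print(combined)
```

In the toy usage, the second model is only half right in its lower bin on the validation data, so its probability there counts half as much as the first model's; in the upper bin both models were fully accurate and the result reduces to a plain average.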