Ensemble methods: Bagging & Boosting

Such situations are handled by boosting.

Boosting is a sequential process, where each subsequent model attempts to correct the errors of the previous model.

Each succeeding model is therefore dependent on the one before it.

Let’s understand how boosting works through the following steps:

1. A subset is created from the original dataset.
2. Initially, all data points are given equal weights.
3. A base model is created on this subset.
4. This model is used to make predictions on the whole dataset.
5. Errors are calculated using the actual values and predicted values.
6. The observations which are incorrectly predicted are given higher weights. (In the accompanying illustration, the three misclassified blue-plus points would be given higher weights.)
7. Another model is created and predictions are made on the dataset. (This model tries to correct the errors of the previous model.)
8. Similarly, multiple models are created, each correcting the errors of the previous one.
9. The final model (the strong learner) is the weighted mean of all the models (the weak learners).

Thus, the boosting algorithm combines a number of weak learners to form a strong learner.

The individual models would not perform well on the entire dataset, but each works well on some part of it.

In this way, each model boosts the performance of the ensemble.
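To make these steps concrete, here is a minimal sketch of the weight-update loop in Python, in the style of AdaBoost (the classic boosting algorithm). It is an illustrative sketch, not the exact procedure from the article: it assumes labels coded as -1/+1, uses sample weights in place of explicit resampling to realise steps 1–2, and the function name `boost` and its parameters are made up for this example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, n_rounds=10):
    """AdaBoost-style boosting; labels in y must be -1 or +1."""
    y = np.asarray(y)
    n = len(y)
    weights = np.full(n, 1.0 / n)        # steps 1-2: equal weights at the start
    learners, alphas = [], []

    for _ in range(n_rounds):
        # Steps 3-4: fit a weak learner (a depth-1 "stump") on the
        # weighted data, then predict on the whole dataset.
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=weights)
        pred = stump.predict(X)

        # Step 5: weighted error from actual vs. predicted values
        # (weights always sum to 1, so this is an error rate).
        err = np.clip(weights[pred != y].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)  # this model's say in the vote

        # Step 6: increase the weights of misclassified points,
        # decrease the rest, then renormalise.
        weights *= np.exp(-alpha * y * pred)
        weights /= weights.sum()

        learners.append(stump)           # steps 7-8: keep adding models
        alphas.append(alpha)

    # Step 9: the strong learner is the weighted vote of the weak ones.
    def predict(X_new):
        votes = sum(a * m.predict(X_new) for a, m in zip(alphas, learners))
        return np.sign(votes)

    return predict
```

The alpha values are what make the final combination a weighted mean: rounds with lower weighted error get a larger say in the final vote.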

Which is the best, Bagging or Boosting?

There’s not an outright winner; it depends on the data, the simulation and the circumstances.

Bagging and Boosting decrease the variance of your single estimate as they combine several estimates from different models.

So the result may be a model with higher stability.

If the problem is that the single model has very low performance, Bagging will rarely improve the bias.

However, Boosting can generate a combined model with lower errors, since it builds on the strengths of the single model while compensating for its weaknesses.

By contrast, if the problem with the single model is over-fitting, then Bagging is the best option.

Boosting, for its part, doesn’t help to avoid over-fitting; in fact, the technique is prone to this problem itself.

For this reason, Bagging is effective more often than Boosting.
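As a rough illustration of this trade-off, the snippet below fits a bagged ensemble of deep trees and a boosted ensemble of shallow stumps on the same synthetic data. The dataset and all parameter values are arbitrary choices for the example, and the keyword `estimator` assumes scikit-learn 1.2 or later (older versions called it `base_estimator`).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, purely for illustration.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Bagging: deep trees (low bias, high variance), averaged to reduce variance.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=None),
    n_estimators=100, random_state=0)

# Boosting: shallow stumps (high bias), combined sequentially to reduce bias.
boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=100, random_state=0)

for name, model in [("Bagging", bagging), ("Boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Whichever ensemble scores higher here proves nothing in general; as argued above, the winner depends on whether the base learner’s main weakness is variance (which favours Bagging) or bias (which favours Boosting).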

End Notes

Ensemble modelling can dramatically boost the performance of your model and can sometimes be the deciding factor between first place and second in many competitions! In this article, we covered two important ensemble learning techniques.

I hope this article has given you a solid understanding of this topic.

If you have any suggestions or questions, do share in the comment section below.
