Machine learning - Bootstrap aggregating (bagging)
1 - About
Bootstrap aggregating (bagging) is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.
It also reduces variance and helps to avoid over-fitting. Although it is usually applied to decision tree methods, it can be used with any type of learning method.
Bagging:
- produces several training sets of the same size as the original by sampling it with replacement,
- then builds a model on each training set using the same machine learning scheme,
- and combines the predictions by voting (for a nominal target) or averaging (for a numeric target).
Because each model is built independently of the others, bagging can be parallelized (see the sketch below).
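A minimal sketch of these three steps, assuming a decision-tree base learner (the helper names bagging_fit / bagging_predict are illustrative, not part of the original text):

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, n_models=10, seed=0):
    """Steps 1 and 2: draw bootstrap samples and fit one model per sample."""
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Each training set has the same size as the original and is
        # drawn with replacement (a bootstrap sample).
        idx = rng.integers(0, n, size=n)
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    """Step 3: combine by majority vote (nominal target).
    For a numeric target, train regressors and average their outputs instead."""
    votes = np.array([m.predict(X) for m in models])  # shape (n_models, n_rows)
    return np.array([Counter(col).most_common(1)[0][0] for col in votes.T])
```

Since the loop iterations share nothing, each fit can run on a separate worker, which is where the parallelism comes from.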
2 - Advantages / Disadvantages
Bagging is especially suitable for “unstable” learning schemes, where a small change in the training data can produce a big change in the model.
Example: decision trees are a very unstable scheme, whereas Naïve Bayes (where every attribute contributes independently) and instance-based learning are not.
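One way to see this instability (the synthetic dataset and the disagreement measure here are illustrative assumptions): retrain each scheme on slightly different subsamples and count how often its predictions on a fixed probe set change.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_probe = X[:100]  # fixed probe set used to compare predictions across runs

def disagreement(make_model, n_runs=20, seed=0):
    """Train on slightly different 90% subsamples; return the fraction of
    probe points whose predicted class is not unanimous across the runs."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_runs):
        idx = rng.choice(len(X), size=int(0.9 * len(X)), replace=False)
        preds.append(make_model().fit(X[idx], y[idx]).predict(X_probe))
    preds = np.array(preds)
    return float(np.mean([len(set(col)) > 1 for col in preds.T]))

print("decision tree:", disagreement(DecisionTreeClassifier))  # typically high
print("naive bayes:  ", disagreement(GaussianNB))              # typically low
```

On a run like this the tree's predictions typically flip on a noticeable fraction of probe points while Naïve Bayes stays nearly constant; that prediction variance is exactly what bagging averages away.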
3 - Replacement
In bagging, you sample the training set “with replacement”, which means the same instance can appear more than once in a given sample (and some instances may not appear at all).
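For instance, a minimal sketch with NumPy (the toy values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(42)
training_set = np.array([10, 20, 30, 40, 50])

# Same size as the original, drawn with replacement: the same instance
# may appear twice while another is left out entirely.
sample = rng.choice(training_set, size=len(training_set), replace=True)
print(sample)  # a possible draw: [40 10 40 20 50] -- note the duplicate 40
```

A consequence of sampling with replacement is that a bootstrap sample of size n contains, on average, only about 63.2% (1 - 1/e) of the distinct original instances.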
4 - Implementation
4.1 - Weka
In the Weka explorer, select the classifier under meta > Bagging (the weka.classifiers.meta.Bagging class).
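For readers working outside Weka, a roughly equivalent setup in scikit-learn (its BaggingClassifier, shown purely as a point of comparison; the dataset here is synthetic) might look like:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# 10 trees, each fitted on a bootstrap sample the size of the training
# set; n_jobs=-1 fits them in parallel since the models are independent.
bagged = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                           n_jobs=-1, random_state=0)
print(cross_val_score(bagged, X, y, cv=5).mean())
```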