Bootstrap aggregating (bagging) is a machine learning ensemble meta-algorithm designed to improve the stability and accuracy of machine learning algorithms used in statistical classification and regression.
It works in three steps:
- Produce several training sets, each the same size as the original, by sampling from the original with replacement
- Build a model on each training set using the same machine learning scheme
- Combine the models' predictions by voting for a nominal target or averaging for a numeric target
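The procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the base learner `train_1nn` (1-nearest-neighbour on a single numeric feature), the helper names, and the toy data are all assumptions made for the example, and any learning scheme could take the base learner's place.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) instances from data uniformly, with replacement."""
    return [rng.choice(data) for _ in data]

def train_1nn(train):
    """Hypothetical base learner: 1-nearest-neighbour on one numeric feature."""
    def predict(x):
        nearest = min(train, key=lambda row: abs(row[0] - x))
        return nearest[1]
    return predict

def bagging_fit(data, n_models, seed=0):
    """Build one model per bootstrap sample, all with the same scheme."""
    rng = random.Random(seed)
    return [train_1nn(bootstrap_sample(data, rng)) for _ in range(n_models)]

def predict_vote(models, x):
    """Nominal target: return the class predicted by most models."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]

def predict_mean(models, x):
    """Numeric target: return the average of the models' predictions."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

# Toy data (assumed for illustration): (feature, class label) pairs.
data = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (8.0, "b"), (9.0, "b"), (10.0, "b")]
models = bagging_fit(data, n_models=25)
print(predict_vote(models, 1.5), predict_vote(models, 9.5))
```

The only difference between classification and regression here is the combination step: `predict_vote` for a nominal target, `predict_mean` for a numeric one.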
Bagging can be parallelized, because each model is trained on its own sample independently of the others.
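A rough sketch of why this is easy to parallelize: if every ensemble member gets its own random seed, training one member needs nothing from the others, so the jobs can simply be mapped over a worker pool. The function names and the trivial "predict the sample mean" base learner below are assumptions for illustration. A thread pool keeps the example short; for CPU-bound pure-Python training, a process pool would likely be needed to see a real speed-up.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def fit_one(job):
    """Train one ensemble member from its own seed, independently of the rest."""
    data, seed = job
    rng = random.Random(seed)
    sample = [rng.choice(data) for _ in data]  # bootstrap sample
    # Hypothetical base learner: always predict the sample's mean target.
    return sum(y for _, y in sample) / len(sample)

def fit_parallel(data, n_models, max_workers=4):
    """Fit n_models members concurrently; results come back in job order."""
    jobs = [(data, seed) for seed in range(n_models)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fit_one, jobs))

# Toy numeric data (assumed for illustration): (feature, target) pairs.
data = [(x, 2.0 * x) for x in range(10)]
members = fit_parallel(data, n_models=8)
ensemble_prediction = sum(members) / len(members)  # averaging step
print(ensemble_prediction)
```

Because each member is determined by its seed alone, the parallel run gives the same ensemble as training the members one after another.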
Advantages / Disadvantages
It is well suited to “unstable” learning schemes, that is, schemes in which a small change in the training data can produce a large change in the model (decision trees are a common example).
Bagging samples the training set “with replacement”, which means that the same instance can appear more than once in a sample.
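A small experiment makes the effect of sampling with replacement concrete. When n instances are drawn from a set of size n, each instance has probability 1 - (1 - 1/n)^n of appearing at least once, which approaches 1 - 1/e, roughly 63.2%, as n grows. The rest of the sample is made up of repeats. The set size and seed below are arbitrary choices for the illustration.

```python
import random

rng = random.Random(42)       # arbitrary seed, for reproducibility only
n = 1000                      # assumed training-set size
instances = list(range(n))

# Draw n instances with replacement: duplicates are allowed.
sample = rng.choices(instances, k=n)

unique_fraction = len(set(sample)) / n   # share of distinct instances drawn
expected = 1 - (1 - 1 / n) ** n          # theoretical share, close to 0.632

print(f"distinct instances in sample: {unique_fraction:.3f}")
print(f"expected fraction:            {expected:.3f}")
```

The instances that never make it into a given sample are often called "out-of-bag" for that sample.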