Statistics - Resampling through Random Percentage Split
1 - About
Percentage Split (also called Fixed or Holdout) is a resampling method that leaves out a random N% of the original data.
For example, you might select:
- 75% of the rows to form the training set for building the model
- 25% of the rows to form the test set for testing the model.
The algorithm is trained on the training set, and the accuracy is calculated on the test set.
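The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (percentage_split, fit_threshold, accuracy), the toy data and the threshold "model" are hypothetical and do not come from any particular library.

```python
import random


def percentage_split(rows, train_fraction=0.75, seed=None):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train_rows):
    """Toy learner (assumption): midpoint between the two classes."""
    negatives = [x for x, label in train_rows if not label]
    positives = [x for x, label in train_rows if label]
    return (max(negatives) + min(positives)) / 2


def accuracy(predict, test_rows):
    """Fraction of test rows whose label is predicted correctly."""
    correct = sum(1 for x, label in test_rows if predict(x) == label)
    return correct / len(test_rows)


# Toy data set (assumption): one numeric feature, label is True when x >= 50.
data = [(x, x >= 50) for x in range(100)]

# 75% of the rows build the model, the remaining 25% test it.
train, test = percentage_split(data, train_fraction=0.75, seed=0)
threshold = fit_threshold(train)
test_accuracy = accuracy(lambda x: x >= threshold, test)

print(len(train), len(test), test_accuracy)
```

The model never sees the test rows during training, so the accuracy is measured on data that was held out.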
2 - Articles Related
3 - Standard Deviation in Validation
When a random percentage split is repeated for validation, there is a good chance that the test sets of the different runs overlap. The algorithm has therefore already seen (or learned from) some of these instances in an earlier run. With cross-validation, the test folds are disjoint, so this overlap does not occur. This is why the standard deviation estimate tends to be smaller for percentage split than for cross-validation.
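The difference in overlap can be checked with a small sketch. It compares the test sets drawn by repeated random 25% splits with the test folds of a 4-fold cross-validation over the same 100 row indices. The function names and the chosen sizes are assumptions made for illustration only.

```python
import random
from itertools import combinations


def random_test_sets(n_rows, test_fraction, repeats, seed=0):
    """Draw an independent random test set for each repeated split."""
    rng = random.Random(seed)
    test_size = int(round(n_rows * test_fraction))
    return [set(rng.sample(range(n_rows), test_size)) for _ in range(repeats)]


def kfold_test_sets(n_rows, k, seed=0):
    """Shuffle the row indices once and deal them into k disjoint folds."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return [set(indices[i::k]) for i in range(k)]


def total_overlap(test_sets):
    """Sum of shared rows over every pair of test sets."""
    return sum(len(a & b) for a, b in combinations(test_sets, 2))


# Five repeated 25% splits of 100 rows: 125 test slots for 100 rows,
# so at least one row must appear in more than one test set.
split_sets = random_test_sets(n_rows=100, test_fraction=0.25, repeats=5)

# 4-fold cross-validation: each row lands in exactly one test fold.
fold_sets = kfold_test_sets(n_rows=100, k=4)

print("percentage split overlap:", total_overlap(split_sets))
print("cross-validation overlap:", total_overlap(fold_sets))
```

The sketch only counts shared rows. It does not measure the standard deviation itself.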