One Rule (OneR) is a simple method based on a 1-level decision tree, described in 1993 by Rob Holte (Alberta, Canada).
Simple rules often outperformed far more complex methods on some datasets.
For each attribute,
    For each value of the attribute, make a rule as follows:
        count how often each class appears
        find the most frequent class
        make the rule assign that class to this attribute-value
    Calculate the error rate of this attribute’s rules
Choose the attribute with the smallest error rate
Example of output for the weather data set
outlook:
if sunny -> no
if overcast -> yes
if rainy -> yes
With this one-level decision tree, 10 of the 14 instances are classified correctly.
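The 10-of-14 figure can be checked by applying the three outlook rules to the data. The sketch below assumes the standard 14-instance nominal weather dataset distributed with Weka; the outlook and play columns are reproduced from that dataset and are not listed in these notes.

```python
# Outlook and play values of the 14 weather instances (assumption: the
# standard nominal weather dataset shipped with Weka, not given in the notes).
outlook = ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
           "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"]
play = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]

# The one-level decision tree on outlook, written as a lookup table.
rule = {"sunny": "no", "overcast": "yes", "rainy": "yes"}

# Count the instances whose predicted class matches the actual class.
correct = sum(rule[o] == p for o, p in zip(outlook, play))
print(f"{correct} of {len(play)} instances correct")
```

The four misclassified instances are the two sunny days with play = yes and the two rainy days with play = no.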
Algorithm to choose the best rule
For each attribute:
    For each value of that attribute, create a rule:
        1. count how often each class appears
        2. find the most frequent class, c
        3. make a rule "if attribute=value then class=c"
    Calculate the error rate of this attribute's rules
Pick the attribute whose rules produce the lowest error rate
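The steps above can be sketched in a few lines of Python. This is a minimal illustration for nominal attributes only, not Weka's implementation: the function name `one_r`, the dict-per-row data layout, and the toy dataset are all hypothetical, and ties between attributes are resolved in favour of the first one encountered (an arbitrary choice).

```python
from collections import Counter, defaultdict

def one_r(rows, attributes, target):
    """Return (best_attribute, rules, errors) for a list of dict rows."""
    best = None
    for attr in attributes:
        # 1. Count how often each class appears for each value of the attribute.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[attr]][row[target]] += 1
        # 2-3. Rule per value: "if attr == value then class = most frequent class".
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors of this attribute's rules: instances outside the majority class.
        # Every attribute covers the same rows, so comparing error counts is
        # equivalent to comparing error rates.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        # Keep the attribute with the fewest errors (first one wins on ties).
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best

# Hypothetical toy data, invented for illustration only.
rows = [
    {"colour": "red", "size": "small", "label": "apple"},
    {"colour": "red", "size": "large", "label": "apple"},
    {"colour": "green", "size": "small", "label": "apple"},
    {"colour": "green", "size": "large", "label": "melon"},
    {"colour": "green", "size": "large", "label": "melon"},
    {"colour": "red", "size": "large", "label": "melon"},
]
attr, rules, errors = one_r(rows, ["colour", "size"], "label")
print(attr, rules, f"error rate = {errors}/{len(rows)}")
```

On this toy data the rules on colour make 2 errors and the rules on size make 1, so size is selected.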
OneR always outperforms (or, at worst, equals) the baseline (ZeroR) when evaluated on the training data. However, evaluating on the training data doesn't reflect performance on independent test data.
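The reason is arithmetic: the baseline gets the overall majority class right, while a one-attribute rule set gets the majority class right within each attribute value, and the sum of the per-value majorities can never be smaller than the overall majority. The sketch below checks this on per-value class counts that are assumed to come from the standard 14-instance nominal weather dataset (the counts are not given in these notes).

```python
# Class counts (yes, no) per attribute value. Assumption: tallied from the
# standard 14-instance nominal weather dataset, not listed in the notes.
counts = {
    "outlook": {"sunny": (2, 3), "overcast": (4, 0), "rainy": (3, 2)},
    "temperature": {"hot": (2, 2), "mild": (4, 2), "cool": (3, 1)},
    "humidity": {"high": (3, 4), "normal": (6, 1)},
    "windy": {"false": (6, 2), "true": (3, 3)},
}

# Baseline: predict the overall majority class for every instance.
total_yes = sum(yes for yes, _ in counts["outlook"].values())
total_no = sum(no for _, no in counts["outlook"].values())
baseline_correct = max(total_yes, total_no)

# One rule per attribute value: predict the majority class within that value.
rule_correct = {attr: sum(max(pair) for pair in values.values())
                for attr, values in counts.items()}

print("baseline correct:", baseline_correct)
for attr, correct in rule_correct.items():
    print(attr, correct, correct >= baseline_correct)
```

Under these counts, temperature and windy only equal the baseline on the training data, while outlook and humidity beat it by one instance.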
ZeroR sometimes outperforms OneR on independent test data. If the target distribution is skewed or only limited data is available, predicting the majority class can yield better results than basing a rule on a single attribute. This happens with the nominal weather dataset.
Weka's “minBucketSize” parameter limits the complexity of the rules in order to avoid overfitting (default: 6).
With a “minBucketSize” of 1, the accuracy on the training set is very high (92.99%), and it decreases steadily as “minBucketSize” increases.
Evaluation by 10-fold cross-validation dampens this effect: apart from the two smallest values, the accuracy stays fairly stable across “minBucketSize” values.
minBucketSize | Cross-validation accuracy (%) | Training set accuracy (%) | Number of conditions generated |
---|---|---|---|
1 | 47.66 | 92.99 | 106 |
2 | 48.13 | 71.5 | 31 |
3 | 59.81 | 68.22 | 14 |
4 | 59.35 | 66.36 | 10 |
5 | 57.94 | 63.55 | 8 |
6 | 57.94 | 63.08 | 8 |
7 | 58.41 | 62.14 | 6 |
8 | 56.07 | 61.68 | 6 |
9 | 57.48 | 60.75 | 4 |
10 | 57.94 | 59.34 | 4 |
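One way to read the table is to look at the gap between training-set and cross-validation accuracy, which is a rough indicator of overfitting. The short sketch below only does arithmetic on the figures copied from the table; the variable names are illustrative.

```python
# (minBucketSize, cross-validation accuracy, training accuracy, conditions),
# copied from the table above.
results = [
    (1, 47.66, 92.99, 106), (2, 48.13, 71.5, 31), (3, 59.81, 68.22, 14),
    (4, 59.35, 66.36, 10), (5, 57.94, 63.55, 8), (6, 57.94, 63.08, 8),
    (7, 58.41, 62.14, 6), (8, 56.07, 61.68, 6), (9, 57.48, 60.75, 4),
    (10, 57.94, 59.34, 4),
]

# Gap between training and cross-validation accuracy for each setting.
gaps = {size: round(train - cv, 2) for size, cv, train, _ in results}
for size, gap in gaps.items():
    print(f"minBucketSize={size:2d}  gap={gap:5.2f}")

# Setting with the highest cross-validation accuracy.
best_size = max(results, key=lambda r: r[1])[0]
print("highest cross-validation accuracy at minBucketSize =", best_size)
```

The gap shrinks from about 45 points at a value of 1 to under 2 points at a value of 10, and the highest cross-validation accuracy in the table is reached at a value of 3.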