About

Classifiers create boundaries in instance space. Different classifiers have different biases. You can explore them by visualizing the classification boundaries.

In Weka, the boundary visualization is restricted to numeric attributes and to 2D plots, i.e. two attributes at a time.
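
The plots discussed below come from Weka's boundary visualizer. As a minimal sketch (assuming weka.jar is on the classpath), the tool lives in the weka.gui.boundaryvisualizer package; launching it through its main() method is an assumption that may differ between Weka versions:

    // Sketch: open the boundary visualizer GUI programmatically.
    import weka.gui.boundaryvisualizer.BoundaryVisualizer;

    public class LaunchBoundaryVisualizer {
        public static void main(String[] args) throws Exception {
            // Opens the visualizer window; the dataset, the two numeric
            // attributes to plot, and the classifier are chosen in the GUI.
            BoundaryVisualizer.main(new String[0]);
        }
    }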

Example

  • Logistic Regression produces a linear boundary with a gradual transition from one color to another. Logistic regression is a sophisticated way of choosing a linear decision boundary for classification (see the API sketch after this list).
  • Support Vector Machine: the resulting plot has no areas of pure color.
  • Random Forest: the boundary has a checkered pattern with slightly fuzzy edges.
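
A minimal sketch of how these differences show up through the Weka Java API. The dataset name below is a placeholder for any dataset with numeric attributes; the gradual or blocky color transitions in the plots correspond to the class probabilities returned by distributionForInstance:

    import weka.classifiers.Classifier;
    import weka.classifiers.functions.Logistic;
    import weka.classifiers.functions.SMO;
    import weka.classifiers.trees.RandomForest;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class BoundaryProbabilities {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("some2d.arff");   // hypothetical dataset
            data.setClassIndex(data.numAttributes() - 1);

            Classifier[] models = { new Logistic(), new SMO(), new RandomForest() };
            for (Classifier model : models) {
                model.buildClassifier(data);
                // Class membership probabilities for the first instance:
                // Logistic changes them gradually across the boundary, while
                // tree-based models tend to produce blockier estimates.
                double[] dist = model.distributionForInstance(data.instance(0));
                System.out.println(model.getClass().getSimpleName()
                        + " -> " + java.util.Arrays.toString(dist));
            }
        }
    }
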
Algorithm                Boundary Shape
Logistic Regression      strictly linear
kNN                      piecewise linear
Support Vector Machine   piecewise linear
Decision Tree            non-linear

The kNN decision boundary in any localized region of instance space is linear, determined by the nearest neighbors of the various classes in that region. But the neighbors change as you move around instance space, so the overall boundary is a set of linear segments joined together, i.e. piecewise linear.
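
A sketch of what the visualizer effectively does for kNN: sample a grid of points over two numeric attributes and ask IBk for the predicted class at each point. The dataset name, attribute indices, and grid bounds below are illustrative only:

    import weka.classifiers.lazy.IBk;
    import weka.core.DenseInstance;
    import weka.core.Instance;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class KnnBoundaryGrid {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("some2d.arff");   // hypothetical 2D dataset
            data.setClassIndex(data.numAttributes() - 1);

            IBk knn = new IBk(5);          // 5 nearest neighbors
            knn.buildClassifier(data);

            // Sample a coarse grid over the first two attributes; the predicted
            // class changes along straight segments whose positions are fixed by
            // the nearest training points, hence a piecewise linear boundary.
            for (double y = 0; y <= 1; y += 0.1) {
                StringBuilder row = new StringBuilder();
                for (double x = 0; x <= 1; x += 0.1) {
                    Instance point = new DenseInstance(data.numAttributes());
                    point.setDataset(data);
                    point.setValue(0, x);
                    point.setValue(1, y);
                    row.append((int) knn.classifyInstance(point));
                }
                System.out.println(row);
            }
        }
    }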

Support vector machines also produce piecewise linear boundaries.

C4.5 (implemented in Weka as J48) produces decision trees, which create non-linear boundaries in instance space.
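
As a sketch, printing a trained J48 tree shows the single-attribute threshold tests that carve instance space into axis-parallel rectangles, which is why the resulting boundary is non-linear (the dataset name is a placeholder):

    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class TreeBoundary {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("some2d.arff");   // hypothetical dataset
            data.setClassIndex(data.numAttributes() - 1);

            J48 tree = new J48();
            tree.buildClassifier(data);

            // Each node tests one attribute against a threshold (e.g. "x <= 2.45"),
            // so the regions are axis-parallel rectangles and the overall boundary
            // is a non-linear staircase of straight pieces.
            System.out.println(tree);
        }
    }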

Logistic regression is a sophisticated way of producing a good linear decision boundary; because that boundary is necessarily simple, the model is less likely to overfit.
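
For reference, why the boundary is linear (standard logistic regression notation, nothing Weka-specific): the model estimates

    P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + e^{-(\mathbf{w}^{\top}\mathbf{x} + b)}}

and the decision boundary, where this probability equals 1/2, is the hyperplane \mathbf{w}^{\top}\mathbf{x} + b = 0, which is linear in \mathbf{x}. The smooth sigmoid on either side of that hyperplane is what produces the gradual color transition in the plot.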

A support vector machine also produces piecewise linear boundaries, but it is resilient against overfitting because the boundary depends only on a small number of support vectors.
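
In the standard formulation (general SVM notation, not Weka-specific), the decision function is

    f(\mathbf{x}) = \operatorname{sign}\Big( \sum_{i \in SV} \alpha_i \, y_i \, K(\mathbf{x}_i, \mathbf{x}) + b \Big)

where the sum runs only over the support vectors, i.e. the training instances with \alpha_i > 0; every other training instance has no influence on the boundary, which is why the model resists overfitting.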

The Logistic classifier (and also meta.ClassificationViaRegression) calculates a linear decision boundary.

The boosting algorithm AdaBoostM1 produces a checkered pattern with crisp boundaries.
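
A sketch of why the pattern is checkered: in Weka, AdaBoostM1 boosts DecisionStump by default, and each stump is a single axis-parallel cut, so the combined vote tiles the plane into rectangles with hard edges (the dataset name is a placeholder):

    import weka.classifiers.meta.AdaBoostM1;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class BoostedStumps {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("some2d.arff");   // hypothetical dataset
            data.setClassIndex(data.numAttributes() - 1);

            AdaBoostM1 boost = new AdaBoostM1();   // base learner defaults to DecisionStump
            boost.setNumIterations(10);            // ten weighted axis-parallel cuts
            boost.buildClassifier(data);

            // The printed model is a weighted vote of one-split stumps; each stump
            // contributes one straight, axis-parallel edge to the checkered pattern.
            System.out.println(boost);
        }
    }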
