Data Mining - Dimensionality (number of variables, parameters) (P)



Not to be confused with d, the model size. You may have 1000 attributes (p=1000) in your sample, but after feature selection, for instance, your model may use only a handful of them (d=5).
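The difference between p and d can be sketched in a few lines. This is only an illustration under stated assumptions: the data is synthetic, only the first 5 attributes are made to influence the target, and the selection rule (keep the attributes with the highest absolute correlation to the target) and the helper abs_correlation are hypothetical choices, not a method prescribed here.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

n, p, d = 200, 1000, 5  # rows, attributes in the sample, attributes kept by the model

# Synthetic sample (assumption): only the first 5 attributes influence the target.
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [sum(row[:5]) + random.gauss(0, 0.1) for row in X]

def abs_correlation(column, target):
    """Absolute Pearson correlation between one attribute and the target."""
    m = len(column)
    mean_c = sum(column) / m
    mean_t = sum(target) / m
    cov = sum((c - mean_c) * (t - mean_t) for c, t in zip(column, target))
    var_c = sum((c - mean_c) ** 2 for c in column)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return abs(cov) / (var_c * var_t) ** 0.5

# Score every attribute, then keep the d best ones (a simple filter-style selection).
scores = [abs_correlation([row[j] for row in X], y) for j in range(p)]
selected = sorted(range(p), key=lambda j: scores[j], reverse=True)[:d]

print("dimensionality of the sample: p =", len(X[0]))
print("model size after selection:   d =", len(selected))
```

The sample keeps its 1000 columns (p), while a model fitted on the selected columns would only have 5 inputs (d).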

In physics and mathematics, the dimension of a mathematical space (or object) is informally defined as the minimum number of coordinates needed to specify any point within it (i.e. the number of variables needed to define an outcome).

In high dimensions, it is really difficult to stay local: the nearest neighbors of a point tend to be far away. See Data Mining - High Dimension (Curse of Dimensionality)
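A small calculation may help to see why locality is lost. It assumes the observations are uniformly spread in a unit hypercube of dimension p (an assumption made for this sketch only). To capture a given share of the data around a point, a sub-cube must have an edge of length share^(1/p). The function name edge_fraction is hypothetical.

```python
def edge_fraction(share, p):
    """Edge length of a sub-cube holding `share` of uniformly distributed data
    in a unit hypercube of dimension p (sub-cube volume = edge ** p = share)."""
    return share ** (1.0 / p)

# Edge needed to capture 10% of the data as the dimensionality grows.
for p in (1, 2, 10, 100):
    print(f"p={p:>3}: edge fraction = {edge_fraction(0.10, p):.3f}")
```

With p=1, 10% of the range is enough. With p=10, the neighborhood must cover about 79% of the range of every variable, and with p=100 about 98%, so it can hardly be called local anymore.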

