This article tries to list the differences between the statistics-related fields: statistics, machine learning, and data mining.
The best view would be to consider machine learning and data mining as applied statistics.
Statistics vs Machine Learning
Leo Breiman, Statistical Modeling: The Two Cultures, Statistical Science 16(3), 2001
- One View:
- Find a function that predicts y from x: no model of nature is implied or needed (decision tree, neural net)
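The "function that predicts y from x" idea can be sketched with the simplest possible decision tree, a single-split stump. This is only an illustrative sketch: the data is made up, and the names `fit_stump` and `predict` are hypothetical helpers, not part of any library. No assumption is made about how the data was generated; the code only searches for the threshold that makes the fewest prediction errors.

```python
def fit_stump(xs, ys):
    """Return the threshold t with the fewest errors for the rule 'predict 1 if x > t'."""
    best_t, best_errors = None, len(xs) + 1
    for t in sorted(set(xs)):
        # Count the points that the rule 'x > t' labels incorrectly.
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def predict(t, x):
    """Apply the learned rule to a new value of x."""
    return 1 if x > t else 0


# Hypothetical toy data: one numeric feature x and a 0/1 label y.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

threshold = fit_stump(xs, ys)
print(threshold, predict(threshold, 5.0), predict(threshold, 2.5))
```

The stump is judged only by how well it predicts, which is the point of this view: the fitted rule says nothing about the mechanism that produced the data.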
Data mining vs Machine learning
Data mining is the application; machine learning provides the algorithms.
Machine learning has the upper hand in marketing.
Validation of the model
Traditional statistical methods, in general, require a great deal of user interaction in order to validate the correctness of a model. As a result, statistical methods can be difficult to automate.
Large data sets
Moreover, statistical methods typically do not scale well to very large data sets. Statistical methods rely on testing hypotheses or finding correlations based on smaller, representative samples of a larger population. Data mining methods are suitable for large data sets and can be more readily automated. In fact, data mining algorithms often require large data sets for the creation of quality models.
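The sample-based approach described above can be sketched as follows. This is a minimal illustration using only the Python standard library: the population is synthetic, and the sample size and seed are arbitrary assumptions. It estimates a population mean from a small random sample instead of scanning every record.

```python
import random
import statistics

# Hypothetical population: 100,000 synthetic values cycling through 0..99.
population = [i % 100 for i in range(100_000)]

# Draw a small simple random sample; the seed is fixed only for repeatability.
rng = random.Random(0)
sample = rng.sample(population, 1_000)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)
# Standard error of the sample mean, estimated from the sample itself.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

print(f"population mean: {population_mean:.2f}")
print(f"sample mean: {sample_mean:.2f} (standard error about {standard_error:.2f})")
```

The sample covers 1% of the records and still gives an estimate with a quantified uncertainty. That is the trade-off the statistical approach makes, whereas a data mining algorithm would typically consume the full data set.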