About
Data Mining can be defined as the automatic or semiautomatic task of extracting previously unknown information from a large quantity of data.
Data mining tries to discover previously unknown:
- unexpected patterns
- and relationships.
Data mining is becoming an increasingly important tool for transforming data into information. It is used in a wide range of profiling practices, such as:
- marketing,
- surveillance,
- and scientific discovery.
Data Mining first evolved for use in marketing: by understanding the relationships between customers and actions, better marketing can be developed.
Data mining is also known as:
- Knowledge Discovery in Data (KDD), especially in academia
- Statistical Learning
Machine learning supplies the algorithms that data mining uses.
The success of a method depends on the domain, because data mining is an experimental science: candidate methods have to be tried and evaluated on the data at hand.
Data mining can answer questions that cannot be addressed through simple query and reporting techniques.
Data mining is the practice of automatically searching large stores of data to discover:
- patterns
- and trends
that go beyond simple analysis.
Data mining uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events.
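As a minimal sketch of what "segment the data and evaluate the probability of future events" can mean in practice, the snippet below groups historical records by segment and uses the observed response frequency as a rough estimate of the probability that a new customer in that segment responds. The records, the age bands, and the function name `response_rate_by_segment` are all hypothetical and chosen only for illustration. Real data mining tools use far more sophisticated models than a frequency count.

```python
from collections import defaultdict

# Hypothetical historical records: (age_band, responded_to_offer).
history = [
    ("18-30", True), ("18-30", False), ("18-30", True), ("18-30", True),
    ("31-50", False), ("31-50", False), ("31-50", True), ("31-50", False),
    ("51+", False), ("51+", False),
]

def response_rate_by_segment(records):
    """Estimate P(response | segment) as the observed response frequency."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responses, total]
    for segment, responded in records:
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: resp / total for seg, (resp, total) in counts.items()}

rates = response_rate_by_segment(history)
for segment, rate in sorted(rates.items()):
    print(f"{segment}: estimated response probability {rate:.2f}")
```

With so few records per segment the estimates are very noisy, which is one reason data mining focuses on large data sets.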
The key properties of data mining are:
- Automatic discovery of patterns
- Prediction of likely outcomes with the creation of actionable information
- Description of the data (grouping)
- Focus on large data sets and databases
Data mining can be used to solve many kinds of business problems, including:
- Predict individual behaviour, for example, the customers likely to respond to a promotional offer or the customers likely to buy a specific product (Classification)
- Find profiles of targeted people or items (Classification using Decision Trees)
- Find natural segments or clusters (Clustering)
- Identify the factors most associated with a target attribute (Attribute Importance)
- Find co-occurring events or purchases (Associations, sometimes known as Market Basket Analysis)
- Find fraudulent or rare events (Anomaly Detection)
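To make one of these problem types concrete, here is a rough sketch of the idea behind Associations (Market Basket Analysis): count how often pairs of items occur together, then measure how often one item appears given that another is present. The baskets and the helper names `pair_support` and `confidence` are hypothetical. This is a brute-force illustration on toy data, not the algorithm a production tool would use.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each basket is the set of items bought together.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
]

def pair_support(transactions):
    """Return the fraction of transactions that contain each item pair."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair has a single canonical (alphabetical) key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items()}

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_antecedent = [b for b in transactions if antecedent in b]
    if not with_antecedent:
        return 0.0
    hits = sum(consequent in b for b in with_antecedent)
    return hits / len(with_antecedent)

support = pair_support(baskets)
print(support[("bread", "milk")])              # share of baskets with both items
print(confidence(baskets, "butter", "bread"))  # how often butter buyers also buy bread
```

Counting every pair is only feasible for small item catalogues. On large data sets, dedicated algorithms prune the search to item sets that are frequent enough to matter.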