Data Mining - Data (Preparation | Wrangling | Munging)


About

Data for mining must exist within a single table or view. The information for each case (record) must be stored in a separate row.
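The single-table, one-row-per-case requirement can be illustrated with a minimal sketch in plain Python. The `customers` and `purchases` tables, their column names, and the `build_case_table` helper are all hypothetical and only serve the illustration; in a database the same result would typically come from a join with aggregation exposed as a view.

```python
from collections import defaultdict

# Hypothetical source data: one row per customer, many rows per customer in purchases.
customers = [
    {"cust_id": 1, "age": 34},
    {"cust_id": 2, "age": 51},
]
purchases = [
    {"cust_id": 1, "amount": 20.0},
    {"cust_id": 1, "amount": 35.0},
    {"cust_id": 2, "amount": 10.0},
]


def build_case_table(customers, purchases):
    """Return one row per customer, with purchases aggregated into columns."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for p in purchases:
        totals[p["cust_id"]] += p["amount"]
        counts[p["cust_id"]] += 1
    return [
        {
            "cust_id": c["cust_id"],
            "age": c["age"],
            "purchase_count": counts[c["cust_id"]],
            "purchase_total": totals[c["cust_id"]],
        }
        for c in customers
    ]


case_table = build_case_table(customers, purchases)
```

Each customer (the case) ends up in exactly one row, and the detail rows are summarized into attributes of that case.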

Proper preparation of the data is a key factor in any data mining project.

Star Schema

Dimensioned data (for example, star schemas) are supported through nested table transformations.

Type Preparation

Data Cleansing

The data must be properly cleansed to eliminate inconsistencies and support the needs of the mining application.
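As a rough illustration of removing inconsistencies, the sketch below normalizes text values and drops exact duplicate cases. The `cleanse` helper and the sample rows are hypothetical; real cleansing rules depend on the data and on the mining application.

```python
def cleanse(rows):
    """Normalize text labels and drop exact duplicate cases."""
    seen = set()
    cleaned = []
    for row in rows:
        # Trim whitespace and unify case so "NY" and " ny " map to one value.
        normalized = {
            k: v.strip().upper() if isinstance(v, str) else v
            for k, v in row.items()
        }
        # Column names are unique, so sorting the items only compares names.
        key = tuple(sorted(normalized.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


raw = [
    {"cust_id": 1, "state": "NY"},
    {"cust_id": 1, "state": " ny "},
    {"cust_id": 2, "state": "ca"},
]
clean = cleanse(raw)
```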

Data Transformation

Additionally, most algorithms require some form of data transformation, such as binning or normalization.

DBMS_DATA_MINING_TRANSFORM is a flexible data transformation package that includes a variety of missing value and outlier treatments, as well as binning and normalization capabilities.
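The four kinds of treatment named above can be sketched generically in plain Python. This is not the DBMS_DATA_MINING_TRANSFORM API: the function names, the mean-imputation strategy, the clipping bounds, and the equal-width binning scheme are assumptions chosen for illustration only.

```python
def impute_mean(values):
    """Missing value treatment: replace None with the mean of the present values."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def clip(values, low, high):
    """Outlier treatment: pull values outside [low, high] back to the bounds."""
    return [min(max(v, low), high) for v in values]


def min_max_normalize(values):
    """Normalization: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bin(values, n_bins):
    """Binning: map each value to one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so cap it at the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [20, None, 40, 30, 110]       # hypothetical attribute with a gap and an outlier
imputed = impute_mean(ages)          # None replaced by the mean of the other values
clipped = clip(imputed, 0, 100)      # 110 pulled back to the upper bound
normalized = min_max_normalize(clipped)
bins = equal_width_bin(clipped, 4)
```

The order matters in this sketch: the mean is computed before clipping, so the outlier still influences the imputed value. Clipping first is an equally reasonable choice.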

Data Set

The data mining development process may require several data sets.

A data set may be:
