About
Data for mining must exist within a single table or view. The information for each case (record) must be stored in a separate row.
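The one-row-per-case rule can be checked before mining. The following is only a minimal sketch in plain Python, assuming the cases have been fetched as dictionaries; the `cust_id` column and the `duplicated_case_ids` helper are hypothetical names used for illustration.

```python
from collections import Counter

# Hypothetical sample: each dict is one row of the mining table or view.
rows = [
    {"cust_id": 101, "age": 34},
    {"cust_id": 102, "age": 51},
    {"cust_id": 101, "age": 35},  # second row for case 101: breaks the rule
]

def duplicated_case_ids(rows, case_id="cust_id"):
    """Return the case identifiers that appear on more than one row."""
    counts = Counter(row[case_id] for row in rows)
    return sorted(cid for cid, n in counts.items() if n > 1)

duplicates = duplicated_case_ids(rows)
print(duplicates)
```

An empty result means that every case occupies exactly one row.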
Proper preparation of the data is a key factor in any data mining project.
Star Schema
Dimensioned data (for example, star schemas) is supported through nested table transformations, so that each case still occupies a single row.
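The idea behind nesting can be sketched as follows. This is an illustration in plain Python, not the Oracle nested table mechanism itself: the `customers` and `sales` tables, their columns, and the `nest_sales` helper are all assumptions made for the example. The one-to-many detail rows are folded into a single nested attribute per case.

```python
from collections import defaultdict

# Hypothetical dimension table: one row per case.
customers = [{"cust_id": 1, "age": 34}, {"cust_id": 2, "age": 51}]

# Hypothetical fact table: several rows per case.
sales = [
    {"cust_id": 1, "product": "laptop", "amount": 1200.0},
    {"cust_id": 1, "product": "mouse", "amount": 25.0},
    {"cust_id": 2, "product": "mouse", "amount": 25.0},
    {"cust_id": 1, "product": "mouse", "amount": 30.0},
]

def nest_sales(customers, sales):
    """Fold the sales rows into one nested 'purchases' attribute per customer."""
    per_case = defaultdict(lambda: defaultdict(float))
    for sale in sales:
        per_case[sale["cust_id"]][sale["product"]] += sale["amount"]
    return [
        {**cust, "purchases": dict(per_case.get(cust["cust_id"], {}))}
        for cust in customers
    ]

cases = nest_sales(customers, sales)
for case in cases:
    print(case)
```

Each customer remains a single row, and the purchase details travel with it as a nested collection.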
Types of Preparation
Data Cleansing
The data must be properly cleansed to eliminate inconsistencies and support the needs of the mining application.
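As an illustration of what eliminating inconsistencies can mean in practice, here is a minimal sketch in plain Python. It assumes a hypothetical `country` column with inconsistent spellings and a hand-written synonym map; real cleansing rules depend on the data at hand.

```python
# Hypothetical raw rows with inconsistent spellings and an exact duplicate.
raw_rows = [
    {"cust_id": 1, "country": " usa "},
    {"cust_id": 1, "country": "USA"},
    {"cust_id": 2, "country": "U.S.A."},
    {"cust_id": 3, "country": "fr"},
]

# Assumed mapping from variant spellings to one canonical value.
synonyms = {"U.S.A.": "USA", "UNITED STATES": "USA", "FR": "FRANCE"}

def cleanse(rows, synonyms):
    """Normalize the country labels, then drop rows that became exact duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        country = row["country"].strip().upper()
        country = synonyms.get(country, country)
        fixed = {**row, "country": country}
        key = tuple(sorted(fixed.items()))
        if key not in seen:
            seen.add(key)
            cleaned.append(fixed)
    return cleaned

cleaned_rows = cleanse(raw_rows, synonyms)
for row in cleaned_rows:
    print(row)
```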
Data Transformation
Additionally, most algorithms require some form of data transformation, such as:
- binning
- normalization
DBMS_DATA_MINING_TRANSFORM is a flexible data transformation package that includes a variety of missing value and outlier treatments, as well as binning and normalization capabilities.
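To make the two transformations concrete, here is a minimal sketch in plain Python. It does not use DBMS_DATA_MINING_TRANSFORM; it only illustrates min-max normalization and equal-width binning, which are assumed here as representative variants. The function names and the `ages` sample are hypothetical.

```python
def min_max_normalize(values):
    """Rescale the values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread: map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 using bins of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

ages = [18, 25, 40, 62, 78]
print(min_max_normalize(ages))
print(equal_width_bins(ages, 3))  # → [0, 0, 1, 2, 2]
```

Normalization keeps the relative distances between values, whereas binning deliberately discards them in favor of a small number of categories.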
Data Set
The data mining development process may require several data sets.
A data set may be:
- needed for building (training) the model;
- used for scoring (applying the model to new data);
- used for testing (evaluating the model).
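One common way to obtain separate build and test sets is a random split of the available cases. The sketch below is plain Python under stated assumptions: the 30% test fraction, the fixed seed, and the `split_cases` helper are illustrative choices, not something the source prescribes. Scoring data is normally new data and is therefore not part of this split.

```python
import random

def split_cases(cases, test_fraction=0.3, seed=42):
    """Shuffle the cases reproducibly and split them into (build_set, test_set)."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical case identifiers 1..10.
build_set, test_set = split_cases(range(1, 11))
print(len(build_set), len(test_set))
```

Fixing the seed makes the split repeatable, which helps when comparing models built on the same data.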