There are three places to measure and document data quality:
- Raw incoming data in the staging area
- After initial cleansing
- After final cleansing
Conditional Loading in a Process Flow
You can design process flows that proceed based on the results of profiling data.
For example, the figure below displays a process flow that contains a Data Auditor Monitor activity. In this process flow, LOAD_EMP_MAP is a mapping that loads data into the EMP table. If the data load is successful, the data auditor EMP_DATA_AUDIT is run. The data auditor monitors the data in the EMP table based on the data rules defined for the table.
This decision point is valuable. Rather than the earlier scenario, in which you load bad data and then need to get it back out, you make an informed decision whether or not to load the data.
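The idea of gating a load on audit results can be sketched outside any particular tool. The following is a minimal illustration in plain Python, not the Warehouse Builder API: the names `audit`, `conditional_load`, `AuditResult`, and `max_defect_rate`, as well as the two sample rules, are all hypothetical and chosen only for this example. It checks staged rows against a set of data rules and loads them into the target only when the defect rate stays within a chosen threshold.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical types for this sketch: a row is a dict, a rule is a predicate.
Row = Dict[str, object]
Rule = Callable[[Row], bool]


@dataclass
class AuditResult:
    total: int    # number of rows examined
    defects: int  # rows that violate at least one rule

    @property
    def defect_rate(self) -> float:
        # An empty batch is treated as having no defects.
        return self.defects / self.total if self.total else 0.0


def audit(rows: List[Row], rules: List[Rule]) -> AuditResult:
    """Count the rows that fail at least one data rule."""
    defects = sum(1 for row in rows if not all(rule(row) for rule in rules))
    return AuditResult(total=len(rows), defects=defects)


def conditional_load(staged: List[Row], target: List[Row],
                     rules: List[Rule], max_defect_rate: float = 0.0) -> bool:
    """Load staged rows into target only if the audit passes.

    Returns True if the rows were loaded, False if the load was skipped.
    """
    result = audit(staged, rules)
    if result.defect_rate > max_defect_rate:
        return False          # decision point: do not load bad data
    target.extend(staged)
    return True


# Illustrative rules (assumptions, not taken from the EMP table definition).
rules: List[Rule] = [
    lambda r: r.get("deptno") is not None,
    lambda r: isinstance(r.get("sal"), (int, float)) and r["sal"] > 0,
]

emp: List[Row] = []
good_batch = [{"empno": 1, "deptno": 10, "sal": 3000}]
bad_batch = [{"empno": 2, "deptno": None, "sal": -5}]

loaded_good = conditional_load(good_batch, emp, rules)
loaded_bad = conditional_load(bad_batch, emp, rules)
```

Setting `max_defect_rate` above zero would let a batch through when only a small share of rows violate the rules; what threshold is acceptable is a business decision rather than something this sketch can prescribe.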