Data processing is the more general term for manipulating data, whereas data integration refers specifically to moving and combining data between two systems.
Data integration has roughly two data processing models:
|stream processing (reactive processing)||Send a message to another process, to be handled asynchronously. Processing that executes continuously as long as data is being produced.|
|batch processing||Periodically crunch a large amount of accumulated data. Processing that is executed and runs to completion in a finite amount of time, releasing computing resources when finished.|
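The difference between the two models can be illustrated with a minimal sketch. The function names `batch_total` and `stream_totals` and the sample values are hypothetical, chosen only for this illustration. The batch function waits for the accumulated data and finishes in one run, while the stream function handles each value as it is produced and emits an updated result every time.

```python
from typing import Iterable, Iterator


def batch_total(events: Iterable[int]) -> int:
    """Batch: the accumulated data is processed in one finite run."""
    return sum(events)


def stream_totals(events: Iterable[int]) -> Iterator[int]:
    """Stream: each event is handled on arrival and a running result is emitted."""
    total = 0
    for value in events:
        total += value
        yield total  # a result is available after every event


# Hypothetical sample data standing in for produced events.
accumulated = [3, 5, 2]

batch_result = batch_total(accumulated)
stream_results = list(stream_totals(iter(accumulated)))

print(batch_result)    # 10
print(stream_results)  # [3, 8, 10]
```

In a real stream the input iterator would never end, so the generator would keep running as long as data is produced, whereas the batch function returns and releases its resources.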
|OLAP||Changing Queries (ad-hoc)||A lot||A lot||Fixed Data|
|OLTP||Fixed Queries||Few||Few||Changing Data|
|Streaming||Fixed Queries||All||Few||Changing Data|
|Batch (Data Warehouse)||Fixed Queries||All||A lot||Fixed Data|
see also I/O - Workload (Access Pattern)
Data Processing Model / Framework
You can visualize each data transformation step in a lineage report
- The “360 degree view of the enterprise” is a commonly discussed goal that really means data integration.
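- ETL: Extraction, Transformation and Load software (data is transformed before it is loaded into the target)
- ELT: Extraction, Load and Transformation software (data is loaded raw and transformed inside the target)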
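The order of the steps is the whole difference between the two approaches. The following is a rough sketch, assuming an in-memory SQLite database as the target system. The table names (`sales_etl`, `sales_raw`, `sales_elt`) and the sample rows are hypothetical and not taken from any particular tool.

```python
import sqlite3

# Hypothetical raw extract: untrimmed amounts and lowercase names.
source_rows = [("alice", " 10 "), ("bob", "25")]


def etl(conn: sqlite3.Connection, rows) -> None:
    """ETL: transform in the pipeline, then load only the clean rows."""
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount INTEGER)")
    cleaned = [(name.title(), int(amount.strip())) for name, amount in rows]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", cleaned)


def elt(conn: sqlite3.Connection, rows) -> None:
    """ELT: load the raw rows first, then transform inside the target with SQL."""
    conn.execute("CREATE TABLE sales_raw (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", rows)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT upper(substr(name, 1, 1)) || substr(name, 2) AS name, "
        "CAST(trim(amount) AS INTEGER) AS amount "
        "FROM sales_raw"
    )


conn = sqlite3.connect(":memory:")
etl(conn, source_rows)
elt(conn, source_rows)

etl_rows = conn.execute("SELECT name, amount FROM sales_etl ORDER BY name").fetchall()
elt_rows = conn.execute("SELECT name, amount FROM sales_elt ORDER BY name").fetchall()
print(etl_rows)
print(elt_rows)
```

Both paths end with the same cleaned rows. The ELT path also keeps the raw table in the target, which allows the transformation to be rerun or changed later without extracting from the source again.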