Data Processing - Architecture
Requirement
Fault tolerance
Parallelism
Low latency
Delivery semantics
Operations and monitoring
Schema management
Characteristic
Forward-compatible data architecture: the ability to add more applications that need to process the same data … differently.
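One way to picture forward compatibility is an append-only log that several applications read independently, each keeping its own offset. The sketch below is only an illustration under that assumption: `EventLog`, `Consumer`, and the handler functions are hypothetical names, not part of any framework listed on this page.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class EventLog:
    """Append-only log shared by every application (hypothetical helper)."""
    events: List[dict] = field(default_factory=list)

    def append(self, event: dict) -> None:
        self.events.append(event)

    def read_from(self, offset: int) -> List[dict]:
        # Reading never mutates the log, so readers cannot affect each other.
        return self.events[offset:]


@dataclass
class Consumer:
    """An application that processes the log at its own pace."""
    name: str
    handler: Callable[[dict], None]
    offset: int = 0  # each consumer tracks its own position

    def poll(self, log: EventLog) -> int:
        batch = log.read_from(self.offset)
        for event in batch:
            self.handler(event)
        self.offset += len(batch)
        return len(batch)


log = EventLog()
for amount in (10, 25, 7):
    log.append({"type": "order", "amount": amount})

# First application: keeps a running total of order amounts.
totals = {"sum": 0}

def add_to_total(event: dict) -> None:
    totals["sum"] += event["amount"]

billing = Consumer("billing", add_to_total)
billing.poll(log)

# Second application, added later: same data, processed differently.
# Neither the log nor the billing consumer had to change.
large_orders: List[int] = []

def keep_large(event: dict) -> None:
    if event["amount"] >= 10:
        large_orders.append(event["amount"])

alerts = Consumer("alerts", keep_large)
alerts.poll(log)

print(totals["sum"], large_orders)
```

Because the new consumer starts at offset 0, it replays the full history without affecting the existing one; that independence is the property the definition above describes.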
List
Data Processing - Lambda Architecture (batch and stream processing)
Data Warehouse - Layer (Architecture)
DataFlow Model
Map Reduce (MR) Framework
Documentation / Reference
Data on the Outside vs. Data on the Inside - Data kept outside SQL has different characteristics from data kept inside. - Pat Helland