Data Processing - (Batch|Bulk) Processing


A batch processing system (also called bulk or offline processing) means:

  • starting a process,
  • reading a lot of data in batch (in parallel if possible),
  • and terminating the process
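
The three steps above can be sketched in Python as follows. This is only a minimal illustration, not a reference implementation: the names run_batch and process_chunk, the in-memory chunks, and the choice of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_chunk(chunk):
    """Hypothetical per-chunk work: here, sum the numbers in the chunk."""
    return sum(chunk)


def run_batch(chunks, workers=4):
    """Process a bounded input in one run and return the combined result."""
    # Read and process the chunks in parallel where possible.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    # The input is exhausted, so the job combines the results and ends.
    return sum(partial_results)


if __name__ == "__main__":
    # Starting the script starts the process; it terminates after printing.
    print(run_batch([[1, 2, 3], [4, 5], [6]]))
```

The input is bounded and known up front, so the job has a clear end: once every chunk has been processed, the process exits.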



Simple code generally iterates over one tuple at a time (for example, looping over the rows of a table). This kind of algorithm is hard to optimize and parallelize compared to a declarative, set-oriented language such as SQL, where the query states what result is wanted and the engine is free to choose how to compute it.
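
To make the contrast concrete, here is a small sketch using Python's built-in sqlite3 module. The orders table and its values are invented for the example. Both approaches compute the same total, but the first spells out the iteration row by row, while the second hands the whole set to the database engine.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(10,), (20,), (30,)])

# Tuple at a time: the application code loops over every row itself.
loop_total = 0
for (amount,) in conn.execute("SELECT amount FROM orders"):
    loop_total += amount

# Set-oriented: one declarative statement; the engine decides how to run it.
(sql_total,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()

conn.close()
```

In the loop version the order and manner of evaluation are fixed by the code; in the SQL version only the result is specified, which is what leaves room for the engine to optimize.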

Batch vs Stream processing
