Software Design - Recovery (Restartable)


In a really big system, something will always go wrong, and it is not possible to anticipate every scenario that will arise: a file is delivered a little late, a mapping is not started every day, a new process is added, the file system returns a bad block, to say nothing of the network. In other words, that’s life. One answer to this kind of situation is to embrace this fact of life and to make every mapping recoverable / restartable (i.e. with little or no dependency). Dependencies are generally visualized through what is called a death star. Not easy to do, but really nice to have.
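
As a rough illustration of what restartable can mean in practice, here is a minimal sketch. It assumes a job made of named, independent steps and a JSON checkpoint file. The function names `run_restartable` and `_save_checkpoint`, and the checkpoint format, are hypothetical and not taken from any particular tool.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List


def _save_checkpoint(path: str, done: List[str]) -> None:
    """Write the list of completed steps to a temp file, then rename it.

    os.replace is atomic, so a crash never leaves a half-written checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(done, f)
    os.replace(tmp_path, path)


def run_restartable(steps: Dict[str, Callable[[], None]],
                    checkpoint_path: str) -> List[str]:
    """Run steps in insertion order, skipping those already completed.

    Returns the names of the steps executed during this call.
    """
    done: List[str] = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)

    executed: List[str] = []
    for name, step in steps.items():
        if name in done:
            continue  # completed in a previous run: do not redo it
        step()  # may raise; a failed step is never recorded as done
        done.append(name)
        _save_checkpoint(checkpoint_path, done)
        executed.append(name)
    return executed
```

Rerunning the same job after a failure executes only the failed step and the ones after it. Each step should itself be idempotent, because a crash between the end of a step and the checkpoint write causes that step to run again on restart.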

Discover More
Data Management - (Transaction|Request|Commit|Redo) Log

(Transaction|Request|Commit) logs are structured log files that store all changes made to the data as they occur. They permit the implementation of: transaction isolation, undoable operations, recovery...
Data Warehousing - 34 Kimball Subsystems

This page uses the 34 Kimball data warehouse subsystems as a table of contents and links each of them to a page on this website....
Kafka (Event Hub)

Apache Kafka is a broker application that stores messages as a distributed commit log. The entire data storage system is just a transaction log. Data systems are exposing data, ...
Map Reduce (MR) Framework

MapReduce is a distributed execution framework. The MapReduce programming model (and a corresponding system) was proposed in a 2004 paper from a team at Google as a simpler abstraction for processing very large...
Software Design - (Fault Tolerance|Resilience)

Fault tolerance (or resilience) is the ability to recover from errors (faults), regardless of whether those errors resulted from: hardware issues, software issues, general systems issues (network...
Software Design - Idempotence (Idempotent)

Idempotence is the ability to apply an operation multiple times without changing the result beyond the initial application. Given the same input, re-executing a task will always produce the same result...
Software Development - (Stateless|Stateful)

Stateless or stateful refers to whether a unit of a program (process, function, procedure) has state or not (i.e. variables that may change). stateless Parallel aggregate operations over...
Stream - Samza

Samza is a LinkedIn stream processing framework that provides powerful, reliable tools for working with data in Kafka. (LinkedIn created Apache Kafka to be the data exchange backbone of its organisation.) See StreamTask...
Transactions - Rollback Journal (Undo journal)

A rollback journal consists of records of the actions of transactions, primarily before they are committed. Its name comes from the fact that its primary function is to roll back (undo) changes from...