Data Integration - Methods / Design Pattern

About

With multiple applications in your IT infrastructure reading from and writing to different data stores in varying formats, it is essential to implement an integration process so that the data can be used easily by anyone in your company.

List

View

  • Easiest to implement
  • Widest database support
  • Possible performance issues
  • Strong Consistency
  • One database must be reachable by the other
  • DBLink
  • Updatable (?)
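A minimal sketch of the view pattern, using SQLite via Python's `sqlite3` module (table and column names are hypothetical): the view exposes the data in a friendlier shape without copying anything, and since it is computed at query time, readers always see the latest rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount_cents INTEGER)")
conn.executemany("INSERT INTO orders (amount_cents) VALUES (?)", [(1250,), (399,)])

# The view is evaluated on every query: strong consistency,
# at the cost of re-running the underlying SELECT each time.
conn.execute("""
    CREATE VIEW orders_report AS
    SELECT id, amount_cents / 100.0 AS amount_dollars FROM orders
""")
print(conn.execute("SELECT amount_dollars FROM orders_report").fetchall())
```

Re-running the query on every read is exactly where the possible performance issues come from.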

Materialized View

  • Better performance
  • Strong or Eventual Consistency
  • One database must be reachable by the other
  • DBLink
  • Updatable (?)
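SQLite has no native materialized views, so the sketch below simulates one by copying query results into a real table and refreshing it on demand (in PostgreSQL the equivalent is `CREATE MATERIALIZED VIEW` / `REFRESH MATERIALIZED VIEW`; names here are hypothetical). Between refreshes, readers see a fast but stale snapshot, which is the eventual-consistency trade-off.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount_cents INTEGER)")
conn.executemany("INSERT INTO orders (amount_cents) VALUES (?)", [(1250,), (399,)])

def refresh_order_totals(conn):
    # Rebuild the "materialized" table from the base query.
    conn.execute("DROP TABLE IF EXISTS order_totals")
    conn.execute("""
        CREATE TABLE order_totals AS
        SELECT COUNT(*) AS n, SUM(amount_cents) AS total FROM orders
    """)

refresh_order_totals(conn)
conn.execute("INSERT INTO orders (amount_cents) VALUES (100)")
# Stale until the next refresh:
print(conn.execute("SELECT n, total FROM order_totals").fetchone())  # (2, 1649)
refresh_order_totals(conn)
print(conn.execute("SELECT n, total FROM order_totals").fetchone())  # (3, 1749)
```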

Mirror Table using Trigger

  • Depends on Database Support
  • Strong Consistency
  • One database must be reachable by the other
  • DBLink
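A sketch of the trigger approach: the trigger copies every write into a mirror table inside the same transaction as the original statement, which is what gives strong consistency. Across two databases this would go through a DBLink-style mechanism; here both tables live in one SQLite database for illustration, and all names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE customers_mirror (id INTEGER PRIMARY KEY, name TEXT)")

# The mirror write happens automatically on every insert,
# inside the same transaction as the triggering statement.
conn.execute("""
    CREATE TRIGGER mirror_customers AFTER INSERT ON customers
    BEGIN
        INSERT INTO customers_mirror (id, name) VALUES (NEW.id, NEW.name);
    END
""")
conn.execute("INSERT INTO customers (name) VALUES ('Ada')")
print(conn.execute("SELECT name FROM customers_mirror").fetchall())  # [('Ada',)]
```

Whether this is available at all, and with what syntax, depends on the database, hence "depends on database support".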

Mirror Table using Transactional Code

  • *Any* code
  • Strong Consistency
  • Stored Procedures or Distributed Transactions
  • Cohesion and coupling issues
  • Updatable (?)
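A sketch of the transactional-code variant, assuming hypothetical table names: the application writes the source row and its mirror inside one transaction, so both commit or neither does. Here both tables share one SQLite connection; across separate databases the same guarantee would require a distributed transaction (e.g. two-phase commit) or stored procedures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE TABLE accounts_mirror (id INTEGER PRIMARY KEY, email TEXT)")

def create_account(conn, email):
    # "with conn" commits on success and rolls back on exception,
    # so the two inserts succeed or fail together.
    with conn:
        cur = conn.execute("INSERT INTO accounts (email) VALUES (?)", (email,))
        conn.execute("INSERT INTO accounts_mirror (id, email) VALUES (?, ?)",
                     (cur.lastrowid, email))

create_account(conn, "ada@example.com")
```

The coupling issue is visible in the code itself: the writing service must know about both schemas and every writer must remember to go through `create_account`.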

Mirror Table using ETL tools

ETL (Batch Select)

  • Lots of available tools
  • Requires external trigger (usually time-based)
  • Can aggregate from multiple datasources
  • Read Only
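A sketch of the batch-select step of an ETL job: on an external trigger (typically a cron schedule), rows changed since the last run are selected from the source and loaded into a read-only copy. Both stores are SQLite databases here for illustration, and the high-water-mark scheme and all names are assumptions.

```python
import sqlite3

source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
target.execute("CREATE TABLE events_copy (id INTEGER PRIMARY KEY, payload TEXT)")
source.executemany("INSERT INTO events (payload) VALUES (?)",
                   [("a",), ("b",), ("c",)])

def run_etl(source, target, last_seen_id):
    # Extract: only rows newer than the high-water mark.
    rows = source.execute(
        "SELECT id, payload FROM events WHERE id > ?", (last_seen_id,)).fetchall()
    # (Transform would go here.) Load into the read-only copy:
    with target:
        target.executemany(
            "INSERT INTO events_copy (id, payload) VALUES (?, ?)", rows)
    return max((r[0] for r in rows), default=last_seen_id)

watermark = run_etl(source, target, 0)  # called by the scheduler
```

Because the copy is refreshed only when the job runs, consistency is eventual, but the same job can just as easily aggregate from several sources into one target.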

Event Sourcing (Stream)

Event Sourcing is one of the hardest patterns to implement.

  • State of data is a stream of events
  • Eases auditing
  • Eventual Consistency
  • Distributable stream through a Message Bus
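A minimal in-memory sketch of the idea: the system of record is an append-only stream of events, and current state is derived by folding over them. In production the stream would be distributed over a message bus (e.g. Kafka); the event shapes below are hypothetical.

```python
events = []  # the append-only log: the single source of truth

def append(event):
    events.append(event)

def current_balance(account):
    # State is a fold over the full history; the log itself
    # doubles as an audit trail of every change ever made.
    balance = 0
    for kind, acct, amount in events:
        if acct == account:
            balance += amount if kind == "deposit" else -amount
    return balance

append(("deposit", "alice", 100))
append(("withdraw", "alice", 30))
print(current_balance("alice"))  # 70
```

Consumers that maintain their own derived state from the stream will lag behind the log, which is why the pattern offers eventual rather than strong consistency.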

Example: Change Data Capture

Immutable append-only log + materialized view
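The log-plus-view combination can be sketched as follows (keys and values are hypothetical): every change is appended to an immutable log, and a consumer folds the log into a queryable view that can be rebuilt from scratch at any time.

```python
# The immutable log: each entry records one change, never edited in place.
log = [
    ("set", "user:1", "Ada"),
    ("set", "user:2", "Alan"),
    ("set", "user:1", "Ada L."),   # a later change supersedes the earlier one
    ("delete", "user:2", None),
]

def materialize(log):
    # Replay the log to build the current-state view.
    view = {}
    for op, key, value in log:
        if op == "set":
            view[key] = value
        else:
            view.pop(key, None)
    return view

print(materialize(log))  # {'user:1': 'Ada L.'}
```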

Also known as:

  • lambda/kappa architecture;
  • database inside-out/unbundled;
  • state machine replication;
  • etc.

See the work of Martin Kleppmann and Neil Conway.
