Data Processing - Replication

About

Replication means keeping a copy of the same data on multiple machines (nodes) in order to increase:

Feature | Example
Performance | serve reads in parallel, distributing application workloads across multiple databases
Availability | keep the system running if a machine stops working (outage, upgrade or maintenance); fault tolerance

See also: Data Integration - Synchronization

Architecture

Common replication concepts include:

Leader based

Replication at each master and subscriber database is controlled by replication agents that communicate through TCP/IP stream sockets. The replication agent on the master database reads the records from the transaction log for the master database. It forwards changes to replicated elements to the replication agent on the subscriber database. The replication agent on the subscriber then applies the updates to its database. If the subscriber agent is not running when the updates are forwarded by the master, the master retains the updates in its transaction log until they can be applied at the subscriber.
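The log-forwarding behaviour described above can be sketched as follows. This is a minimal, in-process illustration only: the class names (Master, Subscriber), the log sequence number (lsn) and the forward method are assumptions made for the example, and plain method calls stand in for the TCP/IP stream sockets that real replication agents would use.

```python
from dataclasses import dataclass, field


@dataclass
class Subscriber:
    """Hypothetical subscriber database with its replication agent."""
    data: dict = field(default_factory=dict)
    running: bool = True
    applied_lsn: int = 0  # sequence number of the last log record applied

    def apply(self, lsn, key, value):
        # Apply one forwarded update and remember how far we got.
        self.data[key] = value
        self.applied_lsn = lsn


@dataclass
class Master:
    """Hypothetical master database with its transaction log."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # records are (lsn, key, value)

    def write(self, key, value):
        # Every change is recorded in the transaction log first.
        lsn = len(self.log) + 1
        self.log.append((lsn, key, value))
        self.data[key] = value

    def forward(self, subscriber):
        """Forward records the subscriber has not applied yet.

        Returns the number of records sent. If the subscriber agent is
        not running, nothing is sent and the records stay in the log.
        """
        if not subscriber.running:
            return 0
        pending = [rec for rec in self.log if rec[0] > subscriber.applied_lsn]
        for lsn, key, value in pending:
            subscriber.apply(lsn, key, value)
        return len(pending)


def demo():
    master, sub = Master(), Subscriber()
    master.write("a", 1)
    master.forward(sub)

    sub.running = False          # subscriber agent goes down
    master.write("b", 2)
    sent_while_down = master.forward(sub)

    sub.running = True           # subscriber agent comes back
    sent_after_restart = master.forward(sub)
    return sent_while_down, sent_after_restart, sub.data
```

In this sketch the master never truncates its log, so retaining updates for a stopped subscriber is trivial; a real system would have to decide how long the log may grow while a subscriber is unreachable.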

Replication of databases often relates closely to transactions. If a database can log its individual actions, one can create a duplicate of the data in real time. DBAs can use the duplicate to improve performance and/or the availability of the whole database system.

Database clustering

Parallel synchronous replication of databases enables the replication of transactions on multiple servers simultaneously, which provides a method for backup and security as well as data availability. This is commonly referred to as “database clustering”.
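As a rough sketch of what "synchronous" implies, the example below applies a write to every server or to none, using a simplified two-phase (prepare, then commit) scheme. That scheme is one common way to get this guarantee and is an assumption here, not necessarily what a given clustering product does. The names Replica and replicate_sync are hypothetical, and the servers are contacted one after another rather than in parallel to keep the code short.

```python
class Replica:
    """Hypothetical server holding one copy of the data."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}
        self.staged = None  # write that has been prepared but not committed

    def prepare(self, key, value):
        # A replica that is down cannot acknowledge the write.
        if not self.healthy:
            return False
        self.staged = (key, value)
        return True

    def commit(self):
        key, value = self.staged
        self.data[key] = value
        self.staged = None

    def abort(self):
        self.staged = None


def replicate_sync(replicas, key, value):
    """Apply the write on all replicas, or on none if any one fails."""
    if all(r.prepare(key, value) for r in replicas):
        for r in replicas:
            r.commit()
        return True
    # At least one replica did not acknowledge: discard the staged write.
    for r in replicas:
        r.abort()
    return False
```

The trade-off visible even in this small sketch is that a single unavailable server blocks the write, which is the price paid for every copy being identical at commit time.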

Algorithm

Documentation / Reference