Database - Migration (Deployment|Versioning|Change Management)

Data System Architecture

About

Code Shipping - Change and Deployment Pipeline - Development Lifecycle / Workflow with SQL database.

Database migration is the process of changing the structure of the database in an other with mostly DDL statement.

The flow sometimes will block any change that risks data-loss.

See also: Data - Migration

Methods

Their is several methods:

  • the creation of script that changes specifically a part of the schema (alter table add column)
  • the creation of a ghost table created with the full new SQL definition and filled up through database log or triggers.
  • the creation of a delta patch by the analysis of two schema (through SQL grammar or specific)

Script

Type

  • Versioned script. Each script contains a delta statement, and should be executed only once in the proper sequence.
  • Repeatable script. Rerun (overwrite when digest change) are typically used for
    • (Re-)creating views/procedures/functions/packages/…
    • Bulk reference data reinserts
  • Postprocessing scripts (compile, grant script)
  • Preprocessing script (data backup)
  • Patch script (out of order versioned script, first execution next migration)
  • Test script (test data)
  • Database specific script (for a specific database)
  • Undo script (undo the versioned script with the same version.)

Order of execution

  • Pre processing
  • Patch (executing all patch scripts out of order)
  • Versioned
  • Repeatable
  • Post processing. If a post-processing script is modified, all of them are executed again.
  • Test data (if in a test environment)

see also: RoundhousE-Script-Order

Runner

  • Native (Sqlplus,cmdline,…)
  • or code (JDBC, ODBC) with Error handler to customize the behavior when errors happen

Type:

  • Dry run: run without applying the changes.
  • Validate run: Create/copy in a validate schema and apply the changes on it
  • Out of order run: apply also the script not applied with a lower version number

Features

  • Unlike typing commands into an interactive SQL window or storing SQL scripts, migration tools keep a detailed history of how your database schema evolved

Methodology

  • One project for the migrations
  • Deploy Schema & Data Change-sets
  • You can think of each migration as being a new 'version' of the database.
  • A schema starts off with nothing in it, and each migration modifies it to add or remove tables, columns, or entries.

On databases that support transactions with statements that change the schema, migrations are wrapped in a transaction. If the database does not support this then when a migration fails the parts of it that succeeded will not be rolled back. You will have to rollback the changes that were made by hand.

Script (alter)

Script

Name

Prefix of the migration script: The only benefit of a timestamp over a number is that it almost eliminates the risk of “collision” and makes unnecessary to “reserve” a number over chat before creating the migration.

Rebase

By creating a script for each modification, if you want to retrieve the state of a database for a certain version, it can add a lot of file.

  • the first create statement
  • and all alter statement.

The list of alter statement can be so long that when you want to speed up the process of building a database, you are merging them in a create statement. With an automatic schema compare, every hundred or two alter file, the initial creation script is recreated (not concatenate all the script, an actual clean creation).

Structure
  • Deploy: A deploy directory containing all deploy script. Bv: alter table myTable add column
  • Revert: A revert directory containing the revert script. Bv: alter table myTable remove column
  • Test: A test script to test the deploy and revert action: select column from my Table = Succes

Language

  • Some tools are using a generic language in order to be database agnostic but most of the DBA wants to see the SQL scripts themselves.
  • Macro-language built-in for supporting conditional-compilation in your SQL.

Idempotent

Each migration itself is written to be idempotent, e.g. “don’t try to add the column if it’s already there”. Therefore, a specific order isn't necessary needed in the database migration script.

Column

  • Add: Almost all database engines can add columns for “free” because they don't go mutate existing rows.
  • Drop: Some can drop columns for “free” too by marking the field as obsolete and only bothering to remove it if the rows are touched.
  • Rename: Don’t do that.

Step

multi-step transition

A multi-step transition is

  • 1. Add new database structure (new columns, new tables, whatever) but leave all the old structure in place
  • 2. Update all servers with code that writes in the new format but understands how to read both the new and old structures
  • 3. Migrate the data that only exists in the old structure
  • 4. Get rid of the old stuff from the database
  • 5. Get rid of the code that is responsible for reading the old format

Conceptually it's straightforward but it can take a long time in calendar days depending on the deployment schedule, it can be tough to keep track of what's been migrated, and the data migration will cause performance issues if you don't plan it properly (e.g. trying to do a migration that locks an important table).

You just have do it in a way where each individual change is backward compatible and you don't move on to the next change until the previous one is rolled out everywhere.

transaction

Most of the migration script are done in a transaction because nobody want a migration to partially succeed but some database transaction are automatically committing any change. Ex: CREATE INDEX CONCURRENTLY in Postgres databases can't run in a transaction.

Flyway do everything in a transaction but in the non transnational cases, the migration must be done manually and update Flyway's schema_version table, which is annoyingly complex (11 columns, some with no obvious purpose).

Rollback

By deploying small and often, rollback are barely needed. It’s often quicker to fix code and deploy than reverse a migration (especially across hundreds of databases). If we needed to reverse something, we could just push another migration negating whatever we did that went boom.

Why roll back when you can roll forward?

Method

Delta in release / Definition in trunk

The truth is always in the release. The trunk has always errors. Example

Error(2739,15): PL/SQL: ORA-00904: "WAT_EVER_COLUMN": invalid identifier

Tool

Framework Script Language Desc
My Batis Native Java Version management in the script name (undo management and prod script creation)
Flyway 2013 Native Java Version management in the script name
LiquiBase Xml Java Database agnostic through the use of an XML file the databaseChangeLogFile
Ghost from Github gh-ost is a triggerless online schema migration solution for MySQL using the binary log. Must see doc
DbDeploy Native script Java, C# SQL delta scripts management with versioning in the file name
DBMaintain Native Java Script with version in the file name
Roundhouse Native C#

Log

Change Log is generally saved in a table within the target database.

Example for DbDeploy:

Column Data Type Description
ChangeId INT Auto-incrementing unique ID for each change entry
Folder VARCHAR(256) Versioned folder name (Example: v1.0).
ScriptNumber SMALLINT Sequential script number within folder.
ScriptName VARCHAR(512) File name including extension.
StartDate DATETIME Date and time script started.
CompleteDate DATETIME Date and time script ended even if it failed.
AppliedBy VARCHAR(128) User account that ran the script.
ScriptStatus TINYINT 0 = Failure, 1 = Success, 2 = Problem Resolved, 3 = Started
ScriptOutput VARCHAR(MAX) Full output of the script execution.

Rule

The rules that I will search in a database migration tool:

  • No version in the file name. It means that you can apply any naming convention. Example flyway makes mandatory to have the number in the file name.
  • Version must be a metadata value in the sql file. A time for instance.
  • No rollback capability. If there is an error, stop the release, change the queries, commit them in the version repository and restart. If we needed to reverse something, we just push another migration negating whatever we did.
  • Compare the set of sql and their timestamp on the file system and in the log table before applying.

Documentation / Reference





Discover More
Card Puncher Data Processing
Code Shipping - Change and Deployment Pipeline - Development Lifecycle / Workflow

Deployment pipelines describe how a change in your application moves through your infrastructure into production. It lays out all necessary automated and manual steps. Every step that can be automated...
Data System Architecture
Data - Migration

Migrating data from an existing store to a new one. See also: The migration of data can be broken down as follows: Encapsulate access to the data in an appropriate data type. Modify the implementation...
Data System Architecture
Database Migration - Flyway

Flyway is a database migration tool. versioned (prefix= 1,2,...) repetable (prefix=R). State by script: pending, outdated, success By default, the configuration will be read...



Share this page:
Follow us:
Task Runner