Flyway

Goals

Concepts

Library

Dependencies

Lesson

One of the most onerous aspects of developing a database application is schema management. An application's domain model is seldom completely understood up front; it evolves over time. The domain model should drive the database schema, which means schema change is inevitable during the course of development.

A changing schema presents several complications.

Managing these aspects of incremental schema change is referred to as schema migration, and with all but the smallest projects is difficult to achieve without the help of an automated schema migration tool. Flyway is a popular, simple, and open-source migration tool for Java. It provides a library and API for automated migrations; a command-line interface; and a Maven plugin for automated migrations during builds.

Migration

Flyway schema_version table.
Flyway schema_version table. (How Flyway works)

To manage schema migrations, Flyway creates and maintains a new table in your database named schema_version which keeps track of which schema versions schema change that has been applied to the database. Developers give each incremental schema change a unique version number—even the DDL statements that create the initial database schema itself.

Initial Flyway migration of empty database.
Initial Flyway migration of empty database. (How Flyway works)

When it comes time to migrate the schema, Flyway checks the schema_version table and applies all available incremental migrations, one at a time, in order. After successful migration, Flyway updates the schema_version table with the results of the migration. Future migrations will note the current version of the schema and perform no changes if the schema is already at the latest version.

Flyway schema_version table after initial migration.
installed_rank version description type script checksum installed_by installed_on execution_time success
1 1 Initial database schema. SQL V1__Initial_database_schema.sql 1234567890 jdoe 2018-01-02 33:44:00.0 543 TRUE
2 2 Added item color. SQL V2__Added_item_color.sql 9876543210 jdoe 2018-01-02 33:55:00.0 123 TRUE

SQL-Based Migrations

Discovery of SQL-based migration files.
                           prefix separator          suffix
                                ↓ ↓                  ↓
src/main/resources/db/migration/V2__Added_item_color.sql
                                 ↑         ↑
                              version    description

The easiest and preferred way to define migrations is to provide .sql files containing the actual SQL commands needed to be executed for each migration. Usually these files contain the initial DDL statements to define the schema, later schema updates, and DML statements for bulk data updates. Flyway automatically discovers SQL-based migrations if they are placed in the src/main/resources/db/migration directory of your project and use a particular format for the filename as shown in the figure. The format of many of the filename components are configurable. The version number can contain full stop . characters, as you would expect with semantic versions, and underscore _ characters, with as many components as needed. The version is separated from the description by two underscore __ characters. You can include SQL -- comments or C-style multi-line /*…*/ comments as needed in the file.

The initial migration file usually creates the database schema. In this case, the first migration would contain the DDL statements from the earlier lesson on SQL, shown below.

src/main/resources/db/migration/V1__Initial_databse_schema.sql
-- Initial database schema.
CREATE TABLE Supplier (
  id INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name VARCHAR(9999) NOT NULL UNIQUE,
  address VARCHAR(9999) NOT NULL
);

CREATE TABLE Item (
  code VARCHAR(99) NOT NULL,
  name VARCHAR(9999) NOT NULL,
  stock INTEGER DEFAULT 0 NOT NULL,
  supplierId INTEGER NOT NULL,
  CONSTRAINT Item_pk PRIMARY KEY (code),
  CONSTRAINT Item_name_un UNIQUE (name),
  CONSTRAINT Item_supplierId_fk FOREIGN KEY (supplierId) REFERENCES Supplier(id)
);

CREATE TABLE Category (
  id INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  name VARCHAR(9999) NOT NULL,
  CONSTRAINT Category_name_un UNIQUE (name)
);

CREATE TABLE ItemCategory (
  itemCode VARCHAR(99) NOT NULL,
  categoryId INTEGER NOT NULL,
  CONSTRAINT ItemCategory_pk PRIMARY KEY (itemCode, categoryId),
  CONSTRAINT ItemCategory_itemCode_fk FOREIGN KEY (itemCode) REFERENCES Item(code),
  CONSTRAINT ItemCategory_categoryId_fk FOREIGN KEY (categoryId) REFERENCES Category(id)
);

Later as your application evolves, you may decide that you need to support item color as discussed in the lesson on relational algebra. Once you add an SQL DDL file with V2__ in the filename, Flyway will discover and perform this migration to a database that is at a previous migration version.

src/main/resources/db/migration/V2__Added_item_color.sql
-- Add color column to Item table.
ALTER TABLE Item
  ADD color VARCHAR(9999) DEFAULT NULL;

Java-Based Migrations

While you should prefer SQL-based migrations, some DDL or DML manipulations are so intricate that they are better performed programmatically using JDBC. The migration might also involve access to other data or configurations only available via Java. You can have SQL-based and Java-based migrations interspersed, as long as their version numbers are in the correct sequence.

Discovery of Java-based migration classes.
                      prefix separator              suffix
                           ↓ ↓                      ↓
src/main/java/db/migration/V3__Added_supplier_uuids.java
                            ↑          ↑
                         version     description

For Java-based migrations, you will create a .java source code file rather than an .sql file, placing it in the src/main/java/ directory tree. The migration logic must be in a Java class in the db.migration package and must implement org.flywaydb.core.api.migration.jdbc.JdbcMigration. Flyway recommends that you extend org.flywaydb.core.api.migration.jdbc.BaseJdbcMigration rather than implementing JdbcMigration directly.

The class must use a particular format for its name as shown in the figure. The format of many of the class name components are configurable. The version number cannot contain full stop . characters, as they are not allowed in Java class names; you can use underscore _ characters instead. The version is separated from the description by two underscore __ characters.

Place migration logic inside your implementation of the JdbcMigration.migrate(Connection connection) method. During migration Flyway will automatically call this method with java.sql.Connection instance that is already connected to the database. You should not close the connection when you are finished, but any statements you create inside the method must be closed as normal.

Imagine that you wish to add a UUID in string form to each supplier. (On products that support it, such as PostgreSQL, you would want to use an actual UUID type.) You would first need to add a new uuid column, which is easily done in SQL. You would also want to add not-null constraint and a unique constraint, both of which are also easily done in SQL. The complication is that, after adding the column but before adding the constraints, you would need to generate UUIDs in the column for the existing rows. As standard SQL provides no UUID functionality (and UUID support across products is inconsistent at best), this is more easily accomplished using the Java java.util.UUID class.

src/main/java/db/migration/V3__Added_supplier_uuids.java
package db.migration;

import java.sql.*;
import java.util.UUID;
import org.flywaydb.core.api.migration.jdbc.JdbcMigration;

/**
 * Adds and populates a <code>Supplier</code> <code>uuid</code> column.
 * @author Garret Wilson
 */
public class V3__Added_supplier_uuids extends BaseJdbcMigration {

  public static final String ADD_SUPPLIER_UUID_COLUMN_SQL = "ALTER TABLE Supplier"
    + "  ADD uuid CHAR(36)";
  public static final String LIST_SUPPLIER_UUIDS_SQL = "SELECT uuid FROM Supplier";
  public static final String LIST_SUPPLIER_UUIDS_SQL_UUID_COLUMN = "uuid";
  public static final String ADD_SUPPLIER_UUID_UNIQUE_CONSTRAINT_SQL = "ALTER TABLE Supplier
    + "  ADD CONSTRAINT Supplier_uuid_un UNIQUE (uuid)"
    + ");

  @Override
  public void migrate(final Connection connection) throws Exception {

    //add Supplier.uuid column
    try(final PreparedStatement preparedStatement = connection.prepareStatement(ADD_SUPPLIER_UUID_COLUMN_SQL)) {
      preparedStatement.executeUpdate();
    }

    //generate a different UUID for each Supplier.uuid
    try(final PreparedStatement preparedStatement = connection.prepareStatement(LIST_SUPPLIER_UUIDS_SQL,
        ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE)) {
      try(final ResultSet resultSet = statement.executeQuery()) {
        while(resultSet.next()) {
          final UUID uuid = UUID.randomUUID();
          resultSet.updateString(LIST_SUPPLIER_UUIDS_SQL_UUID_COLUMN, uuid.toString());
          resultSet.updateRow();
        }
      }
    }

    //add Supplier.uuid unique constraint
    try(final PreparedStatement preparedStatement = connection.prepareStatement(ADD_SUPPLIER_UUID_UNIQUE_CONSTRAINT_SQL)) {
      preparedStatement.executeUpdate();
    }
  }
}

Integration

There are several ways to invoke Flyway migration. Although Flyway provides a command-line tool for isolated manual migrations, it is most useful to integrate Flyway into the build process itself so that migrations happen automatically on a coordinated bases during development.

Maven

The Flyway Maven Plugin provides access to Flyway migration via the Maven POM using the org.flywaydb:flyway-maven-plugin dependency in the <plugins> section. The migrate goal by default is bound to the Maven pre-integration-test phase, so you can be assured that migration occurs after the project is successfully built and packaged, but before integration tests begin. The simple configuration below will automatically discover migration scripts and migrate the schema for the database indicated by <url>. You can also migrate manually using mvn flyway:migrate, even if you don't bind the plugin to any goal in the <executions> section.

Using the Flyway Maven Plugin to integrate automatic migration into the build..
<project …>
  …
  <build>
    <plugins>
      …
      <plugin>
        <groupId>org.flywaydb</groupId>
        <artifactId>flyway-maven-plugin</artifactId>
        <version>5.0.7</version>
        <executions>
          <execution>
            <goals>
              <goal>migrate</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <serverId>warehouse-db</serverId>
          <url>jdbc:postgresql://example.com/warehouse</url>
        </configuration>
      </plugin>
…

The Maven plugin has several other goals to help you manage your schema in addition to actual migration. The goals that have default phases are bound to the pre-integration-test phase. See the documentation for each goal to find out the available plugin settings. The Summary section has a convenient table with more details on each goal.

migrate
Migrates the database.
clean
Drops all objects from the schema.
info
Provides information about migration history and status.
validate
Validates the migrations against those available.
baseline
Baselines the database to a specific version by setting the database to the specified baselineVersion and running migrations only after this version.
repair
Repairs the Flyway metadata table schema_version, removing failed migrations and updating the checksums to match those of the currently available migrations.

Java

If you want more control over migration, and/or you want to integrate the migration process into the application logic itself, you can directly access the Flyway Java API by including the org.flywaydb:flyway-core dependency. The main class for accessing Flyway is org.flywaydb.core.Flyway. Once you create and configure a Flyway instance, you can invoke the equivalent of the Maven plugin commands by calling methods such as Flyway.migrate(). Java integration via the API still requires that schema migration files be configured in the project as explained above.

Most importantly you will need to specify the database for Flyway to use. If you are using a javax.sql.DataSource, you can indicate this by calling Flyway.setDataSource(DataSource dataSource). Otherwise if you have a URL connection string, you can call Flyway.setDataSource(String url, String user, String password, String... initSqls).

Programmatically migrating a database using the Flyway API.
final String CONNECTION_URL = "jdbc:postgresql://example.amazonaws.com:5432/warehouse";
final Flyway flyway = new Flyway();
flyway.setDataSource(CONNECTION_URL, "jdoe", "secret"); //example only; do not include credentials in source code!
flyway.setInstalledBy(System.getProperty("user.name")); //use system user as user performing the migration
flyway.migrate();

CLI

Flyway also comes with a downloadable command-line tool which may be used to manage migrations independently from Maven. Without the benefit of Maven's dependency resolution and configuration and the lack of a POM-based configuration, this tool requires a separate configuration. Documentation can be found on the CLI tool page.

Review

Summary

Maven Plugin Goals and API Methods
Maven Goal API Method Description Default Phase
baseline Flyway.baseline() Baselines the database to a specific version by setting the database to the specified baselineVersion and running migrations only after this version. N/A
clean Flyway.clean() Drops all objects from the schema. pre-integration-test
info Flyway.info() Provides information about migration history and status. N/A
migrate Flyway.migrate() Migrates the database. pre-integration-test
repair Flyway.repair() Repairs the Flyway metadata table schema_version, removing failed migrations and updating the checksums to match those of the currently available migrations. N/A
validate Flyway.validate() Validates the migrations against those available. pre-integration-test

Gotchas

In the Real World

Baselining Schemas

Normally you would never need to baseline a database if you used Flyway to manage your schema from the start, including the initial DDL statements. In the real world you may be presented with a schema that already exists and which your tour DBA has been migrating manually. You will likely want to create Flyway migrations containing the table creation DDL along with historical schema changes, perhaps to use with an in-memory database for testing, but the DBA has already performed those changes manually on the production database—the earlier Flyway migrations have effectively already been performed. Baselining thus becomes a valuable tool for the initial switchover to Flyway, to indicate which migrations have effectively already been applied to the database in the past, and to only apply new Flyway migrations going forward.

Changing Keys and Inter-Table Migration

Correlated subqueries are particularly useful when a schema migration moves or copies information from one table to another, allowing you to ensure that the data is assigned to the appropriate rows. The V3__Added_supplier_uuids.java migration above added a UUID column named uuid to the Supplier table, producing Supplier(id, name, address, uuid). The items are still related to the suppliers using the generated ID: Item(code, name, stock, supplierId, color). Now that each supplier has a UUID, there is no need to maintain a surrogate key generated by the database. You can create an SQL migration that will add a new supplierUuid column to the Item table, copy over the correct UUIds, and then switch to using Supplier.uuid as the new primary key, with Item.supplierUuid as the foreign key.

src/main/resources/db/migration/V4__Make_Supplier_uuid_primary_key.sql
-- Item
-- Add supplierUuid column to Item table.
ALTER TABLE Item
  ADD supplierUuid CHAR(36);
-- Copy Supplier.uuid into new Item.supplierUuid column.
UPDATE Item
  SET Item.supplierUuid =
    (SELECT uuid from Supplier
      WHERE Item.supplierId = Supplier.id);
-- Drop the old Item.supplierId foreign key column.
ALTER TABLE Item DROP COLUMN supplierId;

-- Supplier
-- Drop the old Supplier.id column.
ALTER TABLE Supplier DROP COLUMN id;
-- Switch to using Supplier.uuid as the primary key.
ALTER TABLE Supplier ADD CONSTRAINT Supplier_uuid_pk PRIMARY KEY (uuid);

-- Item
-- Go back and and make Item.supplierUuid a foreign key.
ALTER TABLE Item ADD CONSTRAINT Item_supplierUuid_fk FOREIGN KEY (supplierUuid) REFERENCES Supplier(uuid);

Think About It

Self Evaluation

Task

  1. Put your current database schema migration under control of Flyway.
    • Create all the necessary migrations to create the schema from scratch using DDL statements, as well as to migrate the schema to its current state.
    • Use Flyway Java integration to turn on Flyway migration for your in-memory database tests.
    • Add Flyway Maven integration to update the production database schema as part of the build. As you've already created and updated the production database schema manually, you'll first need to baseline the schema to the appropriate Flyway migration version.
    • Do not add populate books and publishers as part of your migrations. You should be able to add data separately as part of the load-snapshot command.
  2. Make it possible for each publisher to have multiple web sites.
    • Create an additional Flyway SQL-based migration to add the appropriate table and copy over existing data as needed.
    • Make sure you remove any unnecessary table columns, as well as update your Java code and tests.
    • Find a publisher with multiple web sites and add it those web sites your snapshot data. Try to find a publisher that already exists as part of your snapshot data.
    • Load a publisher with multiple web sites into your production database, either manually or via load-snapshot.
  3. Switch to using a surrogate UUID key to identify publishers in the database, to allow for multiple publishers with the same name and to support changing a publisher's name in the future.
    • Choose the appropriate column type for the UUID and be prepared to explain your decision.
    • Create an additional Flyway Java-based migration to generate the UUIDs.
    • The schema changes must be part of the same migration so they will be performed as one atomic change.

See Also

References

Resources

Acknowledgments