There are a ton of database migration tools out there, and they all have a similar set of features. A lot of times the decision on what migration tooling to use is decided by the ORM or database driver you’re using.
For a while now I’ve wanted to dig into what a bare minimum migration tool that uses standard Unix tools and
psql, the PostgreSQL CLI, would look like. My goal here is to have something that should work on any general Unix system without having to lean on my ORM or other tools to use it.
Run scripts with psql
To get started, connect to the database with
psql. There are several ways to set connection details, but I like to use use a connection string:
The rest of this post will take for granted that
$DATABASE_URL is set.
Once successfully connected, the
psql command presents a REPL that allows query input and other
psql specific commands. Hitting
Ctrl-d will exit
Beyond the REPL mode,
psql allows a
-f [filename] argument to be passed, which causes it to run the given file.
This is great for the goal of running migrations, because it means a migration in a
.sql file composed of a sequence of SQL queries can be run with
psql with the
There are a few other options on the
psql command that look handy. Giving
-v ON_ERROR_STOP=1 option, which sets the
ON_ERROR_STOP variable to
1 does exactly what it looks like. If an error occurs in a migration, it will return an error status code.
-1 looks helpful. It will run an entire
-f script in a single transaction.
With all of those commands combined, here’s the final
psql migration argument set:
It’s a start!
Migration State Management
Most migration systems track state in a table within the database. The table tracks what migrations have and haven’t been run, and usually some extra metadata like when they were run.
Here’s a table creation query for a basic migration script:
This table has two columns. An
id column that tracks the version for a recorded migration, and
migrated_at that records the time the migration occurred.
To record to the migrations table, a query like this needs to be run:
It’d be a little annoying to write that each time, though, so why not define a function within PostgreSQL for inserting a migration version:
This takes a migration id and records it to the
migrations table. It can be run like this:
Because the migrations table exists and typically when a new migration is run it’s expected to be working on an existing state of the database, it can also be helpful to have a function that will assert that the database state is as expected.
This one’s a little more complicated:
- It takes in an expected migration version as an argument.
- Queries for the latest migration version in the
migrationstable, and set it to
- Compare the expected migration version to
latest_id. If they aren’t equal, raise an error.
This function can be used like this:
If the latest migration in the
migrations table doesn’t match the argument, which in the example above is
1, an error will be raised. This fits nicely with the
-v ON_ERROR_STOP=1 option in
Putting all of this together, the first migration file is ready to go:
Running this with the
psql command above will apply it to the database:
Adding a Real Migration
Running a migration to set up migrations isn’t that satisfying. Here’s a first real migration, which creates a table of users:
This uses the
assert_latest_migration function from above to ensure the database state starts out as expected. It then creates the
users table, and closes by using the
log_migration function to record the new migration version.
Run it with
psql -1 -v ON_ERROR_STOP=1 -f migrations/001-create-users.sql
This minimal migration system seems to be working!
Bulk Executing Migrations
There’s a great way to run a single migration, but what about all migrations? There are great Unix command line tools that can help piece together running all migrations.
find will help get a list of all migrations in a directory:
This finds all files with a
.sql extension in the migrations directory. Find doesn’t sort that list, though, so pipe the output from find into
This outputs a list of all the filenames sorted. Since
psql needs a
-f in front of each filename,
printf along with
xargs can set that up:
And then another
xargs runs the migrations:
This will run all migrations and exit if any of them fail. They’ll be run in a single transaction.
It’s probably not practical to remember and type these
psql based commands out, so some simple
bash scripts can help give quick commands for running a single migration or running all migrations:
Running this script will make sure the given argument is a file, and if not it’ll give a usage message. Otherwise it’ll run the migration script.
This script expects a directory as an argument, and runs all of the migrations in the given directory, in order.
There’s a problem with this script, though, that will take a while to hit. There is a limit to how long an argument can be, and
xargs is simply concatenating all files into a huge list. There are limits to how long this argument list can be, though. Run
getconf ARG_MAX to see the maximum length of a command. On my computer right now it’s
1048576. It’s long, but not infinite.
It’s ideal to just use a loop in bash instead:
bin/migrate-all (take 2):
And with that, the minimal migration solution is ready to use.
This isn’t the most full featured migration system.
Run only missing migrations
This doesn’t currently support running the latest migrations. In practice I don’t generally miss that feature, because I prefer to drop my database and rebuild it in development. During deployment it’s good to take a minute and think through the migrations that need to be run and review them, rather than just running a migrate command.
It’s certainly missing any way to roll back migrations. My general experience is that I seldom to never migrate back, and that frequently migrations aren’t able to be undone, so this doesn’t bother me too much, but of course your team’s mileage may vary there.
If you’d like to see the full example code from this, check out nalanj/migrations. Feel free to throw an issue or a discussion into that repo.