Per-project Postgres
Here’s a little trick I worked out a while back. Along the way I’ll introduce two tools, direnv and Nix, which I’ve found super helpful when developing many different projects on the same computer. Each of these three tricks are useful by themselves but I find they go together especially well.
When I do web development with Django, I appreciate that it’s easy to set up a local development copy of a Django site using an in-process SQLite database. That makes it quick and easy to get started, because I don’t have to have a database server running on my laptop, or log in as an administrator to create accounts and databases and set access control rules.
On the other hand, many code bases require a specific database backend. And even in projects that have a multi-database abstraction layer over SQL, testing using a different database that you use in production is a recipe for unpleasant surprises.
So I worked out how to set up a Postgres database that’s almost as convenient as SQLite. (I’ve used this same approach for Elasticsearch as well; doing the same for MySQL or other databases is left as an exercise for the reader.)
Configuring Postgres
My goal was to make starting the Postgres database server as simple as
typing postgres
and pressing “enter”. This is still slightly more
complex than SQLite, where you don’t have to start a database server at
all, but there are also some big advantages to having Postgres log
output available in a terminal during development, so I’m going to
pretend this is actually a feature.
For this to work, Postgres needs to get all of its settings either from environment variables or config files. So I wrote this shell script fragment:
# Place the data directory inside the project directory
export PGDATA="$(pwd)/postgres"
# Place Postgres' Unix socket inside the data directory
export PGHOST="$PGDATA"
if [[ ! -d "$PGDATA" ]]; then
# If the data directory doesn't exist, create an empty one, and...
initdb
# ...configure it to listen only on the Unix socket, and...
cat >> "$PGDATA/postgresql.conf" <<-EOF
listen_addresses = ''
unix_socket_directories = '$PGHOST'
EOF
# ...create a database using the name Postgres defaults to.
echo "CREATE DATABASE $USER;" | postgres --single -E postgres
fi
Other than those settings, the Postgres defaults all work out great. If you execute the above script every time you start working in a given project, you’ll get a valid Postgres data directory nested inside that project and your environment will be set up correctly for both client and server to communicate with each other.
You need the postgres server binaries (initdb
and postgres
) on your
$PATH
for this to work. Then starting the project’s database server is
just:
$ postgres
LOG: database system was shut down at 2019-05-29 19:33:03 PDT
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
And starting a database shell is equally simple:
$ psql
psql (9.6.11)
Type "help" for help.
jamey=#
Automate environment setup with direnv
Having to remember to run that script every time you open a new terminal or switch to working on a different project would be a pain, though. So I use direnv. From that project’s description:
direnv
is an environment switcher for the shell. It knows how to hook into bash, zsh, tcsh, fish shell and elvish to load or unload environment variables depending on the current directory. This allows project-specific environment variables without cluttering the~/.profile
file.
The direnv project ships a “standard library” of shortcuts for
commonly-used environment settings. You can extend it by adding
definitions to ~/.direnvrc
, which makes those definitions available to
any environments you set up. So I added this function to my
~/.direnvrc
:
layout_postgres() {
export PGDATA="$(direnv_layout_dir)/postgres"
export PGHOST="$PGDATA"
if [[ ! -d "$PGDATA" ]]; then
initdb
cat >> "$PGDATA/postgresql.conf" <<-EOF
listen_addresses = ''
unix_socket_directories = '$PGHOST'
EOF
echo "CREATE DATABASE $USER;" | postgres --single -E postgres
fi
}
This is exactly the same shell fragment as before, except for two changes:
-
I’ve wrapped the whole thing in a function called
layout_postgres
, which allows me to calllayout postgres
(note the space instead of underscore) in any.envrc
file for any of my projects. -
Instead of locating
PGDATA
inside$(pwd)
, I’ve used$(direnv_layout_dir)
, which is direnv’s conventional location for extra files and directories needed for a particular project layout. It defaults to.direnv/
inside the same directory as.envrc
.
For more ideas on how you might use direnv, check out the direnv project wiki.
Ensure needed programs are available
In most operating systems, if you install the Postgres server software, your OS vendor will assume you want to be running a systemwide database instance. For local development, though, I want to be sure I do not have a systemwide database running, especially not one that could be listening for network connections from any random person in whatever coffee shop I happen to be visiting.
In fact, for this purpose, we shouldn’t need administrative privileges at all. You could download a copy of the Postgres source code, compile it in your home directory, and run it from there, if you wanted. That’s tedious, though.
What I do instead is I use the Nix package manager to set up an environment which contains exactly the software I want, possibly by building it from source for me if it doesn’t have a pre-built binary available already. Among other things this means I can easily have different versions of software installed in different project environments, without any conflicts.
(I believe you can use GNU Guix to do exactly the same things. In the following, every time I say “Nix”, you may assume I added “or probably Guix, but I haven’t tried it, so your mileage may vary.”)
Conveniently, the direnv standard library includes a shortcut for adding
Nix packages to an environment. So my .envrc
for a project which needs
a Postgres database can be as simple as this:
use nix -p postgresql_9_6
layout postgres
I’ve only tested this with Postgres 9.6, so I requested that version specifically here, but the Nix packages collection includes newer versions as well. If you need to test one project against an older version, those are available too.
Summary
Postgres makes it straightforward to set up per-project data directories which keep all the mess contained within the project itself. Unfortunately, the usual installation procedures obscure this by assuming that you’re setting up a production server. As far as I know, the approach I’ve outlined here should work, more or less, on every platform Postgres supports; except that I assume on Windows it does not support Unix sockets, so you’d need to configure a unique TCP port for each data directory instead.
I’ve found direnv to be extremely convenient for automating all sorts of environment setup tasks, and would recommend it to developers on any platform.
In my experience direnv becomes even more useful in conjunction with the
Nix package manager because then no project dependencies need to be
installed system-wide, and different projects can use different versions
without conflicts. Several programming languages have their own
language-specific mechanisms for doing this, such as Python’s
virtualenv
or pipenv
tools, but direnv plus Nix extends that to all
the software needed for a project, regardless of which language it was
written in. However, Nix is best tested on Linux, is supported on macOS,
and is questionable on anything else; so if you need to use, say, native
Windows software, then this might not be the best tool for you.
I hope you find one or more of these tricks useful!