Running Postgres with Daemontools – Shutdown Errors

Running anything under daemontools seems like a great idea, but I’ve had a particularly bad time trying to shut down postgres under daemontools while running Apache and PHP with persistent postgres connections. I had a VPS that was running postgres with daemontools, and I let it lapse because I wasn’t getting enough use out of it. Instead I focused on my main machine. When I went to upgrade Postgres there, I noticed that I didn’t have it running under daemontools.

I set to work getting it running. I even prepared a tutorial for you on running postgres under daemontools. I got everything working relatively easily using some of the how-tos found on the net. Everything was working well, and just as I was about finished I decided that bringing it up was good, but that I should verify shutdown too.

I issued the required svc -d /service/postgres and everything went wacky. Every 3 minutes the log kept saying “FATAL: the database system is shutting down”. After examining ps output I found a defunct postmaster and a bunch of active connections.
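A check along these lines would show the same picture. It is only a sketch: it assumes a ps that accepts -eo pid,stat,args and the usual “postgres: ...” process titles, and the function names are invented here for illustration.

```shell
#!/bin/sh
# Both functions read `ps -eo pid,stat,args` style lines on stdin.
# The function names are hypothetical, not part of postgres or daemontools.

# Print defunct (zombie) processes: their STAT column starts with Z.
list_defunct() {
    awk '$2 ~ /^Z/'
}

# Count processes whose title starts with "postgres:". This covers
# client backends and also the server's own helper processes.
count_postgres_children() {
    awk '$3 == "postgres:" { n++ } END { print n + 0 }'
}

# Example (not run here):
#   ps -eo pid,stat,args | list_defunct
#   ps -eo pid,stat,args | count_postgres_children
```

A non-zero count alongside a defunct postmaster is the state described above: the parent is gone, but the backends serving the persistent connections are still around.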

After several hours of reading and trying I’m pretty sure I came up with the answer. From the postgres manual regarding pg_ctl:

In stop mode, the server that is running in the specified data directory is shut down. Three different shutdown methods can be selected with the -m option: “Smart” mode waits for all the clients to disconnect. This is the default. “Fast” mode does not wait for clients to disconnect. All active transactions are rolled back and clients are forcibly disconnected, then the server is shut down. “Immediate” mode will abort all server processes without a clean shutdown. This will lead to a recovery run on restart.

svc -d sends a TERM signal followed by a CONT signal, and postgres treats TERM as a request for a smart shutdown. It certainly wasn’t adding the -m option. But I did notice that the init script that ships with postgres (which was successful) performs the following on a ‘stop’ request:

echo -n "Stopping PostgreSQL: "
su - $PGUSER -c "$PGCTL stop -D '$PGDATA' -s -m fast"
echo "ok"

In other words, the -m fast switch means it doesn’t wait for clients to disconnect. This is an important point if you run PHP with persistent connections. That explains why those process entries lingered after svc -d despite the defunct postmaster. Postgres was probably waiting on those persistent connections to disconnect before finishing the shutdown that svc -d had requested. That’s my guess. svstat kept saying it was waiting to shut down.
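The shutdown mode the postmaster picks depends on the signal it receives: TERM for smart, INT for fast, QUIT for immediate. The sketch below writes that mapping down, along with the svc option that delivers each signal. The function names are made up for illustration, and the idea in the closing comment (svc -d followed by svc -i) is something I have not verified, not a confirmed fix.

```shell
#!/bin/sh
# Hypothetical helpers; nothing here ships with postgres or daemontools.

# Map a pg_ctl shutdown mode to the signal the postmaster expects.
pg_signal_for_mode() {
    case "$1" in
        smart)     printf '%s\n' TERM ;;
        fast)      printf '%s\n' INT ;;
        immediate) printf '%s\n' QUIT ;;
        *)         echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# Map a shutdown mode to the svc option that delivers that signal.
# svc has no option for QUIT, so immediate mode is not available this way.
svc_option_for_mode() {
    case "$1" in
        smart) printf '%s\n' "-t" ;;
        fast)  printf '%s\n' "-i" ;;
        *)     echo "no svc option for mode: $1" >&2; return 1 ;;
    esac
}

# Untested idea: mark the service down, then escalate to a fast shutdown.
#   svc -d /service/postgres
#   svc "$(svc_option_for_mode fast)" /service/postgres
```

Seen this way, svc -d lines up with pg_ctl stop in its default smart mode, while the init script’s -m fast corresponds to INT.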

Thus ends my attempt to run postgres under daemontools for now. Starting it was successful, reloading configuration files was successful, but that’s where it ends. As a guy who has had both data loss and corrupt filesystems from non-clean postgres shutdowns, I’m pretty concerned about what will happen when the computer reboots if it can’t disconnect those connections.

Or for that matter, what will happen if postgres shuts down and supervise restarts it, but those persistent connections remain open? I’m betting Apache will report that it is unable to connect to the database server. I’ve noticed in the past that changing a user’s search path in postgres on the command line (through psql) and then reloading a web page has absolutely no effect, because the web server is holding open a persistent connection to which those changes don’t apply.

This will take some more examination, but for now it may be better to look at another tool such as monit, or consider getting rid of persistent connections.