Following up on Running Postgres with Daemontools – Shutdown Errors, the Postgres mailing list had some advice on monitoring and automatically restarting Postgres.
It seems that automatically restarting the Postgres postmaster process is a dangerous proposition. As mentioned on the mailing list, if a process quits for whatever reason, daemontools will simply restart it. In theory, the process could fail and restart 3600 times per hour, every hour, every day, without you ever being notified that there is a problem that requires attention. This may be fine for some things, but not Postgres.
Some people indicated they use a custom script that tries to connect to Postgres and notifies them if the connection fails. Others said the program monit will do the same job. I believe monit is quite extensible and will allow you to be notified as well as restart the server.
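To make the custom-script idea concrete, here is a minimal sketch of what such a check might look like. It is not anyone's actual script from the list. The function names, the default host and port (127.0.0.1:5432), and the pluggable notify callback are all my own assumptions. It only tests that something accepts a TCP connection on the Postgres port, which is weaker than running a real query, so treat it as a starting point.

```python
import socket
from typing import Callable


def postgres_is_reachable(host: str = "127.0.0.1", port: int = 5432,
                          timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: Postgres listens on TCP at this address. This does not
    authenticate or run a query; it only checks that the port is open.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_and_notify(notify: Callable[[str], None],
                     host: str = "127.0.0.1", port: int = 5432) -> bool:
    """Run one check and call notify(message) only when the check fails.

    notify is a hypothetical hook: it could send an email, page someone,
    or just print, depending on what you pass in.
    """
    ok = postgres_is_reachable(host, port)
    if not ok:
        notify(f"Postgres is not accepting connections on {host}:{port}")
    return ok
```

Calling check_and_notify(print) from cron every minute or so would be the simplest way to use it; swapping print for a function that sends mail gets you the notification part. Note that it deliberately does not restart anything.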
Still another person said they run the nagios monitoring program and are notified by email and pager of a failure.
Whatever the solution is, it seems only right to me that you should be notified of a failure if you so desire. At the same time, the mechanism that is handling the process should provide a clean shutdown. Apparently monit does that.
While I can’t advocate daemontools for Postgres, I would certainly say it is time to revisit nagios and monit. I’ve used nagios in the past but had resource issues with it. I’ve perused the monit documentation and enjoyed the flexibility. Now I’ll be looking at them both again and reporting back.