Tag Archives: upgrading

Upgrading PostgreSQL to a new major version

Upgrading PostgreSQL to a major version can be a challenge but isn’t difficult. You need to plan ahead and do some testing afterward, but PostgreSQL includes everything you need to make the change almost seamlessly. I give you a step by step rundown of how to make the upgrade and gain all the benefits that come with a new Postgres version.

Upgrading PostgreSQL

Upgrading PostgreSQL need not be hard. The creators have provided all facilities you need to upgrade nearly painlessly, and depending on the extent of your use, nearly seamlessly.

These notes were made when I was upgrading from PostgreSQL 8.2.6 to 8.3.0, but the information should apply equally to other versions. Keep in mind the version number are specific to these releases, and to substitute your own accordingly.

According to the PostgreSQL INSTALL doc, here’s their order of install (not upgrade), but the procedure is almost identical with the addition of backing up and restoring your old data.

./configure
gmake
su
gmake install
adduser postgres *for new install only
mkdir /usr/local/pgsql/data
chown postgres /usr/local/pgsql/data
su - postgres
/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data >logfile 2>&1 &
/usr/local/pgsql/bin/createdb test
/usr/local/pgsql/bin/psql test

configure

First, run configure. I changed the install path to reflect the version so that I can run the two servers in parallel.
./configure --prefix=/usr/local/postgres-8.3.0

gmake

Watch for errors. Run gmake, not make. Most systems have gmake as a symlink to a capable compiler. Pretty straight forward. The email list is there to help if you have problems with compilation, which is usually rare. Googling for particular errors is also helpful, and will reveal a fix if you’ve encountered a bug or platform specific problem.

gmake install OR gmake install-strip

If you are not using the --enable-debug option then consider using gmake install-strip when installing. This should reduce the file size of the binary files and libraries as well as save some memory while loading. I use this on any non-development machine.

Make the /data directory

Straight forward here. Create the directory and initialize the new database cluster.

mkdir /usr/local/postgres-8.3.0/data
chown postgres /usr/local/postgres-8.3.0/data
su - postgres
/usr/local/postgres-8.3.0/bin/initdb -D /usr/local/postgres-8.3.0/data

While running initdb you’ll see:

The default text search configuration will be set to "english".
creating dictionaries ... ok

This is evidence of the brand new tsearch which is integrated into the core. Many people had upgrade problems with tsearch in the past, and for me tsearch is an important tool, especially needed for those people migrating from MSSQL, so it is a great step forward to see this included in the core now. Please see below if you were using the old tsearch contrib module.

Compile contrib modules

After you perform the install I would say pause here. Take a look at the contrib directory in the source. Are you using any contributed modules? If so, I would say this is a good place to compile and install them, before you go restoring your old database. If you don’t do this, and the contrib modules had you insert some SQL into your database, when you restore your dump the SQL will be present, but the module won’t, and you will get errors. Read the README file in the top /contrib directory for the procedure. It is usually gmake all;gmake install . Don’t do the regression tests now because you aren’t inserting the SQL code that goes with the modules yet. Be aware that regression tests try to connect to the default post, that means the original server, and will probably fail if you run the new server in parallel on a different port later on. From the manual for contrib modules:

After a major-version upgrade of PostgreSQL, run the installation script again, even though the module’s objects might have been brought forward from the old installation by dump and restore. This ensures that any new functions will be available and any needed corrections will be applied.

Also, if this is version >= 8.3.0 and you are using an older version of tsearch you’ll need to read the manual pages Migration from Pre-8.3 Text Search. Loading a pre-8.3 database that contained contrib/tsearch into an 8.3 will cause problems. It caused my restore to fail. The manual suggests loading contrib/tsearch before restore because it will prevent loading the old objects from the old database.

Edit postgres.conf and other configuration files

Do a diff on your postgres.conf and pg_hba.conf files with the new version. If moving between major versions I would edit them by hand. When moving between minor versions last time I just copied the old over the new as suggested, and found errors in the log at startup because a configuration option had been removed and I was still calling it.

Backup your old database

Seems like an odd place to be performing the backup procedure doesn’t it? Yes, but the PostgreSQL manual has a good reason for it:

To make the backup, you can use the pg_dumpall command from the version you are currently running. For best results, however, try to use the pg_dumpall command from PostgreSQL 8.3.0, since this version contains bug fixes and improvements over older versions. While this advice might seem idiosyncratic since you haven’t installed the new version yet, it is advisable to follow it if you plan to install the new version in parallel with the old version. In that case you can complete the installation normally and transfer the data later. This will also decrease the downtime.

The reason here is that the pg_dump or pg_dumpall binary from the new release has bug fixes over the old version. Now it makes sense. With that in mind I backed up to a file first, just as a precaution.

/usr/local/postgres-8.3.0/bin/pg_dumpall -p 5432 -U postgres | bzip2 -zv --best > pre_8.3_dump.sql.bz2

This command is using the NEW version of the pg_dumpall binary (8.3.0) and telling it to run on the currently running PostgreSQL backend (8.2.6) which is on port 5432, and store the output in a bzipped filed in the current directory.

Start new PostgreSQL server on a different port


su - postgres
/usr/local/postgres-8.3.0/bin/postgres -p 5543 -D /usr/local/postgres-8.3.0/data >logfile-8.3.0 2>&1 &

This starts the new server in parallel to the old running at the same time by using port 5543 for restore and testing. We’ll change it back to the default port once we know everything works right. Make sure you ps aux to see both servers working. The first time I did this the random port I chose for the new server was in use by something else and PostgreSQL refused to start.

Loading the data

There are a couple ways to do this. You can reload the dump that you just saved, or you can pipe the output of the pg_dumpall command into the new postmaster process that is running on an alternate port. I usually prefer to go with the pipe.

/usr/local/postgres-8.3.0/bin/pg_dumpall -p 5432 -U postgres | /usr/local/postgres-8.3.0/bin/psql -p 5543 -U postgres

This runs a dump on the old server (port 5432) and immediately restores it into the new server (port 5543) using a pipe to psql. Use the pg_dumpall and psql binaries from the new version you just compiled to gain bugfixes and performace benefits.

Remember after you load the old data into the new server to apply the SQL files for any contrib modules to your new database. You should also check the output of your server file (logfile-8.3.0) for any notices errors using a grep command like cat -n logfile-8.3.0 | grep -i error . Also vacuum the entire cluster after reloading:
/usr/local/pgsql-8.3.0/bin/vacuumdb -p 5543 -a -z

Switch servers

You could run the new server on the alternate port for a period of time with some of your test or staging applications to make sure everything is working properly and that any of the feature changes that may affect your software have time to be changed. If you do this, then just reinit the cluster and do the parallel dump again when you are ready. If you don’t you’ll lose any data inserted in that time – which could be days.

Otherwise, making the switch is simple. Shutdown both servers, and then restart the new server on the same port that the old one ran on. If you started the original PostgreSQL with an init script then shut it down using that init script, edit the script to refer to the new version, and use that init to start the new server with /etc/init.d/postgres start

Lastly, remember to change your environment to point to the new postgres version. This isn’t strictly required but if you use psql on the command line often you should set your PATH and MANPATH to refer to the correct location. If your postgres is installed in /usr/local/postgres-8.3.0 for example, you could use a symbolic link called /usr/local/pgsql that points to the currently installed PostgreSQL version. That way you won’t have to change profile, or setup scripts for web based interfaces, etc.

Upgrading PostgreSQL to a major version takes some time, but it isn’t difficult. If you employ this same procedure for your minor version upgrades you’ll find it become second nature. Practise makes perfect.

Upgrading Postgres in parallel with your old version

Today I got caught up in PostgreSQL. A feature didn’t work as planned and when the mailing list responded to my plea for help it turned out the feature didn’t work fully in my version of Postgres, v.8.1.4.

Sad to say, I haven’t compiled a pgsql version for 13 months. I got so lazy that the last time I installed on my dev machine I just used the package that came with Kubuntu. Now I’m in a bind. I have a production machine and a dev machine to backup, upgrade and restore.

I learned one important lesson in this and that is to name your postgres installation directory the same as the current version. That kind of goes against my best judgement but it means that you can run two Postgres instances beside each other. Why would you want to do that? Take this line from the upgrade instructions:
Today I got caught up in PostgreSQL. A feature didn’t work as planned and when the mailing list responded to my plea for help it turned out the feature didn’t work fully in my version of Postgres, v.8.1.4.

Sad to say, I haven’t compiled a pgsql version for 13 months. I got so lazy that the last time I installed on my dev machine I just used the package that came with Kubuntu. Now I’m in a bind. I have a production machine and a dev machine to backup, upgrade and restore.

I learned one important lesson and that is to name your postgres installation directory the same as the current version. That kind of goes against my best judgement but it means that you can run two Postgres instances beside each other. Why would you want to do that? Take this line from the upgrade instructions:

To back up your database installation, type:

pg_dumpall > outputfile

To make the backup, you can use the pg_dumpall command from the version you are currently running. For best results … try to use the pg_dumpall command from [your new version of] PostgreSQL … since this version contains bug fixes and improvements over older versions. While this advice might seem idiosyncratic since you haven’t installed the new version yet, it is advisable to follow it if you plan to install the new version in parallel with the old version. In that case you can complete the installation normally and transfer the data later. This will also decrease the downtime.

Try to backup the existing database from the new version, which hasn’t been installed and can’t be installed until the old is backed up. That is, use the binary from your newly compiled version and run it on the old backend.

Seems like a problem, but if you run the new version in parallel you have a couple of benefits as I see it.

1. You don’t have to take your database down during upgrade! Once you are sure things are working on your new version then you simply flick the switch and it is nearly transparent. Don’t be like the company down the block who took their site down for 2 days to perform a db upgrade 🙂

2. If things go horribly wrong on your upgrade then you should still have the existing data, database and files to work from and find out the problem.

Keep these things in mind:

1. Change the install path by specifying it with the –prefix switch during ./configure Normally it installs to /usr/local/pgsql but because I already have a server installed and running there I chose to specify –prefix=/usr/local/pgsql-8.2.3

2. You can literally run it in parallel but change the port on the new server. In this fashion you can actually pipe the output of pg_dumpall to the input of the new server eliminating the SQL plain text file usually produced.

Make sure you block write access to the old version while performing the dump. When you are sure that everything dumped correctly you can shutdown the old server and start the new one on the old port.

There you have it. A nearly seamless database upgrade. A couple good reasons for running your new Postgres database upgrade in parallel with your current version.