(apologies to W.B. Yeats)
The other day Hubert Lubaczewski (a.k.a. depesz) blogged about pg_reorg. My PostgreSQL Experts Inc. colleagues have been using it too, and they also like it. This is a utility that performs a cluster operation, i.e. it reorganizes the table, but without requiring an exclusive lock on the table for the whole time it's running. That means many users who cannot use CLUSTER at all today will be able to get the benefit, and many who can only use it during scheduled outages will be able to perform this operation much more frequently. That's a huge win.
I've been playing with it and it seems to be everything it's promised to be.
I dropped a note to Itagaki Takahiro a day or so ago asking him about making an updated release, and he quickly responded by releasing version 1.1.7, which contains a fix for the "dropped columns" problem. Many thanks to him for that.
There are one or two things that could be improved. For example, I'd like to see it able to fall back to the primary key if there is no clustered index, in effect performing "ALTER TABLE foo CLUSTER ON pkindex" for every affected table that has a primary key but no clustered index. Since pg_reorg already needs a short exclusive lock on the table in order to install the trigger at the start of operations on the table, this should impose no additional lock burden. In the meantime I've cooked up a quick recipe published on GitHub to do the job.
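For readers who want a feel for what such a fallback involves, here is a minimal sketch in Python. It is not the recipe mentioned above, just one way the idea might be expressed. The names `CANDIDATES_SQL`, `quote_ident` and `cluster_on_pk_statements` are made up for illustration. The query relies on the `indisprimary` and `indisclustered` columns of the `pg_index` catalog, and running it against a live database is left to whatever driver you prefer.

```python
# Hypothetical sketch: find tables with a primary key but no clustered index,
# then build the matching "ALTER TABLE ... CLUSTER ON ..." statements.

# Catalog query (to be run with the driver of your choice). It returns one
# row of (schema, table, primary key index) per candidate table.
CANDIDATES_SQL = """
SELECT n.nspname, c.relname, i.relname
FROM pg_index x
JOIN pg_class c ON c.oid = x.indrelid
JOIN pg_class i ON i.oid = x.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE x.indisprimary
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
        SELECT 1 FROM pg_index y
        WHERE y.indrelid = x.indrelid AND y.indisclustered)
"""


def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def cluster_on_pk_statements(rows):
    """Turn (schema, table, pk_index) rows into ALTER TABLE statements."""
    return [
        "ALTER TABLE {}.{} CLUSTER ON {};".format(
            quote_ident(schema), quote_ident(table), quote_ident(index)
        )
        for schema, table, index in rows
    ]


if __name__ == "__main__":
    # Example rows as the query above might return them.
    for stmt in cluster_on_pk_statements([("public", "foo", "foo_pkey")]):
        print(stmt)
```

The statements are only generated, not executed, so they can be reviewed before being applied.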
Also, I'm a bit concerned about the fact that when it needs to swap the new table into place it will start cancelling queries and backends with conflicting locks. I'd like to see options for it to back off and retry, or fail, if it can't get the lock at the end. I haven't looked into it, so that might not be possible. But if it's not, users will still need to be very careful about when they run it if they might encounter long-running conflicting operations.
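To make the "back off and retry, or fail" idea concrete, here is a rough sketch of the control flow I have in mind. This is not how pg_reorg behaves, and `acquire_with_backoff` and `try_lock` are hypothetical names. I'm assuming `try_lock` is some callable that attempts the lock without waiting (for example by issuing `LOCK TABLE ... NOWAIT` and catching the failure) and returns True or False.

```python
import time


def acquire_with_backoff(try_lock, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Try to take a lock a few times, waiting longer after each failure.

    try_lock: hypothetical callable returning True if the lock was obtained.
    Returns True on success, False if every attempt failed.
    """
    for attempt in range(attempts):
        if try_lock():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return False
```

A caller that gets False back could then give up cleanly rather than cancel other people's queries.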
All in all, though, I can tell I'm going to find this very useful. Kudos to the authors for a clever piece of work.