Thursday, May 17. 2012Adding enum labels isn't transactional
Back when we added a facility to add new labels to an enum, some technical considerations made us adopt a restriction that this must not take place within a transaction.block. David Wheeler has been complaining about this lately, including cornering some of us in the pub last night. His chagrin is understandable, since all the other changes in his migration scripts.are transactional and can be rolled back. Unfortunately this isn't something that is easy to overcome. What we can do, and should for 9.3, is to make the label addition command more resilient by adding an "if not exists" option. That way if the label has already been added to the type the command won't generate an error.
Saturday, May 12. 2012Upgradable JSON
If you're using my backport of the 9.2 JSON feature in your 9.1 Postgres, you have a problem if you want to use pg_upgrade when moving to release 9.2: the type has the same name and behaviour in 9.1 as in 9.2, but the OIDs are different and so the upgrade will fail. If you use pg_dump/pg_restore, you should just be able to exclude the restoration of the extension and everything will "just work".
But after a lot of trial and error I have come up with a different way to do things. In this scenario we don't install as an extension, and instead we borrow a bit of magic from pg_upgrade to make sure that the JSON type and its array type have the same OIDs as they will have in PostgreSQL 9.2. We also need to fool pg_upgrade when it runs, as it goes looking for a loadable module called json.so, and of course in 9.,2 there isn't one because JSON is built in. So we need to supply a dummy one. But, given that, this too "just works". There might be some wrinkles I haven't foreseen, but I have done this quite a few times now without mishap. This is checked in on a separate branch of the json_91 repo callled "binup" If you are going to use this backport, I also highly recommend using the backport of the fix for missing column names on row expressions. This requires patching Postgres, but it's worth doing anyway, even if you're not using JSON - notably it affects some uses of hstore as well. You can get this by pulling the port into a git repo set at REL9_1_STABLE by doing: Enjoy.git pull https://bitbucket.org/adunstan/pgdevel.git rowexpr91 Thursday, May 10. 2012Deploying file_text_array Foreign Data wrapper
Today, a moment planned 18 months ago arrived when I was able to deploy the file_text_array foreign data wrapper for the client for whom it was originally designed. It's taken this long for the client to move to the 9.1 release, and they suddenly emailed me today saying they needed the old ragged CSV feature I'd previously provided them. I explained to them that this was meant to replace it, and it took me all of about 5 minutes to deploy, and a further hour or two to adjust and test the code in the ETL tool that is meant to use it. It's all working flawlessly, I'm happy to say.
Tuesday, May 8. 2012Mixing compilers
Regina has been reporting problems with running my Fixed Length Record Foreign Data Wrapper on Windows. I tried it yesterday on my MinGW 64 bit setup and it worked just fine. It turns out she's been trying to run a mingw compiled FDW DLL with a VC++ built Postgres. I'm not sure that we actually expect mixing compilers like that to work. In fact, I'm not sure how to build extension modules for VC++ builds at all. My approach would probably be to build them in tree, possibly modifying the build scripts slightly, but I haven't tried. It would be nice if this were documented better.
Thursday, April 26. 2012Function properties
Some years ago I floated an idea for decorating functions with properties. I forget exactly the context in which it occurred to me, but the suggestion met a very frosty reception indeed, and, discretion being the better part of valour, I didn't pursue it.
But the notion of doing something like this has twice come back to me in the last week or so, once in conjunction with something I was discussing about PLV8, and once in conjunction with a proposal I have been discussing to do with making PLPerl more efficient. The PLPerl issue is easily stated: when it's dealing with records, including records passed to PLPerl as arguments, records returned as results, and records fetched via SPI, the PLPerl glue code turns the records into perl hashrefs, and expects to be handed hashrefs by plperl functions. Now hashrefs are horribly inefficient when handling bulk data, and massively redundant when it comes to sets of database rows, which all have the same field names . Passing data as arrayrefs is much more efficient. But doing this means that the perl code would lose the field names, and would need to know to provide / expect arrayrefs rather than hashrefs. If we made a change like this globally, it would break a mountain of legacy code. So we want to be able to do this on a per function basis. One idea is to create a separate language to handle data this way. It could share a lot of code with the existing language, possibly even sharing interpreters, but we'd know to provide and expect arrayrefs instead of hashrefs for records. But this solution is really a bit ugly and kludgy. So I'm currently playing around with the idea of leveraging an existing capability that we added in 8.3, namely the ability to set a configuration parameter as part of a function's definition. One nice part about this is it wouldn't need to change any existing catalogs etc. The only affected code (I hope) would be in the language's glue code. There are probably a few nasty wrinkles. but I'm hopeful it will work. The idea is we'd do something like: Another linux divide
I haven't been following the debates over a replacement for the old init system. But this just annoys me. As a Postgres consultant I get to work on various systems, and having to learn not one but two or three new ways of starting services on Linux is a pain in the neck.
Monday, April 23. 2012Moving ahead with PLv8
Hitoshi Harada has been doing some work on PLV8, and so have I. Recently I added a facility to run a function on interpreter first call. So if you have a setting "plv8.start_proc" that contains the name of a PLV8 function, that function will be run on first call. The nice thing about this is that you can use it to implement a load_module facility, and in fact that's what the example code in the regression test does. In case you're wondering, load_module can only load code from a table, not from the file system. PLV8 is a trusted language and has no access to resources outside the database. The startup code leverages a facility that Hitoshi provided recently to look up functions by their Postgres names from within the JavaScript code.
After I did that Hitoshi came up with a pretty neat way to solve one of the biggest issues still on the list, namely that all PLV8 code was run in a single context. With his fix, we instead create a separate "global" context for each user id, and the startup code is executed in that context as it's created. This means we can no longer have information leaking across role changes, which has been a security issue that we solved for the core languages not long ago. (At the time of writing this last change hasn't been committed yet, but I'm sure it will be very soon.) The hope is that we can have PLV8 just about production ready by the time pgCon comes around in a few weeks, and at the current rate of progress there's a very good chance we will. Thursday, April 19. 2012On Bugzilla
Years ago, Bugzilla only ran on MySQL, except for RedHat's fork, which ran on Postgres. The effort to merge in RedHat's changes in a maintainable way had stalled badly, and it seemed to me that the direction was probably wrong anyway. I spent a few days working on the code to make the basic schema code database neutral. To get the actual schema to be used for database setup, Bugzilla takes this neutral structure and massages it to work with the database in question, so that, for example, a field with a type of MEDIUMTEXT will have type mediumtext in MySQL, varchar(4000) in Oracle, and text in PostgreSQL and SQLite. (When I did this work, Oracle and SQLite weren't supported, but adding them would have been pretty easy, I believe.) I don't even remember what my original motivation was. At the time Bugzilla was just about the only open source bugtracker out there, certainly by far the most prominent in any case, and I thought it was important to have one that was able to be Postgres backed. A good deal of how it works is explained in the Bugzilla development docs. And that's why I tend to mention Bugzilla when the subject of trackers comes up, as it does every year or so.
Fast forward to the present day, and Tom Lane's recent remark that he dislikes Bugzilla a lot, based on his experience at RedHat, among other reasons because you tend to get lots of instances of the same bug for different releases. So I looked at the code, and sure enough, the field of a bug that identifies the version of the component that a bug applies to is a simple text field. Now it would probably be fairly easy to fix this, and the folks at Bugzilla are very Postgres-friendly. You can't use arrays, but an added bug_component_versions table would probably do the trick, along with code changes in the app, of course. But this is a classic case of a system that's fairly mature carrying with it possibly careless design decisions from long ago that make people hate it today. Bugzilla was designed like many systems as a quick and dirty gadget to meet an immediate need. I suspect it's in major need of a do-over. Monday, April 9. 2012row_to_csv
You can get rows as CSV lines by using the COPY command. This works well, but there are times when it's not convenient. For example, one of my clients uses software where the driver doesn't support COPY mode, unfortunately. We could add support for libpq's COPY functions to the driver, but that would involve quite a bit of work. Another possibility occurred to me, namely that it would be easy to produce rows as CSV via a function. This took only a few hours to produce the other day, based on my earlier work turning rows into json. It's not quite ready for prime time yet, but I wonder if this has more general use than my client's particular case. Here's an example:
The idea is to allow options to the functions similar to COPY's delimiter, null_string, quote and escape options, and possibly force quote as well. That part isn't done yet Sunday, April 1. 2012Cross-compiling PostgreSQL for WIndows
Many of our developers don't run Windows and don't want to. And that's understandable - as a development platform it can be challenging, especially for people working on traditional Unix type projects. But Postgres is widely deployed on Windows, and in some pretty big deployments too (for example, HP's). And it's sufficiently different that it's easy to break things. One of the things that Linux and OSX users can do is to cross-compile for Windows. Of course, you can't run it without running Windows, but just getting over the build hurdle is a start. And it's incredibly easy to do. Just grab the linux cross compiler from Sourceforge. For example, to build 32-bit WIndows binaries on my x86-64 bit Linux machine, I would grab mingw-w32-bin_x86_64-linux_20120327.tar.bz2.
Just unpack it, add the binaries to your path and you're away: [andrew@emma ~]$ sudo mkdir /usr/local/mingw [andrew@emma ~]$ sudo tar -j -C /usr/local/mingw -xf mingw-w32-bin_x86_64-linux_20120327.tar.bz2 [andrew@emma ~]$ mkdir nxcomp [andrew@emma ~]$ cd nxcomp [andrew@emma nxcomp]$ export PATH=/usr/local/mingw/bin:$PATH [andrew@emma nxcomp]$ /path/to/postgresql.source/configure --without-zlib --host=i686-w64-mingw32 --prefix=`pwd`/postgres [andrew@emma nxcomp]$ make [andrew@emma nxcomp]$ make install You can actually zip up and run the resulting postgres on Windows, but there is something not yet right . The libpq appears to have a problem, and there are odd things that happen with contrib too. I'm going to follow those up, and also look at building with third party libraries like zlib, openssl and libxml2. I'll blog about those when I have more news. Thursday, March 29. 2012Looking at the table of contents of a dump.
A while ago I created a little utility to look at the metadata (the "TOC" or Table Of Contents) of a PostgreSQL Custom Format dump. It turns out that, with a tiny adjustment, this can be used to dump the toc.dat file from a Tar or Directory Format dump in a readable format as well. I've committed this change to the utility repository on github. It's quite useful in reviewing the parallel pg_dump patch.
The utility can also be used to dump out all the SQL in separate files. That part needs a bit of polish, but it's a start. Josh Berkus has been using it at one of our customers. Saturday, March 17. 2012GIthub Spam - Urgent action needed
Bruce just complained about getting spam via GitHub following a PostgreSQL commit he made. I've had a look to see if there is a way to disable comments on a repo on GutHub, and I couldn't find it. Neither could Magnus. That's just awful. There's no reason that the Postgres mirror on GitHub should be getting comments. Nobody is going to look at them, anyway. They are nothing more than a spam vehicle. The good folks at GitHub should see to this urgently.
Tuesday, March 6. 2012Text files from a remote source
As one of my customers who uses my old RAGGED CSV patches is getting close to moving to Postgres 9.1, it's time for me to dust off the file_text_array Foreign Data Wrapper I wrote a year or so ago and make sure it's up to date and will work for them. I'm not going to maintain the RAGGED CSV patches for releases later than 9.0, and this lets then move to using vanilla Postgres build rather than ones that are specially prepared for them.
The file_fixed_length FDW needs some love as well. As I mentioned the other day in reply to Peter Van Hardenberg of Heroku, it would be nice to have these not just read from a local text file. In the Heroku context doing that probably doesn't make much sense - a Heroku user probably has no way to put the file on the database server. But it occurred to me that making them work on a URL could well work. By linking the FDWs with libcurl, say, we could have the FDW fetch the contents of some URL and treat it as a CSV, (or Postgres text format or fixed length records). In the case of the file_text_array FDW we'd possibly need to parse it out ourselves, unfortunately, but that's not wholly undoable, and luckily the COPY API lets us do that, as is tested by the file_fixed_length FDW. An alternative would be for the FDW to put the contents in a local file and then have COPY read that. Anyway, I'm about to start polishing these up a bit, and then I'll look at having them fetch from a URL. Could work quite nicely. Saturday, March 3. 2012Heroku, a really easy way to get a database in a hurry.
If you've ever needed a database in a real hurry, Heroku's PostgreSQL offering which runs on top of Amazon, might be just what you need. Once you have registered with your credit card, getting a running database is about three mouse clicks away. That's pretty sweet. You don't need hardware, or a DBA, to get started.
One of the things that's mildly annoying is that there isn't a simple way to start a standalone database instance with the command line tool. I'm told that this is being addressed. There's a workaround, which is to set up a dummy Heroku application (I just followed the instructions to set up a Django app) and then add a database instance to it. I gather that's what the web interface does under the hood. It's slightly hokey, but works just fine. Some things worth noting. You won't be a superuser of the database, just the owner. And they use extension whitelisting so they control what extensions can be loaded. That means you're restricted to functions written in PLPgsql or SQL. No PLPerl, PLPythonu or PLTcl. I imagine that's at least in part because of security concerns. Maybe when we get PLV8 to production quality they will enable it. It has the nice property that it's automatically sandboxed, and people might feel more secure deploying it. Here's the list of whitelisted modules: btree_gist chkpass cube dblink dict_int dict_xsyn earthdistance fuzzystrmatch hstore isn ltree pg_trgm pgcrypto pgrowlocks pgstattuple plpgsql unaccent uuid-osspThat will cover a fairly wide range of apps. I'd like to see them preload autoexplain but with all the settings off, so it could be used on the fly. I think it should be safe enough. If you deploy this service you should add on one of the Heroku PostgreSQL backup plans too. They all seem to be free currently, so there is no reason not to use them. Don't let any of this fool you into thinking that this is a "set and forget" service, any more than any other database installation is. Some of my colleagues and I are starting to get calls for assistance with using it. It's just as possible to write bad queries for this service as for any database instance, and like any instance it can be affected by configuration parameters, which as a non-superuser with only remote access you won't have much control over here. But for many cases this will be a compelling solution to their database needs. Final whine: I do wish Heroku had a "running tab" meter in something close to real time. Apart from that though, they seem to have done a pretty nice job. Monday, February 20. 2012Pretty Printing SQL
Today I committed the changes to improve the way Postgres pretty prints view definitions, which I've blogged about before. I'me very happy about that - it will ameliorate a long-standing annoyance. Pavel Stehule welcomed the change, but asked if it could be extended, so that we could do something like:
That's a nice idea. Given that the logic for pretty printing is there already, it would be nice to be able to use it on an arbitrary query. And it possibly wouldn't be terribly difficult. Worth putting on the list of things to do for 9.3. While we're on the subject of things to do for 9.3, Tom Lane came up with a much better way of getting psql to do binary output than the one I had, and I'm definitely going to work on that for 9.3.
(Page 1 of 18, totaling 267 entries)
» next page
|
My Links etcBlog AdministrationCalendarQuicksearchArchivesCategoriesSyndicate This Blog |
