Orange is my favorite color

In my previous post, I detailed my desire to convert from char(35) to Postgres’ native uuid datatype for storing ColdFusion UUIDs. I ran into a handful of problems and Andrew shared this tidbit of Postgres SQL that I haven’t used before:

ALTER TABLE uuidtest ALTER COLUMN uidTest TYPE uuid USING uidTest::uuid;

I ran this on my test table and sure enough it worked! I got the following data back when I selected all rows from the table:

"c5f25cc3-1d72-822b-7957-06e5a2c8f6b1"
"a5f01cc3-1d72-822b-7957-06e5a2c8f6b1"
"d6f25cc3-1d72-822b-7957-06e5a2c8f6b1"
"c5fc2cc3-1d72-822b-7957-06e5a2c8f6b1"
"e5f25cc3-1d72-822b-7957-06e5a2c8f6b1"

And these values passed the isValid(‘guid’, uidTest) test with flying colors – conversion complete! On this table at least. Since my production application actually has 75 tables, the next test was to try it with foreign keys:

CREATE TABLE uuidtest (
	uidTest CHAR(35) NOT NULL PRIMARY KEY
) WITHOUT OIDS;

INSERT INTO uuidtest VALUES ('C5F25CC3-1D72-822B-795706E5A2C8F6B1');
INSERT INTO uuidtest VALUES ('A5F01CC3-1D72-822B-795706E5A2C8F6B1');
INSERT INTO uuidtest VALUES ('D6F25CC3-1D72-822B-795706E5A2C8F6B1');
INSERT INTO uuidtest VALUES ('C5FC2CC3-1D72-822B-795706E5A2C8F6B1');
INSERT INTO uuidtest VALUES ('E5F25CC3-1D72-822B-795706E5A2C8F6B1');

CREATE TABLE uuidtest_ref (
	uidTest CHAR(35) NOT NULL REFERENCES uuidtest(uidTest) ON UPDATE CASCADE ON DELETE CASCADE
) WITHOUT OIDS;

INSERT INTO uuidtest_ref VALUES ('C5F25CC3-1D72-822B-795706E5A2C8F6B1');
INSERT INTO uuidtest_ref VALUES ('A5F01CC3-1D72-822B-795706E5A2C8F6B1');

I tried all combinations of transaction blocks, deferred constraints and disabled triggers but no love. All tests resulted in the same error, blammo:

ERROR: foreign key constraint "uuidtest_ref_uidtest_fkey" cannot be implemented
SQL state: 42804
Detail: Key columns "uidtest" and "uidtest" are of incompatible types: character and uuid.

I decided to hit up my local SFPUG mailing list in hopes of a miracle solution but Josh Berkus and Dirk Jagdmann suggested dropping and recreating the foreign keys as part of the transaction. That’s not great, but it’s a whole lot better than the copy-delete-rename dance typically required by these kinds of operations.

Test Case Solution

Based on all of the feedback so far, here is the code for the conversion process. It requires identifying all of the foreign keys tied to char(35) columns in advance:

BEGIN;

-- drop constraints
ALTER TABLE uuidtest_ref DROP CONSTRAINT uuidtest_ref_uidtest_fkey;

-- update all columns
ALTER TABLE uuidtest ALTER COLUMN uidTest TYPE uuid USING CAST(regexp_replace(uidTest, '([A-Z0-9]{4})([A-Z0-9]{12})', E'\\1-\\2') AS uuid);
ALTER TABLE uuidtest_ref ALTER COLUMN uidTest TYPE uuid USING CAST(regexp_replace(uidTest, '([A-Z0-9]{4})([A-Z0-9]{12})', E'\\1-\\2') AS uuid);

-- restore constraints
ALTER TABLE uuidtest_ref ADD CONSTRAINT uuidtest_ref_uidtest_fkey FOREIGN KEY (uidTest) REFERENCES uuidtest(uidTest) ON UPDATE CASCADE ON DELETE CASCADE;

COMMIT;

Production Solution

Ok, we’ve established in our simple test case this will work. Now how do we do this database-wide? There are some ways to automate this but it will still require a bit of cut and paste. Start by dumping out the schema for the database:

pg_dump --format=p --file=schema.sql --schema-only --clean --username=<username> <database name>

The –clean will ensure that we get both DROP and ADD statements for the constraints. Open this file up in a text editor. Starting right at the very top you will have the DROP statements that look like this and cut and paste them into the “drop” portion of your SQL transaction:

ALTER TABLE ONLY public.tbllookupevent DROP CONSTRAINT xpftbllookupeventtype_fk;

Then search ahead for the first instance of “REFERENCES” in the schema.sql file. Start copying and pasting that block of statements into the “restore” portion of your SQL transaction:

ALTER TABLE ONLY tbllookupevent
    ADD CONSTRAINT "$1" FOREIGN KEY (uidclub) REFERENCES tbllookupclub(uidclub)
ON UPDATE CASCADE ON DELETE RESTRICT;

Now the last part you have to worry about is generating all of the actual ALTER COLUMN statements. Ugh. That could be ugly. Except we can use the information_schema tables and some concatenation to spit this out straight from the database:

SELECT 'ALTER TABLE ' || TABLE_NAME || ' ALTER COLUMN ' || COLUMN_NAME || ' TYPE uuid USING CAST(regexp_replace(' || COLUMN_NAME || ', \'([A-Z0-9]{4})([A-Z0-9]{12})\', E\'\\\\1-\\\\2\') AS uuid);' AS SQL
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'public'
AND data_type = 'character'
AND character_maximum_length = 35
AND table_name NOT IN (SELECT table_name FROM information_schema.views)
order by TABLE_NAME, ORDINAL_POSITION

We add a little extra filtering to be sure we don’t include views (which we don’t want to ALTER). The output looks like:

"ALTER TABLE tblmapthemes ALTER COLUMN uidclub TYPE uuid USING CAST(regexp_replace(uidclub, '([A-Z0-9]{4})([A-Z0-9]{12})', E'\\1-\\2') AS uuid);"
"ALTER TABLE tblmapthemes ALTER COLUMN uidtheme TYPE uuid USING CAST(regexp_replace(uidtheme, '([A-Z0-9]{4})([A-Z0-9]{12})', E'\\1-\\2') AS uuid);"
"ALTER TABLE tblmappackages ALTER COLUMN uidpackage TYPE uuid USING CAST(regexp_replace(uidpackage, '([A-Z0-9]{4})([A-Z0-9]{12})', E'\\1-\\2') AS uuid);"

Now take those results and paste them between your DROP and CREATE statements. Be sure you have the BEGIN and COMMIT transaction wrapped around it so if something blows up it will roll back the entire set of modifications. Now let it rip!

I haven’t run this entire transaction yet as I still need to test using GUIDs with Transfer and explore changing my isValid(‘uuid’, …) checks in my application to use the GUID instead. I will also need to write a either a mod_rewrite rule or some Model-Glue controller code to look for and convert legacy 35-character UUIDs into the new 36-character format.

Hope this is interesting – please leave a comment if you attempt this conversion and if you run into any other gotchas!

5 Comments

  1. Andrew said:

    on December 23, 2008 at 7:41 am

    I’ve been thinking about converting CF UUIDs to PG’s UUIDs, and I was wondering if you’d considered the following:

    I only have a couple cf uuid columns in my database, so I was thinking of wrapping any sql that accesses the columns with functions like:

    CREATE OR REPLACE FUNCTION pg_uuid_to_cf_uuid(uuid) RETURNS text AS
    ‘SELECT substring($1::text, 0, 24) || substring($1::text, 25)’
    LANGUAGE ’sql’ IMMUTABLE STRICT;

    CREATE OR REPLACE FUNCTION cf_uuid_to_pg_uuid(text) RETURNS uuid AS
    ‘SELECT (substring($1, 0, 24) || ”-” || substring($1, 24))::uuid’
    LANGUAGE ’sql’ IMMUTABLE STRICT;

    Since I convert all the values at the DB level I can still use CF’s uuid functions…

    Alternatively, I was thinking of creating the equivalent functions in CF, and use the functions in all my cfqueries… but then I couldn’t figure out a nice way to handle the selects that return PGobjects for the uuid columns.

  2. Brian said:

    on December 23, 2008 at 10:00 am

    @Andrew – you could definitely do that although you would have to update all of your queries to use those new functions. If you only have a few UUID fields, you might simply wait for PostgreSQL 8.4 which will accept the CF format on input and then use the function above or my regex to add the hyphen and upper() on output.

    The more I look at it, the more I think of it as switching from UUID to GUIDs. ColdFusion’s UUID generator is terribly slow anyways and using the 36-character UUID/GUID makes the database and the application more compatible with the rest of the Interweb.

    It’s easy to generate GUIDs with ColdFusion and there is an isValid() function for them… those eliminate the barriers in my mind.

  3. Andrew said:

    on December 24, 2008 at 5:24 am

    If you want to use GUIDs in CF then one of the biggest problems is CF’s lack of support for the PG UUID data type. I haven’t figured out a nice way to handle the PGobjects that are returned in a cfquery when you select a UUID column. That’s the main reason that I thought about using the above functions.

  4. brian said:

    on December 24, 2008 at 1:39 pm

    @Andrew – CF handles it implicitly (at least on my default CF8 install). The following code:

    
    <cfquery datasource="dsn" name="uuidtest">
    	SELECT *
    	FROM uuidtest
    </cfquery>
    <cfoutput>#uuidtest.uidTest#: #yesNoFormat(isValid('guid', uuidtest.uidTest))#</cfoutput>
    

    Outputs:

    c5f25cc3-1d72-822b-7957-06e5a2c8f6b1: Yes

    No problems whatsoever – are you seeing something different?

  5. Felecia Berger said:

    on January 9, 2009 at 12:34 pm

    hi
    2xlv4cmgqfwmskfa
    good luck

{ RSS feed for comments on this post}