Is there a way to add a NOT NULL column to a PostgreSQL table without doubling its size on disk? If, for instance, I have a table with some columns defined and I want to add a column, I'd do the following:

alter table my_table add column new_column int;
update my_table set new_column = 0;
alter table my_table alter column new_column set not null;

This in effect doubles the space allocated for the table because of the way PostgreSQL's MVCC works: the update creates a new tuple for every row, and the old tuples are only marked for reuse after the transaction finishes and vacuum does its job. If the table is large (i.e. a few million rows) but growing very slowly or almost constant in size, those dead rows will never be reused, and only a 'vacuum full' or a full database dump and restore will reclaim the space on disk. Is there a way to add a column with some default value automatically, but without this behavior? For instance, if there were a way to lock the table and do the update in place, there would be no need for MVCC in this case.

+1  A: 

do it in steps:

  1. alter table add new column
  2. alter table add default value for column
  3. update, but not the whole table with one update statement; issue it as, say, 10,000 separate updates, each in its own transaction (sketched below)
  4. run vacuum every couple hundred updates, or better, rely on autovacuum
  5. alter table set not null
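
A minimal sketch of those steps in plain SQL, assuming the table and column from the question and a hypothetical integer primary key named id to drive the batches:

-- steps 1 and 2: add the column and its default (cheap, no table rewrite)
alter table my_table add column new_column int;
alter table my_table alter column new_column set default 0;

-- step 3: backfill in batches, each statement in its own transaction,
-- so vacuum can reclaim the dead tuples between batches
update my_table set new_column = 0 where id between 1 and 10000;
update my_table set new_column = 0 where id between 10001 and 20000;
-- ...repeat (or generate from a script) until every row is covered...

-- step 4: vacuum between batches, or let autovacuum handle it
vacuum my_table;

-- step 5: once all rows are populated, add the constraint
alter table my_table alter column new_column set not null;
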
depesz
That sounds OK; I think this will solve my problem, though it'll make the upgrade script a little more complicated than the original 'alter table / update / alter table' three lines. Thanks anyway.
Dragan Matic
A: 

If the table is large (i.e. a few million rows) but growing very slowly or almost constant in size, those dead rows will never be reused, and only a 'vacuum full' or a full database dump and restore will reclaim the space on disk.

That seems fishy. I wonder if your free space map is not large enough: when that happens, Postgres starts losing track of deleted rows and cannot reclaim them except with a vacuum full. If you are running Postgres <= 8.3, have you checked that your max_fsm_pages is large enough to keep track of the deleted rows? To find out, do whatever it takes to get the table into the state where it has dead rows that aren't being reused, and then run a "vacuum full verbose." Postgres will report free space map problems at the end of the vacuum.

max_fsm_pages went away with PostgreSQL 8.4. If you are running 8.4, then never mind.
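
A quick way to run that check, sketched here for an 8.3-era server (max_fsm_pages and the end-of-vacuum summary only exist in Postgres <= 8.3):

-- show the current free space map limit
show max_fsm_pages;

-- database-wide verbose vacuum, as suggested above; the summary at the end
-- reports how many page slots were needed versus what max_fsm_pages allows
vacuum full verbose;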

Wayne Conrad
Yes, we're using Postgres <= 8.3 and max_fsm_pages is large enough; that's not the problem. The main problem we have is that the database grows significantly in size immediately after adding new columns to a few larger tables, and I would like to prevent that.
Dragan Matic