Hi again,

I am currently cleaning up a table with 2 indexes, 250 million active rows, and at least as many dead rows. I issued VACUUM FULL ANALYSE from my client computer (laptop) against the server. It has been going about its business for the last 3-4 days or so; I am wondering whether it will end anytime soon, for I have much work to do!

The server has a quad-core Xeon 2.66 GHz processor, 12 GB of RAM, and a RAID controller attached to 2 x 10K rpm 146 GB SAS HDs in a RAID 1 configuration; it is running SUSE Linux.

Now, firstly, the VACUUM postmaster process seems to be using only one core. Secondly, I am not seeing a very high ratio of I/O writes to I/O idle time. Thirdly, from calling procinfo, I can see that the VACUUM process spends most of its time (88%) waiting for I/O.

So why isn't it using more cores (through threads) to saturate the RAID controller, i.e. get a high I/O writes to idle ratio? Why is it waiting for I/O if the I/O load isn't high? Why is it not going faster with all these resources at its fingertips? It seems to me that VACUUM could and should be multithreaded, especially when it is working on a huge table and is the only thing running!

Also, is there a way to configure postgresql.conf to make such VACUUMs multithreaded? Can I kill it and still benefit from its partial clean-up? I need to work on that table.

[I am using PostgreSQL 8.1]

Thx again

+1  A: 

You don't say what version of PostgreSQL you are using. Is it possible it is pre-8.0?

I had this exact same scenario. Your best bet:

  • kill the vacuum
  • back up the table with pg_dump -t option
  • drop the table
  • restore the table
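
As a sketch of the steps above (the database name `mydb` and table name `bigtable` are placeholders for your own; run this against your own server):

```shell
# 1. Find and kill the running VACUUM backend (plain SIGTERM, never kill -9)
ps aux | grep postgres      # locate the PID of the VACUUM backend
kill <pid>

# 2. Back up just that one table
pg_dump -t bigtable mydb > bigtable.sql

# 3. Drop it
psql mydb -c 'DROP TABLE bigtable;'

# 4. Restore it -- this rebuilds the table and its indexes compactly
psql mydb < bigtable.sql
```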

If you are using 8.x, look at the autovacuum options. Vacuum is single-threaded; there's nothing you can do to make it use multiple threads.
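
For 8.1 specifically, enabling autovacuum might look roughly like this in postgresql.conf (the values are illustrative, not tuned recommendations; 8.1's autovacuum also requires the row-level stats collector):

```
# postgresql.conf -- illustrative values only
stats_start_collector = on             # required for autovacuum in 8.1
stats_row_level = on                   # required for autovacuum in 8.1
autovacuum = on
autovacuum_naptime = 60                # seconds between autovacuum runs
autovacuum_vacuum_scale_factor = 0.2   # vacuum after ~20% of rows are dead
```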

Mark Harrison
You say to kill the vacuum and then back up the table; what will result from killing the VACUUM? I like your idea of drop and restore. Thx
Nicholas Leonard
Nothing bad happens when the vacuum is killed, you just lose the work done in reclaiming the tablespace so far. We had a job that would automatically kill any vacuums at 8:00am, so that users would not be stuck when they came in. If this happened, we would then dump/restore the next evening.
Mark Harrison
Might be a good idea to set a cron job for vacuuming from now on.
Calyth
+1  A: 

Some quick tips:

  • Run VACUUM FULL VERBOSE so you can see what is going on.
  • Drop all indexes before the VACUUM. It's faster to rebuild them than to vacuum them. You also need to rebuild them now and then anyway, because VACUUM FULL isn't good enough (especially on a PostgreSQL as old as 8.1).
  • Set the maintenance_work_mem really high.
  • Use a newer PostgreSQL. Btw, 8.4 will have a huge improvement in vacuuming.
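
Taken together, the tips above might look like this (a sketch only: `bigtable`, the index names, and the column lists are placeholders; note that 8.1 predates memory-unit suffixes, so maintenance_work_mem is given in kB):

```sql
-- Per-session; with 12 GB of RAM you can afford a large value
SET maintenance_work_mem = 1048576;  -- in kB, i.e. ~1 GB

-- Dropping indexes first: rebuilding is faster than vacuuming them
DROP INDEX bigtable_idx1;
DROP INDEX bigtable_idx2;

VACUUM FULL VERBOSE ANALYZE bigtable;

CREATE INDEX bigtable_idx1 ON bigtable (col1);
CREATE INDEX bigtable_idx2 ON bigtable (col2);
```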

An alternative to VACUUM is to dump and restore.

Martin Torhage
is pg 8.4 multithreaded?
Nicholas Leonard
A: 

Are you sure you don't have anything ongoing that could lock tables and prevent the vacuum from proceeding?
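
One way to check, using the 8.1-era system views (column names like procpid changed in later releases):

```sql
-- What every backend is doing right now
SELECT procpid, usename, current_query, query_start
FROM pg_stat_activity
ORDER BY query_start;

-- Any lock requests that have not been granted (i.e. someone is waiting)
SELECT l.pid, l.relation::regclass, l.mode, l.granted
FROM pg_locks l
WHERE NOT l.granted;
```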

(Anyway, it's best to use vacuum_cost_delay so that vacuum is not disruptive to production.)
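
For example (illustrative values only; these can go in postgresql.conf or be SET per session):

```
vacuum_cost_delay = 20    # ms to sleep each time the cost limit is reached
vacuum_cost_limit = 200   # accumulated I/O cost before sleeping (default)
```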