PostgreSQL temporary tables

Nicholas,

Please note that, in Postgres, the default behaviour for temporary tables is that they are not automatically dropped at commit, and their data is preserved on commit (the default is ON COMMIT PRESERVE ROWS). See ON COMMIT.

There are multiple considerations you have to take into account:

Will the temporary table survive the transaction (and persist on disk)? By default it lives until the session ends. If you do not want that, then explicitly DROP the table just before COMMITting, or create the table with the CREATE TEMPORARY TABLE ... ON COMMIT DROP syntax.
While the temporary table is in use, how much of it will fit in memory before overflowing onto disk? See the temp_buffers option in postgresql.conf.
Anything else I should worry about when working often with temp tables? A vacuum is recommended after you have DROPped temporary tables, to clean up any dead tuples from the catalog. With the default settings, autovacuum will do this for you in the background, checking roughly once a minute (autovacuum_naptime).
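To make the first two points concrete, here is a minimal sketch in Python, since the client in question is a Python script. It assumes a DB-API style cursor such as the one psycopg2 provides, already inside an open transaction. The helper name, the table tmp_results, its columns and the 64MB figure are all made up for illustration.

```python
def create_scratch_table(cur, temp_buffers="64MB"):
    """Create a temp table that Postgres drops automatically at COMMIT.

    `cur` is assumed to be a DB-API cursor (e.g. from psycopg2) inside an
    open transaction. Table and column names are hypothetical.
    """
    # temp_buffers can only be changed before the session first touches a
    # temporary table, so set it before creating one.
    cur.execute(
        "SELECT set_config(%s, %s, false)", ("temp_buffers", temp_buffers)
    )
    # ON COMMIT DROP removes the table when the transaction commits,
    # so no explicit DROP is needed.
    cur.execute(
        "CREATE TEMPORARY TABLE tmp_results ("
        " id integer, payload text"
        ") ON COMMIT DROP"
    )
```

With ON COMMIT DROP the table is gone after the commit, so any SELECTs on it have to run inside the same transaction.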

Also, unrelated to your question (but possibly related to your project): keep in mind that, if you have to run queries against a temp table after you have populated it, then it is a good idea to create appropriate indices and issue an ANALYZE on the temp table in question after you're done inserting into it. By default, the cost-based optimizer will assume that a newly created temp table has ~1000 rows, and this may result in poor performance should the temp table actually contain millions of rows.
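A rough sketch of that populate, index, ANALYZE sequence, again assuming a DB-API cursor (psycopg2 style) and a hypothetical temp table tmp_results(id, payload) that already exists in the session:

```python
def load_and_analyze(cur, rows):
    """Fill the temp table, then index it and refresh planner statistics.

    `rows` is a sequence of (id, payload) tuples. The table and index
    names are hypothetical.
    """
    cur.executemany(
        "INSERT INTO tmp_results (id, payload) VALUES (%s, %s)", rows
    )
    # Build the index after the bulk insert rather than before it.
    cur.execute("CREATE INDEX tmp_results_id_idx ON tmp_results (id)")
    # Autovacuum cannot access temp tables, so ANALYZE has to be run by hand
    # for the optimizer to learn the real row count.
    cur.execute("ANALYZE tmp_results")
```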

Cheers, V.

Good stuff. Thx. I actually only used a temp table since I needed to execute two different SELECTs on it (so an Analyse would not be worth it, I fancy). I provided the operations with lots of temp_buffers, yet since TEMP tables were being created and dropped by many python threads, ...

Nicholas Leonard 2009-02-28 16:07:50

postgres was eating up more and more RAM as the script did its job. I found that limiting the number of Python threads (running on a client computer) to a little more than the number of CPU cores gave the best (most efficient and effective) execution times. Thx again for your wisdom Vlad.

Nicholas Leonard 2009-02-28 16:10:37
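For anyone wanting to try the same thread cap, a minimal sketch using the standard library's ThreadPoolExecutor might look like this. The function and parameter names are made up. Note that os.cpu_count() reports the cores of the machine running the script, so if the database lives on another host you would pass that host's core count explicitly.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_jobs(job, items, cores=None, extra_workers=1):
    """Run job(item) for every item on a pool sized just above the core count."""
    if cores is None:
        # Falls back to the local machine's core count (or 1 if unknown).
        cores = os.cpu_count() or 1
    workers = cores + extra_workers
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all jobs and re-raises any worker exception;
        # results come back in the same order as `items`.
        return list(pool.map(job, items))
```

Each job would open its own connection and create and drop its own temp table, so the pool size also caps how many sessions hold temp_buffers at once.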

Even if you only SELECT on the temp table twice, investing a few milliseconds in an index creation + ANALYZE each time you create the temp table could save you tons when/if joining other tables with the temp table. To check, run the queries manually in pgAdmin III and use the "Query/Explain (F7)" function.

vladr 2009-02-28 16:46:48
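If pgAdmin is not at hand, the same check can be done from the client script by prefixing the query with EXPLAIN. A small sketch, assuming a DB-API cursor (psycopg2 style); the helper name is made up:

```python
def explain(cur, query, params=None):
    """Return the plan Postgres would use for `query`, one string per plan line.

    EXPLAIN only plans the query, it does not execute it.
    """
    cur.execute("EXPLAIN " + query, params)
    # EXPLAIN returns one single-column row per line of the plan.
    return [row[0] for row in cur.fetchall()]
```

Running it once before and once after the CREATE INDEX + ANALYZE step shows whether the plan and the estimated row counts actually change.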

Really? Ok, I guess I needed to have someone tell me to try it since it seems counterintuitive (setup costs do not seem to be worth it). Anyway, I thank you and I will try to analyse the ANALYSE next time. I am already seeing the value of TEMP INDEXes though. Yet I wonder if an ANALYSE is really...

Nicholas Leonard 2009-03-01 15:16:09

worth it when the query optimizer has been configured in such a way to "strongly encourage it" to use the INDEX? Thx again Vlad.

Nicholas Leonard 2009-03-01 15:17:17

The ANALYZE overhead is on average 100ms, and you can configure it per-table/column. You absolutely need an ANALYZE so that the optimizer does not make stupid assumptions, such as assuming that a million-row table only contains 100 rows and table-scanning it 10 times... :)

vladr 2009-03-01 22:10:25

In other words, without doing ANALYZE you cannot say you have encouraged the optimizer in one way or another, as the optimizer unconditionally uses the ANALYZE data to make decisions.

vladr 2009-03-02 06:06:47

what about the postgresql.conf variables like enable_seqscan = false?

Nicholas Leonard 2009-03-14 23:14:39

Only use enable_seqscan to debug the CBO. Trying to convince postgres to use an index by setting this is like fixing a tv with a sledgehammer - or a headache with a lobotomy. Plus there's more than "to use an index or not to use an index" in the life of a database...

vladr 2009-03-15 04:28:47

...such as "use a nested loop with index lookup" vs. "use a hash aggregate" (and a hash aggregate might be much faster than index lookup!), questions which can only be answered by analyze statistics, not by some "don't do tablescans" configuration variable

vladr 2009-03-15 04:29:53
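As a debugging aid only, a sketch of how one might compare the optimizer's own plan with the plan it picks when sequential scans are discouraged. It assumes a DB-API cursor (psycopg2 style) inside an open transaction; the helper name is made up.

```python
def compare_plans(cur, query, params=None):
    """Return (default_plan, no_seqscan_plan) for `query`, for debugging only.

    SET LOCAL limits the change to the current transaction, so the setting
    reverts on COMMIT or ROLLBACK instead of leaking into later work.
    """
    def plan():
        cur.execute("EXPLAIN " + query, params)
        return [row[0] for row in cur.fetchall()]

    default_plan = plan()
    # Discourage sequential scans for the remainder of this transaction.
    cur.execute("SET LOCAL enable_seqscan = off")
    no_seqscan_plan = plan()
    return default_plan, no_seqscan_plan
```

If the two plans differ, the cost estimates in them show why the optimizer preferred its own choice, which usually points back at missing or stale ANALYZE statistics rather than at a setting to flip permanently.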
