Hi All,

I'd like to move some data from one table to another (with a possibly different schema). The straightforward solution that comes to mind is:

START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
INSERT INTO dest_table SELECT data FROM orig_table, other_tables WHERE <condition>;
DELETE FROM orig_table USING other_tables WHERE <condition>;
COMMIT;

Now what if the amount of data is rather big, and the <condition> is expensive to compute? In PostgreSQL, a RULE or a stored procedure can be used to delete the data on the fly, evaluating the condition only once. Which solution is better? Are there other options?

A: 

You might dump the table data to a file, then load it into the other table using COPY. Usually COPY is faster than INSERT.
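
A rough sketch of that approach; the file path is a made-up placeholder, and <condition>/other_tables are as in the question. (Server-side COPY to a file needs superuser rights; psql's \copy is the client-side variant.)

-- dump the matching rows to a file, then load them into the destination
COPY (SELECT data FROM orig_table, other_tables WHERE <condition>) TO '/tmp/orig_dump.csv' WITH CSV;
COPY dest_table FROM '/tmp/orig_dump.csv' WITH CSV;

You would still need the DELETE from orig_table afterwards.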

pcent
I've made some tests processing large amounts of data using triggers, row by row, and using a stored procedure with a single transaction. The stored procedure approach was faster.
pcent
You should also fine-tune your PostgreSQL server to improve performance. Read: http://wiki.postgresql.org/wiki/Performance_Optimization
pcent
Yeah, I think that guideline should be qualified to say that one COPY is faster than a set of INSERT statements, one per row. For copying data around within the database, I would think INSERT ... SELECT is optimal, since the data never passes outside the executor.
araqnid
COPY is going to be faster than INSERT for external data. The OP is working with data that is already in the database, so an INSERT is going to be faster than exporting and then copying back in.
Scott Bailey
+5  A: 

If the condition is so complicated that you don't want to execute it twice (which BTW sounds unlikely to me, but anyway), one possibility would be to ALTER TABLE ... ADD COLUMN on the original table to add a boolean field, and run an UPDATE on the table to set that field to true WHERE <condition>. Then your INSERT and DELETE commands can simply check this column in their WHERE clauses.

Don't forget to drop the helper column afterwards -- from the destination table too, if it was copied across!
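
A minimal sketch of this approach; the column name move_me is made up, and <condition>/other_tables are as in the question:

-- flag the rows once, then reuse the flag for both statements
ALTER TABLE orig_table ADD COLUMN move_me boolean;
UPDATE orig_table SET move_me = true FROM other_tables WHERE <condition>;
INSERT INTO dest_table SELECT data FROM orig_table WHERE move_me;
DELETE FROM orig_table WHERE move_me;
ALTER TABLE orig_table DROP COLUMN move_me;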

Hmm, even less intrusive would be to create a new temporary table whose only purpose is to contain the PKs of the records you want to move. First INSERT into this table to "define" the set of rows to operate on, and then join with it for the table-copying INSERT and the DELETE. These joins will be fast, since table PKs are indexed.
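
For example (a sketch that assumes a single primary-key column id on orig_table; a composite key works the same way, just with more join columns):

-- collect the PKs of the rows to move, then join against them
CREATE TEMP TABLE rows_to_move AS
  SELECT o.id FROM orig_table o, other_tables WHERE <condition>;
INSERT INTO dest_table SELECT o.data FROM orig_table o JOIN rows_to_move USING (id);
DELETE FROM orig_table USING rows_to_move WHERE orig_table.id = rows_to_move.id;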


[EDIT] Scott Bailey's suggestion in the comments is obviously the right way to do this, wish I'd thought of it myself! Assuming all the original table's PK fields will be present in the destination table, there's no need for a temporary table -- just use the complex WHERE conditions to insert into the destination, then DELETE from the original table by joining to this table. I feel stupid for suggesting a separate table now! :)
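
Something like this, again assuming a PK column id that exists in both tables:

-- evaluate <condition> only once, during the INSERT
INSERT INTO dest_table SELECT data FROM orig_table, other_tables WHERE <condition>;
-- then delete exactly the rows that just landed in the destination
DELETE FROM orig_table USING dest_table WHERE orig_table.id = dest_table.id;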

j_random_hacker
The temp table gets my vote. Updating rows and then deleting them means creating a lot of garbage in the heap, as well as requiring a change to the table schema (not that that really matters).
araqnid
+1 for the temp table for PKs.
rfusca
You won't need the temp table, or to do the expensive calculation twice. Do the calculation once as you insert into the new table, then delete from the old table where the record is in the new table.
Scott Bailey
The destination table will have plenty of data as well, so this DELETE statement is potentially expensive. Your idea is good, but I'm still looking for something faster.
IggShaman
@IggShaman: Although I wouldn't rule it out, I can't see how anything could be much faster, short of writing a C extension that somehow rewires the existing rows into the new table at the disk level (which is probably impossible anyway). BTW, if your destination table has an index that includes all the PK fields of the source table, PostgreSQL will just read the index instead of the entire table.
j_random_hacker