ansaurus

Question

How to efficiently delete rows from a Postgresql 8.1 table?

Answer 1

A:

Why can't you delete the rows in the first place instead of adding them to the EmployeesToDelete table?

Or if you need to undo, just add a "deleted" flag to Employees, so you can reverse the deletion, or make in permanent, all in one table?

Ben Alpert 2009-04-22 15:42:20

Thanks for your response Ben.I need to store the list of "Employees" I plan to delete in a separate temp table because the logic to generate the list is quite complex, and I need to use it to determine what records I should delete from various other tables which also depend on the "Employees" table.

Jin Kim 2009-04-22 15:46:03

Also, I don't have the authority to modify the database schema at this point, as it would incur additional testing cycles from QA.

Jin Kim 2009-04-22 15:47:41

Answer 2

+1 A:

I'm not sure about the DELETE FROM ... USING syntax, but generally, a subquery should logically be the same thing as an INNER JOIN anyway. The database query optimizer should be capable (and this is just a guess) of executing the same query plan for both.

matt b 2009-04-22 15:42:50

Answer 3

+1 A:

I'm wondering if the following works and is more efficient?

    DELETE
    FROM    Employees e
    USING   EmployeesToDelete ed
    WHERE   id = ed.employee_id;

This totally depend on your index selectivity.

PostgreSQL tends to employ MERGE IN JOIN for IN predicates, which has stable execution time.

It's not affected by how many rows satisfy this condition, provided that you already have an ordered resultset.

An ordered resultset requires either a sort operation or an index. Full index traversal is very inefficient in PostgreSQL compared to SEQ SCAN.

The JOIN predicate, on the other hand, may benefit from using NESTED LOOPS if your index is very selective, and from using HASH JOIN is it's inselective.

PostgreSQL should select the right one by estimating the row count.

Since you have 30k rows against 260K rows, I expect HASH JOIN to be more efficient, and you should try to build a plan on a DELETE ... USING query.

To make sure, please post execution plan for both queries.

Quassnoi 2009-04-22 15:43:35

Answer 4

+3 A:

Don't guess, measure. Try the various methods and see which one is the shortest to execute. Also, use EXPLAIN to know what PostgreSQL will do and see where you can optimize. Very few PostgreSQL users are able to guess correctly the fastest query...

bortzmeyer 2009-04-22 16:24:37

ansaurus

tags:

views:

answers:

How to efficiently delete rows from a Postgresql 8.1 table?

related questions