Edit: Please answer one of the two questions I ask. I know there are other options that would be better in a different case. These other potential options (partitioning the table, running it as one large delete statement without committing in batches, etc.) are NOT options in my case due to things outside my control.

I have several very large tables to delete from. All have the same foreign key that is indexed. I need to delete certain records from all tables.

table source
  id --primary_key
  import_source --used for choosing the ids to delete

table t1
  id --foreign key
  --other fields

table t2
  id --foreign key
  --different other fields

Usually when doing a delete like this, I'll put together a loop to step through all the ids:

declare
  my_counter integer := 0;
begin
  for cur in (
    select id from source where import_source = 'bad.txt'
  ) loop
    delete from source where id = cur.id;
    delete from t1 where id = cur.id;
    delete from t2 where id = cur.id;
    my_counter := my_counter + 1;
    if my_counter > 500 then
      my_counter := 0;
      commit;
    end if;
  end loop;
  commit;
end;

However, in some code I saw elsewhere, it was put together in separate loops, one for each delete.

declare
  type import_ids is table of integer index by pls_integer;
  my_import_ids import_ids;
  my_count integer := 0;
begin
  select id bulk collect into my_import_ids
    from source where import_source = 'bad.txt';

  for h in 1..my_import_ids.count loop
    delete from t1 where id = my_import_ids(h);
    --do commit check
  end loop;
  for h in 1..my_import_ids.count loop
    delete from t2 where id = my_import_ids(h);
    --do commit check
  end loop;
end;

The --do commit check placeholder would be replaced with the same commit-every-500-rows chunk as in the first block above.
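
Spelled out, that placeholder is just the counter logic from the first block (using the my_count variable declared above):

my_count := my_count + 1;
if my_count > 500 then
  my_count := 0;
  commit;
end if;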

So I need one of the following answered:

1) Which of these is better?

2) How can I find out which is better for my particular case? (IE if it depends on how many tables I have, how big they are, etc)

Edit:

I must do this in a loop due to the size of these tables. I will be deleting thousands of records from tables with hundreds of millions of records. This is happening on a system that can't afford to have the tables locked for that long.

EDIT:

NOTE: I am required to commit in batches. The amount of data is too large to do in one transaction; the rollback generated would crash our database.

If there is a way to commit in batches other than looping, I'd be willing to hear it. Otherwise, don't bother saying that I shouldn't use a loop...

+7  A: 

Why loop at all?

delete from t1 where id IN (select id from source where import_source = 'bad.txt');
delete from t2 where id IN (select id from source where import_source = 'bad.txt');
delete from source where import_source = 'bad.txt';

That's standard SQL. I don't know Oracle specifically, but many DBMSes also feature multi-table JOIN-based DELETEs that would let you do the whole thing in a single statement.

Larry Lustig
+1 and a Coke for beating me for a couple of secs.
Otávio Décio
Not possible. Due to how large the tables are I *must* do this in a loop, committing as I go.
David Oneill
I'm not an Oracle person so I can't dispute your DBA's statement, but I find it hard to believe that Oracle doesn't provide some kind of setting to commit those DELETEs as the command executes. It's nuts that you can't issue such a simple SQL command against a major SQL database.
Larry Lustig
Hmm. I'll ask our DBA about that.
David Oneill
+1 Large DELETE are not a problem, just size your rollback segments appropriately
Vincent Malgrat
Large deletes are a problem. Our DBA has forbidden them. I am going to ask him whether there is a setting that commits each step (as Larry suggested a few comments back), but otherwise this is not possible.
David Oneill
A complicating factor is not only the rollback for the delete, but the rollbacks for all the inserts and updates that happen WHILE THE DELETE IS RUNNING. Unless I hear otherwise from our DBA, one large delete is NOT an option.
David Oneill
Oracle does indeed support deletes against an inline view as long as it is key-preserved.
David Aldridge
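
As a sketch of what that inline-view form looks like (source.id is the primary key, so the join should be key-preserved with respect to t1; a similar statement would handle t2):

-- delete the matching t1 rows through a key-preserved join view
delete from (
  select t1.id
    from t1
    join source s on s.id = t1.id
   where s.import_source = 'bad.txt'
);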
+1  A: 

First of all, you shouldn't commit in the loop - it is not efficient (generates lots of redo) and if some error occurs, you can't roll back.

As mentioned in previous answers, you should issue single deletes, or, if you are deleting most of the records, it may be more efficient to create new tables with the remaining rows, drop the old ones, and rename the new ones to the old names.

Something like this:

CREATE TABLE new_table AS SELECT * FROM old_table WHERE <filter only remaining rows>;

-- recreate indexes on new_table
-- re-issue grants on new_table
-- add constraints on new_table
-- etc. on new_table

DROP TABLE old_table;
RENAME new_table TO old_table;
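
Applied to this case, the filter on one of the big tables might look like this (a sketch; t1_new is an illustrative name):

-- keep only the t1 rows that are NOT flagged by the bad import
CREATE TABLE t1_new AS
SELECT t1.*
  FROM t1
 WHERE NOT EXISTS (SELECT 1
                     FROM source s
                    WHERE s.id = t1.id
                      AND s.import_source = 'bad.txt');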

See also Ask Tom

Majkel
I *must* commit in a loop. This is an order from our DBA: if I don't, the rollback information will be too large and will crash our system. I am deleting a small minority of the records, so the create-new-and-drop approach is not feasible.
David Oneill
Committing in a loop can give you ORA-01555 errors, not to mention poorer performance. Your DBA should rather increase the rollback segment size.
Majkel
Not feasible. I expect this delete to take hours to run. We can't have tables on our production database locked for that long.
David Oneill
With the constraints you mentioned, I think you will be best off with egorius's solution. It does what you want while minimizing the number of context switches. I would recommend testing for the maximum number of operations between commits - with that many rows to delete, it should give a measurable performance boost (less redo means fewer writes to the redo log, among other things).
Majkel
+1  A: 

Larry Lustig is right that you don't need a loop. Nonetheless there may be some benefit in doing the delete in smaller chunks. Here PL/SQL bulk binds can improve speed greatly:

declare
  type import_ids is table of integer index by pls_integer;
  my_import_ids import_ids;
begin
  select id bulk collect into my_import_ids
    from source where import_source = 'bad.txt';

  forall h in 1..my_import_ids.count
    delete from t1 where id = my_import_ids(h);
  forall h in 1..my_import_ids.count
    delete from t2 where id = my_import_ids(h);
end;

The way I wrote it, it does it all at once, in which case, yes, the single SQL statement is better. But you can change your loop conditions to break it into chunks. The key points are:

  • don't commit on every row. If anything, commit only every N rows.
  • When using chunks of N, don't run the delete in an ordinary loop. Use forall to run the delete as a bulk bind, which is much faster.

The reason, aside from the overhead of commits, is that each time you execute an SQL statement inside PL/SQL code it essentially does a context switch. Bulk binds avoid that.
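
One way to break it into chunks while keeping forall, sketched under the assumption of 500 ids per commit (collect all the ids once, then delete and commit a slice of the array at a time):

declare
  type import_ids is table of integer index by pls_integer;
  my_import_ids import_ids;
  lo pls_integer;
  hi pls_integer;
begin
  select id bulk collect into my_import_ids
    from source where import_source = 'bad.txt';

  lo := 1;
  while lo <= my_import_ids.count loop
    hi := lo + 499;
    if hi > my_import_ids.count then
      hi := my_import_ids.count;
    end if;
    -- one bulk-bound delete per table per chunk, then one commit
    forall h in lo..hi
      delete from t1 where id = my_import_ids(h);
    forall h in lo..hi
      delete from t2 where id = my_import_ids(h);
    commit;
    lo := hi + 1;
  end loop;
end;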

Dan
Your first point: yes, I commit every 500 or so rows.
David Oneill
Second point: can you provide more details on how forall works? How can your query be rewritten to use forall but still commit in batches?
David Oneill
A: 

You may try partitioning anyway to use parallel execution, not just to drop one partition. The Oracle documentation may prove useful in setting this up. Each partition would use its own rollback segment in this case.

Adam Hawkes
Partitioning the t1, t2, etc tables is not an option due to considerations outside my control.
David Oneill
Well, then I answer like this: 1. Both your solutions are optimal for your circumstances. 2. You can determine which is better by testing: do both in a simulated environment and see what happens.
Adam Hawkes
+6  A: 

David, if you insist on committing, you can use the following code:

declare
  type import_ids is table of integer index by pls_integer;
  my_import_ids import_ids;
  cursor c is select id from source where import_source = 'bad.txt';
begin
  open c;
  loop
    fetch c bulk collect into my_import_ids limit 500;
    forall h in 1..my_import_ids.count
      delete from t1 where id = my_import_ids(h);
    forall h in 1..my_import_ids.count
      delete from t2 where id = my_import_ids(h);
    commit;
    exit when c%notfound;
  end loop;
  close c;
end;

This program fetches ids in batches of 500 rows, deleting and committing each batch. It should be much faster than row-by-row processing, because bulk collect and forall each work as a single operation (a single round trip to and from the database), thus minimizing the number of context switches. See Bulk Binds, Forall, Bulk Collect for details.

egorius
With the commit inside the cursor loop, I wonder whether it might be best to introduce an ORDER BY into the cursor select in order to make sure that all values have been read from the source table prior to committing, hopefully reducing the chances of a "snapshot too old" error. You'd want to check the execution plan to make sure that a sort is being performed, of course.
David Aldridge
David Aldridge: Hmm, do we need to delete rows from the "source" table as well? I overlooked it. But ORDER BY what? We'd like to sort rows so that we read them block by block, never returning to a previous block, right? I think a FULL SCAN will do that.
egorius
A: 

If you are doing the delete from the source before the t1/t2 deletes, that suggests you don't have referential integrity constraints (as otherwise you'd get errors saying child records exist).

I'd go for creating the constraint with ON DELETE CASCADE. Then a simple

DECLARE
  v_cnt NUMBER := 1;
BEGIN
  WHILE v_cnt > 0 LOOP
   DELETE FROM source WHERE import_source = 'bad.txt' and rownum < 5000;
   v_cnt := SQL%ROWCOUNT;
   COMMIT;
  END LOOP;
END;

The child records would get deleted automatically.
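
A sketch of adding such constraints, with illustrative constraint names (this assumes t1.id and t2.id reference source.id):

-- constraint names here are illustrative
ALTER TABLE t1 ADD CONSTRAINT t1_source_fk
  FOREIGN KEY (id) REFERENCES source (id)
  ON DELETE CASCADE;

ALTER TABLE t2 ADD CONSTRAINT t2_source_fk
  FOREIGN KEY (id) REFERENCES source (id)
  ON DELETE CASCADE;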

If you can't have the ON DELETE CASCADE, I'd go with a GLOBAL TEMPORARY TABLE with ON COMMIT DELETE ROWS:

DECLARE
  v_cnt NUMBER := 1;
BEGIN
  WHILE v_cnt > 0 LOOP
   INSERT INTO temp (id)
   SELECT id FROM source WHERE import_source = 'bad.txt' and rownum < 5000;
   v_cnt := SQL%ROWCOUNT;
   DELETE FROM t1 WHERE id IN (SELECT id FROM temp);
   DELETE FROM t2 WHERE id IN (SELECT id FROM temp);
   DELETE FROM source WHERE id IN (SELECT id FROM temp);
   COMMIT;
  END LOOP;
END;
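
This assumes a temporary table along these lines (a sketch; the ON COMMIT DELETE ROWS clause clears it at each commit):

-- scratch table holding one batch of ids; rows vanish at each COMMIT
CREATE GLOBAL TEMPORARY TABLE temp (id INTEGER)
ON COMMIT DELETE ROWS;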

I'd also go for the largest chunk your DBA will allow. I'd expect each transaction to last for at least a minute. More frequent commits would be a waste.

"This is happening on a system that can't afford to have the tables locked for that long."

Oracle doesn't lock tables, only rows. I'm assuming no-one will be locking the rows you are deleting (or at least not for long). So locking is not an issue.

Gary