views: 73 | answers: 3

I have a very large table (5 million records). I'm trying to obfuscate the table's VARCHAR2 columns with random alphanumerics for every record in the table. My procedure executes successfully on smaller datasets, but it will eventually be used on a remote DB whose settings I can't control, so I'd like to EXECUTE the UPDATE statement in batches to avoid running out of undo space.

Is there some kind of option I can enable, or a standard way to do the update in chunks?

I'll add that there won't be any distinguishing features of the records that haven't been obfuscated, so my one thought of using ROWNUM in a loop won't work (I think).

A: 

I do this by mapping the primary key to an integer (mod n), and then performing the update for each x, where 0 <= x < n.

For example, maybe you are unlucky and the primary key is a string. You can hash it with your favorite hash function, and break it into three partitions:

UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=0
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=1
UPDATE myTable SET a=doMyUpdate(a) WHERE MOD(ORA_HASH(ID), 3)=2

You may have more partitions, and may want to put this into a loop (with some commits).
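
For example, a minimal, untested sketch of such a loop, assuming the same doMyUpdate function and 16 buckets (tune to taste):

declare
  c_buckets constant pls_integer := 16;  -- number of hash buckets; adjust to control batch size
begin
  for i in 0 .. c_buckets - 1 loop
    update myTable
       set a = doMyUpdate(a)
     where mod(ora_hash(ID), c_buckets) = i;
    commit;  -- release undo after each batch
  end loop;
end;
/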

Kyle Lahnakoski
One should use the NTILE() analytic function if one wants evenly sized sets; ORA_HASH can have unpredictable values, especially when using a value that isn't a power of 2 for the number of buckets to hash into. ORA_HASH(n, 3) can have 4 values, so your example would have missed updating about 1/4 of the data.
Adam Musch
Adam: Please note I used "MOD(ORA_HASH(ID), 3)", not "ORA_HASH(ID, 3)". I deliberately used MOD because the extra parameters for ORA_HASH are confusing. Thanks for the NTILE() reference. I am still not fully familiar with analytics.
Kyle Lahnakoski
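
For reference, a hedged, untested sketch of how NTILE() could drive evenly sized batches, reusing the hypothetical doMyUpdate function; repeat with bucket = 2 and 3, committing between runs (the bucket assignment stays stable because updates don't change ROWIDs unless row movement occurs):

UPDATE myTable
   SET a = doMyUpdate(a)
 WHERE rowid IN (
         SELECT rid
           FROM (SELECT rowid AS rid,
                        NTILE(3) OVER (ORDER BY rowid) AS bucket
                   FROM myTable)
          WHERE bucket = 1);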
+2  A: 

If you are going to update every row in a table, you are better off doing a Create Table As Select, then dropping/truncating the original table and re-appending the new data. If you've got the partitioning option, you can create your new table as a table with a single partition and simply swap it in with EXCHANGE PARTITION.

Inserts require a LOT less undo, and a direct path insert with NOLOGGING (the /*+ APPEND */ hint) won't generate much redo either.
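
A minimal, untested sketch of that approach, again assuming a doMyUpdate function does the obfuscation (table and column names are illustrative):

-- Copy the table with the obfuscation applied during the copy
CREATE TABLE my_table_tmp NOLOGGING AS
  SELECT doMyUpdate(a) AS a, b, c FROM my_table;

TRUNCATE TABLE my_table;

-- Direct-path insert back into the original table
INSERT /*+ APPEND */ INTO my_table
  SELECT * FROM my_table_tmp;
COMMIT;

DROP TABLE my_table_tmp;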

With either mechanism, there would probably still be 'forensic' evidence of the old values (e.g. preserved in undo or in "available" space allocated to the table due to row movement).

Gary
Don't forget to disable foreign key constraints referencing the table.
Codo
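
For example (a hedged sketch; the constraint and table names are illustrative):

ALTER TABLE child_table DISABLE CONSTRAINT fk_child_my_table;
-- truncate/exchange and reload my_table here
ALTER TABLE child_table ENABLE CONSTRAINT fk_child_my_table;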
A: 

The following is untested, but should work:

declare
  l_fetchsize number := 10000;  -- rows per fetch/update batch
  cursor cur_getrows is
  select rowid, random_function(my_column)
    from my_table;

  type rowid_tbl_type      is table of urowid;
  type my_column_tbl_type  is table of my_table.my_column%type;

  rowid_tbl     rowid_tbl_type;
  my_column_tbl my_column_tbl_type;
begin

  open cur_getrows;
  loop
    fetch cur_getrows bulk collect  
      into rowid_tbl, my_column_tbl 
      limit l_fetchsize;
    exit when rowid_tbl.count = 0;

    forall i in rowid_tbl.first..rowid_tbl.last
      update my_table 
         set my_column = my_column_tbl(i)
       where rowid     = rowid_tbl(i);
    commit;
  end loop;
  close cur_getrows;
end;
/

This isn't optimally efficient -- a single update would be -- but it'll do smaller, user-tunable batches, using ROWID.

Adam Musch