views: 1501
answers: 4
Hi!

Since I kicked off the process of inserting 7M rows from one table into two others, I've been wondering whether there's a faster way to do this. The process is expected to finish in about an hour, which will make roughly 24 hours of processing in total.

Here's how it goes:

The data from this table

RAW (word VARCHAR2(4000), doc VARCHAR2(4000), count NUMBER);

should find a new home in two other cluster tables T1 and T2

CREATE CLUSTER C1 (word VARCHAR2(4000)) SIZE 200 HASHKEYS 10000000;
CREATE CLUSTER C2 (doc VARCHAR2(4000)) SIZE 200 HASHKEYS 10000000;

T1 (word VARCHAR2(4000), doc VARCHAR2(4000), count NUMBER) CLUSTER C1(word);
T2 (doc VARCHAR2(4000), word VARCHAR2(4000), count NUMBER) CLUSTER C2(doc);

through Java inserts with manual commit like this

stmtT1 = conn.prepareStatement("insert into T1 values(?,?,?)");
stmtT2 = conn.prepareStatement("insert into T2 values(?,?,?)");

rs = stmt.executeQuery("select word, doc, count from RAW");

conn.setAutoCommit(false);

while (rs.next()) {
    word = rs.getString(1);
    doc = rs.getString(2);
    count = rs.getInt(3);

    if (commitCount++==10000) { conn.commit(); commitCount=0; }

    stmtT1.setString(1, word);
    stmtT1.setString(2, doc);
    stmtT1.setInt(3, count);

    stmtT2.setString(1, doc);
    stmtT2.setString(2, word);
    stmtT2.setInt(3,count);

    stmtT1.execute();
    stmtT2.execute();
}

conn.commit();

Any ideas?

+3  A: 

The first thing I'd recommend is to do a simple insert-select statement, and let the database handle all the data movement. Not so useful if you're moving data between two machines, or if you don't have rollback segments large enough to handle the entire query.
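For the tables in the question, a sketch of what that could look like (two statements, one per target table; note the select list is reordered to match T2's column order):

```sql
INSERT INTO t1 (word, doc, count)
SELECT word, doc, count FROM raw;

INSERT INTO t2 (doc, word, count)
SELECT doc, word, count FROM raw;

COMMIT;
```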

The second thing I'd recommend is to learn about the addBatch() method. As your code is written, it makes two round-trips to the database for every row you insert, which adds network overhead.
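A minimal sketch of the batched version of the loop in the question, assuming the same `conn`, `stmtT1`, `stmtT2`, and `rs` (this is a fragment, not a standalone program):

```java
// Queue rows client-side with addBatch(); flush every 10,000 rows so each
// executeBatch() is one round-trip for the whole batch instead of one per row.
int batchCount = 0;
while (rs.next()) {
    String word = rs.getString(1);
    String doc  = rs.getString(2);
    int count   = rs.getInt(3);

    stmtT1.setString(1, word);
    stmtT1.setString(2, doc);
    stmtT1.setInt(3, count);
    stmtT1.addBatch();

    stmtT2.setString(1, doc);
    stmtT2.setString(2, word);
    stmtT2.setInt(3, count);
    stmtT2.addBatch();

    if (++batchCount == 10000) {
        stmtT1.executeBatch();
        stmtT2.executeBatch();
        conn.commit();
        batchCount = 0;
    }
}
// Flush any remaining rows that didn't fill a full batch.
stmtT1.executeBatch();
stmtT2.executeBatch();
conn.commit();
```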

Third, unless the destination tables already hold lots of rows, drop any indexes before your inserts and recreate them afterward. If you leave the indexes in place, they have to be updated for every row, adding to the dirty-block overhead.

And finally: do you need clustered tables? My experience has been that they don't buy you a lot (caveat: that experience was on a single tablespace).

kdgregory
hey thanks for your advice. I'm using Java to monitor the progress of the data transfer (how many rows are left), which a plain INSERT AS SELECT doesn't tell me. I always access the tables the same way, like select * from T1 where word='foo'; I think hash tables are best for that.
chris
Based on past experience with hash-indexed databases (Teradata), I would turn to hash clusters only if you're joining multiple tables on the same key -- and as noted, I didn't see much benefit when I tried that. For your query, a normal B-tree index is probably best.
kdgregory
If you use INSERT INTO .. SELECT FROM syntax, you can query v$session_longops to see how far it has got.
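A sketch of that monitoring query against V$SESSION_LONGOPS, showing only operations still in progress:

```sql
SELECT opname, sofar, totalwork,
       ROUND(100 * sofar / totalwork, 1) AS pct_done
FROM   v$session_longops
WHERE  totalwork > 0
AND    sofar < totalwork;
```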
WW
+1 except that hash clustering used properly is a real saver. One logical read to find all the related records from multiple tables can be an order of magnitude better than the alternatives.
David Aldridge
A: 

Hi,

Unless you have some special reason to handle the data in the application, I would go for a direct INSERT AS SELECT. Using parallel DML can make a tremendous difference.

Also check the INSERT ALL syntax (one read for two writes) if that fits your needs.
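A sketch combining the two ideas, assuming the tables from the question (parallel DML has to be enabled at the session level first, and whether the multitable insert actually runs in parallel depends on your Oracle version and configuration):

```sql
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ PARALLEL */ ALL
  INTO t1 (word, doc, count) VALUES (word, doc, count)
  INTO t2 (doc, word, count) VALUES (doc, word, count)
SELECT word, doc, count FROM raw;

COMMIT;
```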

Unless you have I/O problems, an hour should be more than enough...

Regards

+1  A: 

Well, you can't call a table RAW in Oracle -- it's a reserved word so an ORA-00903 error will be raised.

That aside, you would use:

insert all
  into t1 (word, doc, count) values (word, doc, count)
  into t2 (doc, word, count) values (doc, word, count)
select word, doc, count from RAW
/

"Row-by-row equals slow-by-slow" :)

David Aldridge
A: 

Conceptually similar to addBatch(), you could write a PL/SQL procedure that accepts arrays of (word, doc, count) and performs the inserts on the server side. Like batching, it cuts network round-trips by sending many records in one shot, so you may see faster performance. On the other hand, it is more complicated and brittle: it requires writing PL/SQL on the server and extra array-handling logic on the client. Oracle TechNet has a few examples of this.
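A minimal sketch of the server side, using associative arrays and FORALL; the package name and types here are illustrative, not from the original post:

```sql
CREATE OR REPLACE PACKAGE bulk_load AS
  TYPE vc_tab  IS TABLE OF VARCHAR2(4000) INDEX BY PLS_INTEGER;
  TYPE num_tab IS TABLE OF NUMBER         INDEX BY PLS_INTEGER;

  PROCEDURE insert_rows(p_words vc_tab, p_docs vc_tab, p_counts num_tab);
END bulk_load;
/

CREATE OR REPLACE PACKAGE BODY bulk_load AS
  PROCEDURE insert_rows(p_words vc_tab, p_docs vc_tab, p_counts num_tab) IS
  BEGIN
    -- FORALL hands the whole array to the SQL engine in one context switch
    FORALL i IN 1 .. p_words.COUNT
      INSERT INTO t1 (word, doc, count)
      VALUES (p_words(i), p_docs(i), p_counts(i));

    FORALL i IN 1 .. p_words.COUNT
      INSERT INTO t2 (doc, word, count)
      VALUES (p_docs(i), p_words(i), p_counts(i));
  END insert_rows;
END bulk_load;
/
```

The client would then bind arrays of a few thousand rows per call; with the Oracle JDBC driver that means vendor-specific array binding rather than plain java.sql calls.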

Nicholas