I have a cursor that selects all rows in a table, a little over 500,000 rows. Read a row from the cursor, INSERT it into another table, which has two indexes, neither unique, one numeric, one DATE type. COMMIT. Read the next row from the cursor, INSERT... until the cursor is empty.

All my DATE column's values are the same, taken from a timestamp initialized at the start of the script.

This thing has been running for 24 hours and has only posted 464K rows, a little less than 10K rows/hr.

Oracle 11g, 10 processors (!?). Something has to be wrong. I think it's that DATE index trying to process all these entries with exactly the same value for that column.

+2  A: 

Indexes slow down inserts but speed up queries. This is normal.

If it is a problem you can remove the index, insert the rows, then add the index again. This can be faster if you are doing many inserts at once.

The way you are copying the data using cursors seems to be inefficient. You could try a set-based approach instead:

INSERT INTO table1 (x, y, z)
SELECT x, y, z FROM table2 WHERE ...
Mark Byers
Can't remove the index. Production table, in heavy use, hundreds of millions of rows.
Kelley
Better to mark the index as unusable than to drop and recreate it, as that would avoid unintended consequences from recreating the index with different options.
Adam Musch
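For reference, marking an index unusable and rebuilding it afterwards looks roughly like this (the index name is hypothetical, and it relies on SKIP_UNUSABLE_INDEXES being TRUE, the default in 10g/11g, so the inserts simply skip the unusable index):

ALTER INDEX contact_history_dt_idx UNUSABLE;   -- hypothetical index name
-- ... run the bulk insert here ...
ALTER INDEX contact_history_dt_idx REBUILD;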
It's the half-million rows with the same index value that I'm curious about. I am told either a) this makes no difference, or b) this is terrible and you should never put an index on a date column in Oracle. I'm leaning towards the latter being true based on the observed performance...
Kelley
@Kelley: Have you considered that the slowness of the inserts could be due to the "heavy use" of the server?
Mark Byers
If it was running that slow, every phone in my department would be ringing off the hook. Had the DBA review the system; all other processes are happy. No locks, no bottlenecks, no ugly plan...
Kelley
(b) is untrue - there are lots of good reasons to have an index on a date column. What is bad is having a b-tree index on something where you have 500,000 entries with the same value (relatively low cardinality). That sounds more like a bitmap index, or part of a compound index with better cardinality. That said, on a table of hundreds of millions of rows, 500,000 could still be considered low cardinality.
JulesLt
To add to JulesLt's comment: if the OP changes the column to a bitmap index, then the OP should definitely move to set-based versus row-by-row processing, since bitmap indexes don't handle small changes well.
Shannon Severance
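For illustration only (index names are hypothetical), the two alternatives being discussed would be created along these lines, bearing in mind the caveats above about bitmap indexes and row-by-row changes:

CREATE BITMAP INDEX contact_history_dt_bix ON contact_history (transaction_dt);
-- (the existing b-tree index on that column would have to be dropped first)
-- or a compound b-tree index led by a more selective column:
CREATE INDEX contact_history_sub_dt_idx ON contact_history (subscriber_id, transaction_dt);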
A: 

In general, it's often a good idea to delete the indexes before doing a massive insert and then add them back afterwards, so that the db doesn't have to try to update the indexes with each insert. It's been a long while since I've used Oracle, but have you tried putting more than one insert statement in a transaction? That should also speed it up.

GrandmasterB
+3  A: 

Why don't you just do:

insert into target (columns....)
select columns and computed values
from source;

commit;

?

This slow-by-slow (row-by-row) processing is doing far more damage to performance than an index that may not make any sense.

Shannon Severance
+1  A: 

Committing after every inserted row doesn't make much sense. If you're worried about exceeding undo capacity, for example, you can keep a count of the inserts and issue a commit after every thousand rows.
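
A minimal sketch of that pattern (table and column names are placeholders), assuming the cursor loop is kept:

declare
    l_count pls_integer := 0;
begin
    for r in (select x, y, z from source_table) loop
        insert into target_table (x, y, z) values (r.x, r.y, r.z);
        l_count := l_count + 1;
        if mod(l_count, 1000) = 0 then
            commit;  -- commit every thousand rows
        end if;
    end loop;
    commit;  -- pick up the final partial batch
end;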

Updating the indexes will have some impact, but that's unavoidable if you can't drop (or disable) them while the inserts are performed; that's just how it goes. I'd expect the commits to have a bigger impact, though I suspect that's a topic with varied opinions.

This assumes you have a good reason for inserting from a cursor rather than as a direct insert into ... select from model.

Alex Poole
This job was stopped, then restarted with 125006 rows to process. I rewrote the process to change three things: 1) COMMIT every 100 INSERTs; 2) cursor close/open every 10,000 rows; 3) change the time on the transaction_dt column by 1 second for each 10,000-row cursor cycle. Process started at 14:49:46. The first 10,000 went fast, but then I noticed a slowdown, so I started monitoring. Every 10K rows ran slower than the prior 10K: 20K to 30K took 2.39, 60 to 70 took 7.54, and 110 to 120 took 12.52. Almost done. Opening a ticket with Oracle.
Kelley
If there's an impact on the index it may come from the skew of having so many values close together unbalancing the tree - not particularly because they're dates, though it'd be interesting if that's worse than for other datatypes for some reason. Does the cursor refer to the table you're inserting into? I assume you've checked the cursor performance on its own, without the inserts?
Alex Poole
The cursor originally made no reference to the table being INSERTed into, but now makes a select that looks there to avoid already-processed rows.
Kelley
SELECT * FROM ec_acq WHERE SUBSCRIBER_ID NOT IN ( SELECT SUBSCRIBER_ID FROM CONTACT_HISTORY WHERE TRANSACTION_DT BETWEEN TO_DATE('09/07/2010 14:17:06','MM/DD/YYYY HH24:MI:SS') AND SYSDATE AND CONTACT_CD = 'EN' AND DESCRIPTION = 'blah blah blah');
Kelley
The number of already-inserted rows will be increasing as you go along, so that NOT IN subquery gets more expensive on every pass.
JulesLt
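Purely as a sketch based on the query in the comments above, rewriting the NOT IN as a NOT EXISTS anti-join is one way to check whether that growing subquery is the culprit (behaviour differs from NOT IN if SUBSCRIBER_ID can be NULL in CONTACT_HISTORY):

SELECT a.*
  FROM ec_acq a
 WHERE NOT EXISTS (SELECT 1
                     FROM contact_history ch
                    WHERE ch.subscriber_id = a.subscriber_id
                      AND ch.transaction_dt BETWEEN TO_DATE('09/07/2010 14:17:06','MM/DD/YYYY HH24:MI:SS') AND SYSDATE
                      AND ch.contact_cd = 'EN'
                      AND ch.description = 'blah blah blah');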
A: 

Can you try bulk binding? I think it might help.

Pravin Satav
A: 

For operations like this you should look at Oracle's bulk operations, using FORALL and BULK COLLECT. They considerably reduce the number of context switches between the PL/SQL and SQL engines.

create or replace procedure fast_proc is
    type MyTable is table of source_table%ROWTYPE;
    l_table MyTable;
begin
    -- fetch the whole source table into the collection in one round trip
    select * bulk collect into l_table from source_table;

    -- bind the entire collection into a single bulk insert
    forall x in l_table.first .. l_table.last
        insert into dest_table values l_table(x);
end;
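
If the source is too large to hold comfortably in memory in one go, a LIMIT clause on the bulk fetch keeps the collection small; a rough sketch of that variation, again with placeholder table names:

declare
    cursor c_src is select * from source_table;
    type t_rows is table of source_table%ROWTYPE;
    l_rows t_rows;
begin
    open c_src;
    loop
        fetch c_src bulk collect into l_rows limit 1000;  -- chunk size is arbitrary
        exit when l_rows.count = 0;
        forall i in 1 .. l_rows.count
            insert into dest_table values l_rows(i);
        commit;  -- one commit per chunk rather than per row
    end loop;
    close c_src;
end;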
Visage
I want to thank everyone here for their answers. Especially since I have another effort, exactly the same, which has resulted in 2.7 Million rows to INSERT.
Kelley
It's amazing what will make you add to your knowledge base, or remind you of it. I will experiment with the comparison of an INSERT from a SELECT as opposed to using a CURSOR. I'll see if I can get around the indexing overhead of everything having the same timestamp (thanks for reminding me of exactly why I thought that was a problem... rebalancing index trees). Changing from 1 to 100 rows per COMMIT didn't much improve things. And I will read up on the 'BULK' facilities, on DBMS_PARALLEL_EXECUTE (cool!) and on the /*+ APPEND */ hint. I know about hints, I just forget they're there. Again, thanks.
Kelley
A: 

Agreed with the comment that what is killing your time is the 'slow by slow' processing. Copying 500,000 rows should be a matter of minutes.

The single INSERT ... SELECT FROM ... approach would be the best one, provided you have big enough rollback segments. The database may even automatically apply parallel techniques to a plain SQL statement that it will not apply to PL/SQL.

In addition you could look at using the /*+ APPEND */ hint - read up on it and see if it may apply to the situation with your target table.
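
As a sketch only (placeholder table and column names), a direct-path version of the set-based insert would look like:

INSERT /*+ APPEND */ INTO target_table (x, y, z)
SELECT x, y, z FROM source_table;
COMMIT;  -- direct-path loaded rows cannot be queried by the same session until the commit

Bear in mind that a direct-path insert takes an exclusive lock on the target table for the duration of the transaction, which may rule it out on a heavily used production table.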

To use all 10 cores you will need to either use plain parallel SQL, or run 10 copies of your PL/SQL block, splitting the source table across the 10 copies.

In Oracle 10 this is a manual task (roll your own parallelism) but Oracle 11.2 introduces DBMS_PARALLEL_EXECUTE.
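
A rough outline of the 11.2 approach (task name, chunk size and the statement are placeholders; the chunked SQL must use the :start_id and :end_id binds supplied by the package):

begin
    dbms_parallel_execute.create_task('copy_task');
    dbms_parallel_execute.create_chunks_by_rowid(
        task_name   => 'copy_task',
        table_owner => user,
        table_name  => 'SOURCE_TABLE',
        by_row      => true,
        chunk_size  => 10000);
    dbms_parallel_execute.run_task(
        task_name      => 'copy_task',
        sql_stmt       => 'insert into dest_table
                             select * from source_table
                              where rowid between :start_id and :end_id',
        language_flag  => dbms_sql.native,
        parallel_level => 10);
    dbms_parallel_execute.drop_task('copy_task');
end;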

Failing that, bulking up your fetch/insert using BULK COLLECT and a bulk INSERT would be the next best option - process in chunks of 1,000 or so rows (or larger). Again, take a look at whether DBMS_PARALLEL_EXECUTE may help you, or whether you could submit the job in chunks via DBMS_JOB.

(Caveat : I don't have access to anything later than Oracle 10)

JulesLt