We have been using a temporary table to store intermediate results in a PL/SQL stored procedure. Could anyone tell me whether there is a performance difference between doing a bulk collect insert through PL/SQL and a plain SQL insert?

Insert into [Table name] [Select query Returning huge amount of data]

or

Cursor for [Select query returning huge amount of data]

open cursor

fetch cursor bulk collect into collection

Use FORALL to perform insert

Which of the above two options is better for inserting a huge amount of temporary data?

+2  A: 

INSERT INTO ... SELECT will almost certainly be faster, because it skips the overhead of storing the data in a collection first.

Rene
+3  A: 

It depends on the nature of the work you're doing to populate the intermediate results. If the work can be done relatively simply in the SELECT statement for the INSERT, that will generally perform better.

However, if you have some complex intermediate logic, it may be easier (from a code maintenance point of view) to fetch and insert the data in batches using bulk collects/binds. In some cases it might even be faster.

One thing to note very carefully: the query plan used by the INSERT INTO x SELECT ... will sometimes be quite different to that used when the query is run by itself (e.g. in a PL/SQL explicit cursor). When comparing performance, you need to take this into account.
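
One way to check this is to ask Oracle for both plans and compare them. The following is a minimal sketch, not a definitive recipe: `src_rows` and `tmp_results` are hypothetical table names used only for illustration, and it assumes a `PLAN_TABLE` already exists in the schema.

```sql
-- Hypothetical names: src_rows (source table) and tmp_results (temporary table).

-- Plan for the query run by itself, as a PL/SQL explicit cursor would run it.
EXPLAIN PLAN SET STATEMENT_ID = 'sel_only' FOR
  SELECT id, amount
  FROM src_rows
  WHERE amount > 0;

-- Plan for the same query embedded in the INSERT.
EXPLAIN PLAN SET STATEMENT_ID = 'ins_sel' FOR
  INSERT INTO tmp_results (id, amount)
  SELECT id, amount
  FROM src_rows
  WHERE amount > 0;

-- Display each plan so the two can be compared side by side.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'sel_only'));
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY('PLAN_TABLE', 'ins_sel'));
```

If the two plans differ, a timing difference between the approaches may come from the plan rather than from the insert mechanism.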

Jeffrey Kemp
I have updated my question with a couple of comments. Thanks, by the way, for your response.
Prakash
A: 

I suggest using a PL/SQL explicit cursor. You are just going to perform any DML operation in the private workspace allotted for the cursor, so this will not hurt database server performance during peak hours.

uma
-1 you'll need to explain this a bit more clearly - what do you mean by "private workspace alloted for the cursor" and how do "peak hours" enter into this?
Jeffrey Kemp
Using a bulk collect for a large data volume will certainly use PGA memory, which is ultimately memory that won't be available for other processes on the server.
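
If the bulk collect route is chosen anyway, the PGA footprint can be bounded by fetching in batches with a `LIMIT` clause. This is a sketch under assumptions: `src_rows` and `tmp_results` are hypothetical tables with a single numeric `id` column, and the batch size of 1000 is an arbitrary starting point rather than a tuned value.

```sql
DECLARE
  -- Hypothetical source query; replace with the real SELECT.
  CURSOR c_src IS
    SELECT id FROM src_rows;

  TYPE t_number_table IS TABLE OF NUMBER;
  v_tab t_number_table;
BEGIN
  OPEN c_src;
  LOOP
    -- Hold at most 1000 rows in the collection at any one time.
    FETCH c_src BULK COLLECT INTO v_tab LIMIT 1000;
    -- Test COUNT rather than %NOTFOUND so a final partial batch is not skipped.
    EXIT WHEN v_tab.COUNT = 0;

    FORALL i IN 1..v_tab.COUNT
      INSERT INTO tmp_results (id) VALUES (v_tab(i));
  END LOOP;
  CLOSE c_src;
END;
/
```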
Gary
A: 

When we declare a cursor explicitly, Oracle allocates a private SQL work area in memory. When a SELECT statement returns multiple rows, they are copied from the table or view into the private SQL work area as the active set, whose size is the number of rows that meet your search criteria. Once the cursor is opened, the pointer is placed on the first row of the active set, and here you can perform DML. For example, if you perform an update operation, it updates the rows in the work area and not in the table directly, so the table is not used every time we need to update. The rows are fetched once into the work area and then, after the operations have been performed, the update is done once for all operations. This reduces input/output data transfer between the database and the user.

uma
+4  A: 

Some experimental data for your problem (Oracle 9.2)

bulk collect

DECLARE 
  TYPE t_number_table IS TABLE OF NUMBER;
  v_tab t_number_table;
BEGIN
  SELECT ROWNUM
  BULK COLLECT INTO v_tab
  FROM dual
  CONNECT BY LEVEL < 100000;

  FORALL i IN 1..v_tab.COUNT
    INSERT INTO test VALUES (v_tab(i));
END;
/
-- 2.6 sec

insert

-- test table
CREATE GLOBAL TEMPORARY TABLE test (id NUMBER)
ON COMMIT PRESERVE ROWS;

BEGIN
  INSERT INTO test
  SELECT ROWNUM FROM dual
  CONNECT BY LEVEL < 100000;
END;
/
-- 1.4 sec

direct path insert http://download.oracle.com/docs/cd/B10500_01/server.920/a96524/c21dlins.htm

BEGIN
  INSERT /*+ append */ INTO test
  SELECT ROWNUM FROM dual
  CONNECT BY LEVEL < 100000;
END;
/
-- 1.2 sec
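
For anyone who wants to reproduce figures like these, one possible way to time a block is `DBMS_UTILITY.GET_TIME`, which returns elapsed time in hundredths of a second. This is only a sketch and not necessarily how the timings above were taken. It assumes the `test` table defined above and a SQL*Plus session for `SET SERVEROUTPUT ON`.

```sql
SET SERVEROUTPUT ON

DECLARE
  v_start NUMBER;
BEGIN
  v_start := DBMS_UTILITY.GET_TIME;

  INSERT INTO test
  SELECT ROWNUM FROM dual
  CONNECT BY LEVEL < 100000;

  -- GET_TIME counts hundredths of a second, so divide by 100 for seconds.
  DBMS_OUTPUT.PUT_LINE('elapsed: ' ||
    (DBMS_UTILITY.GET_TIME - v_start) / 100 || ' sec');

  -- Undo the insert so that repeated runs start from the same state.
  ROLLBACK;
END;
/
```

Running each variant several times and discarding the first run helps reduce the effect of caching on the comparison.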
jva
+1. Thanks for backing up my suspicions with some proof.
Rene
+1 for clear and tested data (and for using dual :) )
Unreason
A: 

Tom Kyte of Ask Tom fame has answered this question more firmly. If you are willing to do some searching, you can find the question and his response, which contains detailed testing results and explanations. He compares a PL/SQL cursor, a PL/SQL bulk collect (including the effect of periodic commits), and a SQL insert as select.

Insert as select wins hands down all the time, and the difference on even modest datasets is dramatic.

That said, a comment was made earlier about the complexity of intermediate computations. I can think of three situations where this would be relevant.

1) If computations require going outside of the Oracle database, then clearly a simple insert as select does not do the trick.

2) If the solution requires the use of PL/SQL function calls, then context switching can potentially kill your query, and you may get better results with PL/SQL calling PL/SQL functions. PL/SQL was made to call SQL, but not the other way around, so calling PL/SQL from SQL is expensive.

3) If the computations make the SQL code very difficult to read, then even though it may be slower, a PL/SQL bulk collect solution may be better for these other reasons.
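
Situation 2 can be sketched as follows. Everything here is hypothetical and for illustration only: the tables `src_rows(id, amount)` and `tmp_results(id, fee)`, the function `calc_fee`, and its fee rule. Which variant is faster depends on the data and the function, so both would need to be measured.

```sql
-- Hypothetical PL/SQL function standing in for logic that is awkward in SQL.
CREATE OR REPLACE FUNCTION calc_fee (p_amount IN NUMBER) RETURN NUMBER IS
BEGIN
  IF p_amount > 1000 THEN
    RETURN p_amount * 0.01;
  END IF;
  RETURN 10;
END calc_fee;
/

-- Variant A: SQL calls the PL/SQL function, one context switch per row.
INSERT INTO tmp_results (id, fee)
SELECT id, calc_fee(amount)
FROM src_rows;

-- Variant B: PL/SQL calls the function itself; SQL is used only for the
-- bulk fetch and the bulk insert.
DECLARE
  TYPE t_num_tab IS TABLE OF NUMBER;
  v_ids  t_num_tab;
  v_amts t_num_tab;
  v_fees t_num_tab := t_num_tab();
BEGIN
  -- Unbounded fetch for brevity; a LIMIT loop would cap memory use.
  SELECT id, amount BULK COLLECT INTO v_ids, v_amts
  FROM src_rows;

  v_fees.EXTEND(v_ids.COUNT);
  FOR i IN 1..v_ids.COUNT LOOP
    v_fees(i) := calc_fee(v_amts(i));
  END LOOP;

  FORALL i IN 1..v_ids.COUNT
    INSERT INTO tmp_results (id, fee) VALUES (v_ids(i), v_fees(i));
END;
/
```

When the rule is simple enough to write as a SQL expression, putting it directly in the SELECT avoids the function call altogether.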

Good luck.

Kevin Meade