views:

185

answers:

2

I asked a question yesterday which got answers but didnt answer the main point. I wanted to reduce amount of time it took to do a MINUS operation.

Now, I'm thinking about doing MINUS operation in blocks of 5000, appending each iterations results to the cursor and finally returning the cursor. I have following:

V_CNT           NUMBER :=0;
V_INTERVAL      NUMBER := 5000;

begin
  select count(1) into v_cnt from TABLE_1
while (v_cnt > 0)
loop
open cv_1 for
    SELECT  A.HEAD,A.EFFECTIVE_DATE,
    FROM   TABLE_1 A
    WHERE  A.TYPE_OF_ACTION='6' AND A.EFFECTIVE_DATE >= ADD_MONTHS(SYSDATE,-15)  
    AND A.ROWNUM <= V_INTERVAL
    MINUS
    SELECT  B.head,B.EFFECTIVE_DATE,
    FROM  TABLE_2 B
    AND B.ROWNUM <= V_INTERVAL

V_CNT := V_CNT - V_INTERVAL;
END LOOP; 
end;

However, as you see...in each iteration the cursor is overwritten. How can I change the code so that in each iteration it appends to cv_1 cursor rather than overwriting?

A: 

That's not how cursors work, you would have to store the values in some sort of collection.

Your current query gets you 5000 random rows from Table_1 and removes rows that also exist in 5000 random rows selected from Table_2.


Have you tried doing it without the MINUS?

As I understand the query, it should produce the same as this one:

Select  a.head, a.effective_date,
From table_1 a
Left Join table_2 b On (b.head = a.head And b.effective_date = a.effective_date )
Where a.type_of_action='6' And a.effective_date >= ADD_MONTHS(SYSDATE,-15)  
  And b.head Is Null;

Having a compound index on TABLE_1 (type_of_action, head, effective_date) and on TABLE_2 (head, effective_date) should help you with performance.

Peter Lang
hmmm 5000 are random but eventually my aim is to cover the whole table right? which it does? the left join you provided does not return any results whereas the minus returns 60538 results..
Omnipresent
from your exp...does the rownum approach sound ok or since 5000 rows are selected randomly..it wont be consistent as w/out rownum? I can look into putting values in collection and lastly inserting them into cursor
Omnipresent
I guess my post should answer, why you are getting different results in both approaches.
The Machine
but without using rownum also I get 60538 records
Omnipresent
I don't know why my query does not return any row. It should return all rows of `Table_1` with `type_of_action='6' And a.effective_date >= ADD_MONTHS(SYSDATE,-15)` which do not exist in `Table_2`. What do you get when you remove the two lines refering to `b` (the join and the second condition)? The `ROWNUM` approach won't work that way. You could only select all rows of `Table_1` into a collection, and then remove records. Performance would suffer though. Do you have the indexes I suggested?
Peter Lang
You accepted my answer, so I guess your problem is solved :) Would you mind to share how you do it now?
Peter Lang
Well, the batch way is certainly a wrong way, see my example in question's comment. I guess he's validating your query's result against awrong resultset :(.
Michal Pravda
+1  A: 

You haven't stated the requirement clearly. So , i am assuming , you want to do a MINUS on two tables, A and B. i.e you want to find tuples in A that are not in B.

Assuming this , the logic that you have written is not completely correct, as you are doing a MINUS on corresponding (5000-length) batches of A and B.

Eg: Your logic will return a tuple in the 4000th row in table A, that is present in say the 6000th row of table B.

I suggest you use left-outer join to accomplish your need. (Same as Peter Lang's post). That should suffice for your performance requirements too, I think.

The Machine