I have the following UPDATE scenario:

UPDATE destTable d
SET d.test_count = ( SELECT COUNT( employee_id )
                     FROM sourceTable s
                     WHERE d.matchCode1 = s.matchCode1 AND
                           d.matchCode2 = s.matchCode2 AND
                           d.matchCode3 = s.matchCode3 
                     GROUP BY matchCode1, matchCode2, matchCode3, employee_id )

I have to execute this in a loop changing out the match codes for each iteration.

Between two large tables (~500k records each), this query takes an unacceptably long time to execute. If I just had to execute it once, I wouldn't care too much. Given it is being executed about 20 times, it takes way too long for my needs.

It requires two full table scans (one for the destTable and another for the subquery).

Questions:

  1. What techniques do you recommend to speed this up?

  2. Does the SQL-optimizer run the subquery for each row I'm updating in the destTable to satisfy the where-clause of the subquery or does it have some super intelligence to do this all at once?

A: 

Have you considered an UPDATE FROM query?

JayTee
How is that different (excuse my ignorance)? Does it work in Oracle?
j0rd4n
Oracle does not support UPDATE FROM. You can update a subquery, but this specific one is not updatable (it's not key-preserved).
Quassnoi
+3  A: 

In Oracle 9i and higher:

MERGE   
INTO    destTable d
USING   (
        SELECT  matchCode1, matchCode2, matchCode3, COUNT(employee_id) AS cnt
        FROM    sourceTable s
        GROUP BY
                matchCode1, matchCode2, matchCode3, employee_id
        ) so
ON      d.matchCode1 = so.matchCode1 AND
        d.matchCode2 = so.matchCode2 AND
        d.matchCode3 = so.matchCode3 
WHEN MATCHED THEN
UPDATE
SET     d.test_count = cnt

To speed up your query, make sure you have a composite index on (matchCode1, matchCode2, matchCode3) in destTable and a composite index on (matchCode1, matchCode2, matchCode3, employee_id) in sourceTable.
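
As a rough sketch, the two indexes could be created like this. The index names are made up for illustration; the table and column names are the ones from the question.

```sql
-- Index names (ix_dest_match, ix_source_match_emp) are hypothetical.
-- Composite index to support matching rows in destTable on the three codes.
CREATE INDEX ix_dest_match
    ON destTable (matchCode1, matchCode2, matchCode3);

-- Composite index containing every sourceTable column the aggregate reads.
CREATE INDEX ix_source_match_emp
    ON sourceTable (matchCode1, matchCode2, matchCode3, employee_id);
```

Because the second index contains every column the aggregate touches, the optimizer may be able to read the index instead of the table, though whether it does depends on your data and statistics.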

Quassnoi
Unfortunately, the 'Explain Plan' analyzer says the merge will be more costly compared to the simple update statement.
j0rd4n
Just because the plan's predicted cost comes out higher doesn't mean it will actually take longer. Try actually executing it.
Dave Costa
Your initial query actually forces an execution plan similar to a nested loop. In my query, the optimizer has a whole range of algorithms to select from.
Quassnoi
WOW! This did the trick! One iteration of my original update took way over a minute. Your merge query brought the time down to sub-second! Very cool. Thanks for the help!
j0rd4n
...and that was without an index.
j0rd4n
Imagine what an index will do :)
Quassnoi
+3  A: 

I have to execute this in a loop

The first thing to do is build the loop into your subquery or WHERE clause. You're updating data, and then immediately replacing some of the data you just updated. You should be able to either filter your update so it only changes the records appropriate to the current iteration, or make your query complex enough to update everything in one statement. Probably both.
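
A minimal sketch of the filtering idea, assuming destTable has some column that identifies the section handled by one iteration. The column role_id and the bind variable :current_role are hypothetical; substitute whatever your loop actually varies.

```sql
-- Sketch only: role_id and :current_role are hypothetical stand-ins for
-- whatever identifies the rows belonging to the current loop iteration.
-- The GROUP BY from the question is left out here, so the scalar subquery
-- always returns exactly one row (0 when nothing matches).
UPDATE destTable d
SET    d.test_count = ( SELECT COUNT(s.employee_id)
                        FROM   sourceTable s
                        WHERE  s.matchCode1 = d.matchCode1
                        AND    s.matchCode2 = d.matchCode2
                        AND    s.matchCode3 = d.matchCode3 )
WHERE  d.role_id = :current_role;
```

With the outer WHERE in place, each pass only touches its own section instead of rewriting every row in destTable.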

Joel Coehoorn
Well, I'm updating sections of a large summary table. Each iteration of the loop updates a section of the larger summary table.
j0rd4n
Your sample shows no WHERE clause, though. Therefore, as far as we can see, each update will hit every record in the table, even if it ends up with the same value.
Joel Coehoorn
I'm summarizing role assignments for employees. Each role needs to report the number of assignments given the 3-4 value match specified in the question.
j0rd4n
Oh, I see what you are saying. I left off the WHERE but now that I'm thinking about it, it isn't as robust as it probably could be. I'll take a second look at that.
j0rd4n