views:

702

answers:

4

We have 2 tables called TableToUpdate and Dates. We need to update TableToUpdate's EndTimeKey column by looking up values from the Dates table. We run the SQL below to do this, but it takes too long to finish.

Table TableToUpdate has 6M records. Table Dates has 5000 records.

How can we optimize it?

Thanks for the replies!

update TableToUpdate set
EndTimeKey = DATE_NO
from Dates where EndTime = DATE
A: 

You can set some indexes on the relevant fields (in the relevant order), i.e. EndTimeKey and EndTime, but do not expect too much from this. The other thing you could check is whether you have other constraints you can use to limit the query results.

You could also create a view that returns, for every TableToUpdate.EndTimeKey, the correct DATE_NO.
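
Something along these lines, as a sketch (the view name is just an example); at the very least it lets you inspect the EndTime-to-DATE_NO mapping before touching the table:

-- Hypothetical view name; maps each row's EndTime to its DATE_NO
CREATE VIEW vEndTimeKeyLookup AS
SELECT t.EndTimeKey, t.EndTime, d.DATE_NO
FROM TableToUpdate t
JOIN Dates d ON t.EndTime = d.DATE;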

Maybe you could write a stored procedure, if your DBMS supports them - that could really speed up the update.

Gambrinus
A: 

I note a couple of things here. Is EndTimeKey really a key? If so it may have an index on it, and the slowness may come from updating that index while also doing the actual update of the data. The solution: drop the index, run the update, then re-apply the index.
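
Something like this, assuming SQL Server; the index name here is just a placeholder for whatever index actually exists on EndTimeKey:

-- Drop the index, run the set-based update, then rebuild the index
DROP INDEX IX_TableToUpdate_EndTimeKey ON TableToUpdate;

UPDATE TableToUpdate SET
EndTimeKey = DATE_NO
FROM Dates WHERE EndTime = DATE;

CREATE INDEX IX_TableToUpdate_EndTimeKey ON TableToUpdate (EndTimeKey);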

Another issue could be the transactional nature of SQL - as you do this update it will log every change so it can roll back in the event of a failure. This update looks to be pretty straightforward, so you could apply it in batches, e.g.:

update TableToUpdate set
EndTimeKey = DATE_NO
from Dates where EndTime = DATE
and TableToUpdateId between 1 and 100000

That will break your update into manageably sized chunks - at the very least you'll get an idea of how long each chunk will take.
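
To cover the whole table in those chunks, a simple loop along these lines can work; this is only a sketch and assumes TableToUpdate has a sequential integer column called TableToUpdateId, which the question doesn't show:

DECLARE @start INT, @batch INT, @maxId INT;
SET @start = 1;
SET @batch = 100000;
SELECT @maxId = MAX(TableToUpdateId) FROM TableToUpdate;

WHILE @start <= @maxId
BEGIN
    -- Same update as above, restricted to one range of ids at a time
    UPDATE TableToUpdate SET
    EndTimeKey = DATE_NO
    FROM Dates WHERE EndTime = DATE
    AND TableToUpdateId BETWEEN @start AND @start + @batch - 1;

    SET @start = @start + @batch;
END;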

Another option is putting an index on the EndTime column - without one the update is potentially having to do a full table scan.
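
For example (the index names are just placeholders); indexing DATE on the Dates table may help the lookup side as well:

CREATE INDEX IX_TableToUpdate_EndTime ON TableToUpdate (EndTime);
CREATE INDEX IX_Dates_Date ON Dates ([DATE]);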

The real answer though is to look at the query plan being generated. As you can see there are many reasons why a query may run slowly - these are just some quick ones to check.
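
Assuming SQL Server (the UPDATE ... FROM syntax suggests it), you can capture the plan without actually running the update, either with the "Display Estimated Execution Plan" button in Management Studio or with something like:

SET SHOWPLAN_TEXT ON;
GO
-- The statement is compiled and its plan returned, but not executed
UPDATE TableToUpdate SET
EndTimeKey = DATE_NO
FROM Dates WHERE EndTime = DATE;
GO
SET SHOWPLAN_TEXT OFF;
GO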

MrTelly
+2  A: 

You are updating potentially 6 million records, so this is not going to be terribly speedy in any event. However, look at your execution plan and see if it is using indexes.

Also run this in batches; that is generally faster when updating large numbers of records. Do the update during off hours when there is little load on the database, as this will reduce potential locking issues. Make sure your datatypes are the same between the two tables so you aren't having to do any implicit conversions.
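
A quick way to compare the datatypes (assuming SQL Server; the column names are taken from the query in the question):

SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE (TABLE_NAME = 'TableToUpdate' AND COLUMN_NAME IN ('EndTime', 'EndTimeKey'))
   OR (TABLE_NAME = 'Dates' AND COLUMN_NAME IN ('DATE', 'DATE_NO'));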

Look at the table you are updating: are there any triggers on it? Depending on how a trigger is written, this could seriously slow down an update of that many records (especially if someone who wasn't too bright decided to put a cursor or a loop in the trigger instead of writing set-based code).
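
On SQL Server you can check for triggers with a quick look at the system catalog, e.g.:

SELECT name, is_disabled
FROM sys.triggers
WHERE parent_id = OBJECT_ID('TableToUpdate');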

Also, here is something I would add (I also changed the query to explicitly show the join):

update t
set t.EndTimeKey = d.DATE_NO
from TableToUpdate t
join Dates d on t.EndTime = d.DATE
where t.EndTimeKey <> d.DATE_NO

No point in updating records that already match.

HLGEM
A: 

With this volume of data you might be best off creating a SELECT query which produces a resultset, complete with updated values, exactly as you would like to see the new table. Next, SELECT these rows into a new table (perhaps 'NewTableToUpdate'), either by creating the table and using INSERT INTO, or by adding an INTO clause to your SELECT to create the new table.

Next, use sp_rename to rename 'TableToUpdate' to 'OLDTableToUpdate' and 'NewTableToUpdate' to 'TableToUpdate', and then create the indexes as you had them on the original table.

In my experience I've found this to be the quickest means of achieving big changes like this. HTH.

Extra thought... if you have a clustered index on your table then add an ORDER BY to your SELECT statement to ensure the rows are inserted into your new table in the same sequence as the clustered index. That'll speed up the index creation significantly.
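
A sketch of the whole approach, assuming SQL Server; the clustered index / id column (called TableToUpdateId here) is an assumption, and only the columns mentioned in the question are shown - in practice you'd include every column of TableToUpdate:

-- Build the new table with corrected EndTimeKey values, ordered to match
-- the assumed clustered index; rows with no match keep their old value
SELECT t.TableToUpdateId,
       t.EndTime,
       COALESCE(d.DATE_NO, t.EndTimeKey) AS EndTimeKey
INTO NewTableToUpdate
FROM TableToUpdate t
LEFT JOIN Dates d ON t.EndTime = d.DATE
ORDER BY t.TableToUpdateId;

-- Swap the tables, then recreate the original indexes on TableToUpdate
EXEC sp_rename 'TableToUpdate', 'OLDTableToUpdate';
EXEC sp_rename 'NewTableToUpdate', 'TableToUpdate';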

Elliveny