I have a pretty large table (20+ million rows), and I need to update about 5% of it, or roughly 1 million rows.
Unfortunately, I am updating the (int) column that is being used as the clustered index.
My question is: What is the fastest way to update these rows?
I have tried updating the rows directly:
update t1
set t1.groupId = t2.groupId
from
table t1
join newtable t2 on t1.email = t2.email
but this takes WAY too long (I stopped it after 3 hours).
I assume that this is because the entire row (which has 2 datetimes, 2 varchars, and 2 ints) is being moved around for each update.
What if I dropped the clustered index first, then did the updates, then recreated the clustered index? Would that be faster?
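To be concrete, this is roughly the sequence I have in mind. It is only a sketch: dbo.t1 and dbo.newtable stand in for my real tables, the index name CIX_t1_groupId is made up, and I'm assuming the clustered index is a plain index rather than one backing a primary key constraint (in that case it would have to be dropped with ALTER TABLE ... DROP CONSTRAINT instead).

```sql
-- Hypothetical names throughout: dbo.t1, dbo.newtable, CIX_t1_groupId.

-- 1. Drop the clustered index so the table becomes a heap.
DROP INDEX CIX_t1_groupId ON dbo.t1;

-- 2. Run the same update as before, now against the heap.
UPDATE t1
SET t1.groupId = t2.groupId
FROM dbo.t1 AS t1
JOIN dbo.newtable AS t2 ON t1.email = t2.email;

-- 3. Recreate the clustered index on the updated column.
CREATE CLUSTERED INDEX CIX_t1_groupId ON dbo.t1 (groupId);
```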
Note: I have a nonclustered index on email, in case anyone thinks it's the select part of the query that is slow. It's not.