views: 153, answers: 3
I'm new to working with this much data (20 million rows) and I don't know what I should expect as far as query duration:

update table set field = '1234'  

No index on field. This statement took 25 minutes. The database is set to the simple recovery model. Does 25 minutes seem too long? The table has 9 columns with small data types, none wider than varchar(50).

A: 
  • Are there any indexed views referencing that field?
  • Is that field published in a (transactional) replication scheme?
  • Are other sessions accessing that table at the same time the update is running?
  • Are the log and data files stored on separate disks (physical, not two different partitions of the same hardware)?
  • Are there any check constraints referencing that field?
  • Are there any triggers on that table?

All of these, and likely many other factors, will affect data modification performance.

Otherwise, try batching the update using TOP and a WHERE clause that finds only unmodified rows, as in the sketch below.
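For illustration, here is a minimal sketch of that batching approach. The table name, column name, and batch size are placeholders based on the question, not tested values:

    DECLARE @rows int;
    SET @rows = 1;
    WHILE @rows > 0
    BEGIN
        -- Update a bounded slice per transaction so each commit,
        -- and therefore each log flush, stays small.
        UPDATE TOP (50000) dbo.MyTable
        SET field = '1234'
        WHERE field IS NULL OR field <> '1234';  -- only rows not yet updated
        SET @rows = @@ROWCOUNT;  -- reaches zero once every row is done
    END;

Each batch commits on its own, so under the simple recovery model the log space can be reused as checkpoints occur, instead of the log growing to hold the whole 20M-row transaction.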

Daniel Renshaw
No, no, no, no, no, no. :)
subt13
I tried batching, but wasn't sure about the batch size. I tried 10000 and it was taking even longer.
subt13
+11  A: 

If you updated 20M rows in one single transaction, then your time was entirely driven by your IO subsystem: what kind of drives you have, how the disk files are laid out, etc. If you have 40 spindles in RAID 10 with 4 balanced data files and a separate, similar array for the log, then the result is worryingly slow. If you tested this with a single MDF that shares the spindle with the LDF on a single consumer-quality 5000rpm HDD, then your time is amazingly fast.

Remus Rusanu
+1: Is IO always the issue for cases like these, or is there a rough tipping point?
OMG Ponies
For 20M rows in a single update statement, the time is all going to be driven by how fast the log can flush, with some spikes when database checkpoints kick in due to buffer pool pressure for free pages.
Remus Rusanu
The tell-tale sign is going to be the `Avg. Disk Queue Length` (>2 per spindle) and the `Avg. sec/transfer` (>10ms) performance counters. http://msdn.microsoft.com/en-us/library/ms175903.aspx
Remus Rusanu
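Alongside those counters, one way to check from inside SQL Server (a sketch; it assumes SQL Server 2005 or later, where this DMV exists) is the cumulative wait statistics. WRITELOG dominating would confirm a log-flush-bound update:

    SELECT wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_type IN ('WRITELOG', 'PAGEIOLATCH_EX', 'PAGEIOLATCH_SH')
    ORDER BY wait_time_ms DESC;  -- high WRITELOG => log flush is the bottleneck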
Good points. This is a server OS running a server application on desktop hardware. No RAID, a single 250 GB 7200 rpm drive. That's what I needed to know.
subt13
@Remus: regardless of the recovery model setting (simple vs. full/bulk-logged)?
subt13
@subt13: Even in simple mode, the transaction has to log every row change so that it can do a rollback. 20M rows -> 20M change records in the log, plus space reserved for 20M undo operations; it needs log space, and every bit has to be written to disk. The recovery model would play a role only if you had multiple transactions, but it would not make things faster no matter which model you pick. Only bulk-logged *may* be faster for certain *insert* operations, never for updates.
Remus Rusanu
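To see that logging cost directly, one option (a standard SQL Server command, offered here as a sketch) is to compare log usage before and after the update:

    -- Reports log file size and percentage in use for every database;
    -- run it before and after the update to see how much log was generated.
    DBCC SQLPERF(LOGSPACE);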
Based on this feedback, it seems like the 25 minutes is acceptable, and if I want to speed it up I'm going to have to upgrade the hardware.
subt13
You may give disk partition alignment a shot: http://msdn.microsoft.com/en-us/library/dd758814%28SQL.100%29.aspx
Remus Rusanu
+2  A: 

You are updating 20 million records in about 1,500 s, which works out to roughly 13,000 updates per second (20,000,000 / 1,500 ≈ 13,300). Sounds about right.

Peter Tillemans
Just curious as to why that sounds about right?
subt13
Because the order of magnitude is in the range I would expect: less than 1,000 per second would be strange, and so would more than 40,000.
Peter Tillemans