ansaurus

Question

SQL Server stops processing for 20 seconds

Answer 1

A:

Are you using fulltext search?

I'm thinking that there may be some index rebuilding going on every now and then.

Perhaps try automating full rebuilds of indexes, or changing to non clustered indexes?

rizzle 2009-12-15 22:00:17

Thanks Rizzle. However, I'm not using full text search.

Mike Brad 2009-12-15 22:22:00

Answer 2

A:

I would add a few more counters to your perfmon, such as reads and writes per second. From there you can see whether it's an I/O issue. Also check out this MSDN entry on SQL performance. It gave me some good ideas on things to check.

Tim Meers 2009-12-15 22:00:25

I guess I am in bad shape. % Disk Avg 633 (can't explain that). Avg disk sec/read .042, avg disk sec/write .052, disk reads/sec 2.041, disk writes/sec 71. It is a parity RAID, but I think those numbers are out of the ballpark. Would you agree?

Mike Brad 2009-12-15 22:20:56

Well, without knowing your RAID level and number of disks it's hard to say if disk I/O is the issue. I have a RAID 5 array with 4 disks, so I'd use this to calculate the IOPS: (reads + (4 * writes)) / number of disks = total IO/s. Under my typical load, with the numbers punched in, it looks like this: (724.364 + (4 * 5.707)) / 4 = 186.798. I have a lot more reads than writes, whereas you seem to have a lot of writes, but nothing too terrible. Like Chris said, it might be an issue with the array. I'd check that before spending any time on the code.
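The formula in the comment above can be sketched in a few lines of Python. This is only an illustration: the function name raid_iops_per_disk is made up here, the write penalty of 4 is the RAID 5 factor used in the comment, and the disk counts tried for Mike's counters are hypothetical because the thread never states how many disks his array has.

```python
def raid_iops_per_disk(reads_per_sec, writes_per_sec, disk_count, write_penalty=4):
    """Estimate the I/O load per disk: (reads + penalty * writes) / disks.

    write_penalty=4 is the RAID 5 factor from the comment above
    (each logical write costs roughly four physical I/Os).
    """
    if disk_count <= 0:
        raise ValueError("disk_count must be a positive number")
    return (reads_per_sec + write_penalty * writes_per_sec) / disk_count


# Tim's typical load on a 4-disk RAID 5 array, as quoted in the comment.
print(round(raid_iops_per_disk(724.364, 5.707, 4), 3))  # 186.798

# Mike's counters (2.041 reads/sec, 71 writes/sec). The disk counts below
# are hypothetical; his actual array size is not given in the thread.
for disks in (3, 4, 6):
    print(disks, round(raid_iops_per_disk(2.041, 71, disks), 1))
```

Comparing the per-disk figure with what a single drive in the array can sustain gives a rough idea of whether the disks are saturated.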

Tim Meers 2009-12-16 13:49:22

But then again, I normally look at the hardware first since I'm better at the server side of things than coding.

Tim Meers 2009-12-16 13:55:21

Answer 3

+2 A:

Have you checked the drive for errors? It sounds like maybe there is something going on. If it's a RAID array, check the health of the array.

Chris Lively 2009-12-15 22:05:18

Will do (I will put the ISM on it). Thanks.

Mike Brad 2009-12-15 22:37:22

Answer 4

A:

What are the wait_type, wait_resource and wait_time in sys.dm_exec_requests for the long-running requests (sample periodically)? Do these requests spawn sub-tasks (sys.dm_os_tasks)? What are those tasks doing?

Remus Rusanu 2009-12-15 22:05:28

Generally, for processes that don't look like system processes, waittype is null and waittime is 0. During one of the incidents I queried dm_exec_requests and did see one transaction with OLEDB (waittime 15) and one with waittype WRITELOG (waittime 0). I will have to research what this means. Not sure what to look for in dm_os_tasks.

Mike Brad 2009-12-15 22:36:45

WRITELOG means the request has committed a transaction and is waiting for the log to be hardened (written to disk). OLEDB is a distributed query wait. In sys.dm_os_tasks you should look at task_state. PENDING would indicate a scheduler bottleneck (all workers are occupied).

Remus Rusanu 2009-12-15 22:59:47

Mike Brad 2009-12-16 15:12:42

Answer 5

A:

Have you checked your memory consumption? Windows Server 2003 R2 sometimes basically restarts all memory allocations under intense load. When this happens, SQL Server is forced down to a minimal amount of memory (4MB or so) and then slowly reallocates memory until it returns to relatively normal levels. We've seen this happen when very large files are copied across our SAN. I've heard this can also be triggered by a transaction log backup process if the transaction logs are very large and the server is under extremely heavy usage.

Registered User 2009-12-15 22:26:36

Looking at task manager (not sure that is the best way) I see the Sqlservr.exe process reporting about 2,544,000 mem usage. It fluctuates a little but never drops considerably (even through an incident).

Mike Brad 2009-12-15 22:41:14

Answer 6

A:

Mike, See my blog post Unexplained SQL Server Timeouts and Intermittent Blocking. Especially if your stored proc has a "SELECT INTO" or deletes from a temp table.

Jim

JBrooks 2009-12-15 22:39:30

As a rule we use table variables (not temp tables) which are defined before any data is inserted into them. I will sift through the whole process and check again to be sure.

Mike Brad 2009-12-15 22:44:51

Answer 7

A:

It's not slow code because the delay doesn't increase the CPU time. It sounds like the server is making a blocking call that is not succeeding, and then it eventually times out. You've ruled out deadlocks. If it was a hard drive problem, you'd expect to see something in the event log.

Try installing a network sniffer such as Wireshark to see if there is anything interesting happening at the time the pause begins.

jdigital 2009-12-15 23:35:36

Answer 8

A:

One option: statistics update. If you're writing often enough, you may hit the recompute threshold.

Look at this article "Index Statistics on MSDN" and the option "AUTO_UPDATE_STATISTICS_ASYNC"

Although every 90 seconds is a bit much...
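To put a rough number on the recompute threshold, here is a small Python sketch. It assumes the rule commonly documented for SQL Server versions of that era (statistics on a table become stale after 500 modifications, plus 20% of the row count once the table holds more than 500 rows); the answer itself does not state the rule, and the function names and the 10,000-row example are hypothetical.

```python
def auto_update_threshold(row_count):
    """Modifications needed before statistics are considered stale.

    Assumed rule (not stated in the thread): 500 changes for tables of
    up to 500 rows, otherwise 500 + 20% of the row count.
    """
    if row_count <= 500:
        return 500
    return 500 + row_count // 5


def modifications_per_second_needed(row_count, interval_seconds):
    """Write rate required to hit the threshold once per interval."""
    return auto_update_threshold(row_count) / interval_seconds


# Hypothetical 10,000-row table hitting the threshold every 90 seconds.
print(auto_update_threshold(10_000))
print(round(modifications_per_second_needed(10_000, 90), 1))
```

Under that assumption, the larger the table, the higher the sustained write rate needed to trigger an update every 90 seconds, which is consistent with the doubt expressed above.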

gbn 2009-12-16 05:54:34

Answer 9

+1 A:

The issue is the automatic checkpoint. When SQL Server runs the automatic checkpoint, other transactions are delayed; this is probably related to the disk I/O involved in the checkpoint.

dm_exec_requests showing a waittype WRITELOG (waittime 0) means the request has committed a transaction and is waiting for the log to be hardened (written to disk) --Remus Rusanu

To verify this, I turned on checkpoint logging, and recorded a perfmon session during several of the incidents. I then compared the log to the perfmon to see that the incidents were always related to checkpoint in one of my databases.

DBCC TRACEON(3502, -1) --turn on checkpoint logging

DBCC TRACEOFF(3502, -1) --turn off checkpoint logging

EXEC xp_readerrorlog --read the log

SELECT DB_NAME([dbid]) AS [Database Name] --verify the database id mentioned in the log (substitute that id for [dbid])

That particular database has one process that produces a lot of inserts and deletes. The solution is to re-write that process to reduce the amount of data being recorded. Another option would be to add hardware.

Thanks to all who contributed.

Mike Brad 2009-12-18 19:21:18
