views:

678

answers:

4

We have a 16 processor SQL Server 2005 cluster. When looking at CPU usage data we see that most of the time only 4 of the 16 processors are ever utilized. However, in periods of high load, occasionally a 5th and 6th processor will be used, although never anywhere near the utilization of the other 4. I'm concerned that in periods of tremendously high load that not all of the other processors will be utilized and we will have performance degradation.

Is what we're seeing standard SQL Server 2005 cluster behavior? I assumed that all 16 processors would be utilized at all times, though this does not appear to be the case. Is this something we can tune? Or is this expected behavior? Will SQL server be able to utilize all 16 processors if it comes to that?

+1  A: 

Are you sure that you're not bottlenecking elsewhere? On IO perhaps?

spender
I should clarify - we don't have a bottleneck at this point. I just want to make sure that when we do see high load (as there are some new database intensive features being added to the application) that all of the processors will be used.
Eric
+1  A: 

Hard to be sure without hard data, but I suspect the problem is that you're more IO-bound or memory-bound than CPU-bound right now, and 4 processors is enough to keep up with your real bottleneck.

My reasoning is that if there were some configuration problem that was keeping you limited to 4 cpus, you wouldn't see it spill over to the 5th and 6th processors at all.

Joel Coehoorn
I would agree, I really just want to know if this is standard operating procedure for SQL Server. That is, does it slowly bring more processors in to the fray as usage increases? I thought the load would be evenly shared, so I just wanted to confirm that the behavior we're seeing is by design.
Eric
Some people license SQL Server per cpu, so this behavior is desirable to save them money.
Joel Coehoorn
Yeah, I had originally considered that, though as far as I know we're using an unlimited CPU license (wonder how much that costs? :)
Eric
+2  A: 

Even accounting for IO bottleneck I would check is whether you have processor affinities set up, what your maxdop setting is, whether it is SMP or NUMA which should also affect what maxdop you may wish to set.

when you say you have a 16 processor cluster, you mean 2 SQL servers in a cluster with 16 processors each, or 2 x 8 way SQL servers?

Andrew
Thank you for the response! I'm going to find out what our MAXDOP setting is. Also, by cluster I mean 2 physical servers each with 16 processors.
Eric
You should not see 4 of the procs hit high cpu whilst the others are sitting idle unless you are running a small number of queries that can not parallel and are very expensive.If you were running a lot more / the query was parallelised, the load would be more balanced.
Andrew
The bulk of our queries are quite fast, though there are a couple slow offenders. So you would expect, for a high number of relatively fast queries, that we would be seeing more utilization from the other processors?
Eric
I would possibly expect it to be more balanced, but it depends on how many were running simultaneously and durations, if you only have 4 running at once with no parallelism it will only hit 4 cpu's at once. really depends on the queries / parallelism / workload and without specifics it is not possible to be definative.
Andrew
I was afraid you'd say you needed specifics :) Unfortunately that's going to require some work to get from the DBAs - but thank you for the advice up to this point!
Eric
+3  A: 

I'll consider you did due diligence and validated that the CPU consumption belongs to the sqlservr.exe process, so we're not chasing a red herring here. If not, please make sure the CPU is consumed by sqlservr.exe by checking the Process\% Processor performance counters.

You need to understand the SQL Server CPU scheduling model, as described in Thread and Task Architecture. SQL Server spreads requests (sys.dm_exec_requests) across schedulers (sys.dm_os_schedulers) by assigning each requests to task (sys.dm_os_tasks) that is run by a worker (sys.dm_os_workers). A worker is backed by an OS thread or fiber (sys.dm_os_threads). Most requests (a batch sent to SQL Server) spawn only one task, some requests though may spawn multiple tasks (parallel queries being the most notorious).

The normal behavior of SQL Server 2005 scheduling should be to distribute the tasks evenly, across all schedulers. Each scheduler corresponds to one CPU core. The result should be an even load on all CPU cores. But I've seen the problem you describe a few times in the labs, when the physical workload would distribute unevenly across only few CPUs. You have to understand that SQL Server does not control the thread affinity of its workers, but instead relies on the OS affinity algorithm for thread locality. What that means is that even if SQL Server spreads the requests across the 16 schedulers, the OS might decide to run the threads on only 4 cores. In correlation with this issue there are two problems that may cause or aggravate this behavior:

  • Hyperthreading. If you enabled hyperthreading, turn it off. SQL Server and hyperthreading should never mix.
  • Bad drivers. Make sure you have the proper system device drivers installed (for things like main board and such).

Also make sure your SQL 2005 is at least at SP2 level, prefferably at latest SP and all CU applied. Same goes for Windows (do you run Windows 2003 or Windows 2008?).

In theory the behavior could also be explained by a very peculiar workload, ie. SQL sees only few very long and CPU demanding requests that have no parallle option. But that would be an extremly skewed load and I never seen something like that in real life.

Remus Rusanu
This is great! Thanks so much for these links! I was looking all over the place for a comprehensive CPU scheduling diagram or something.
Eric
You'll find a lot of internals described in Slava's blog at http://blogs.msdn.com/slavao/
Remus Rusanu
I'm thinking at this point that as you say, perhaps the OS is actually causing the problem, and it's only assigning threads to four of the processors.
Eric
Check sys.dm_os_schedulers. Do all 16 'visible' schedulers (ignore the hidden ones and DAC) get an even amount of workload? Look at `load_factor` primarily, but also at active_worker_count, current_task_count, context_switches_count. If they fairly even, then the problem is likely at OS level.
Remus Rusanu
Oh, and another one, just a shot in the dark: is your server NUMA (Opteron?)? OR did the DBa enable Soft-NUMA? Make sure you don't have the server listen on multiple ports and affinitize clients to NUMA nodes.
Remus Rusanu