Our primary database server is an 8-core box with 8GB of RAM. The CPU is a Xeon E7330 @ 2.4GHz. It runs Windows Server 2003 R2 (x64 edition) and SQL Server 2005.

I wanted to do some testing, so I set up SQL Server 2005 on another brand-new server, an 8-core box with 4GB of RAM. It has a Xeon X5460 @ 3.16GHz and runs Windows Server 2003 R2 Standard. I installed SQL Server 2005 out of the box, restored a backup of the primary database onto it, and ran UPDATE STATISTICS on all the tables.

The process I was testing executes the same stored proc many times. I was astounded to see in Profiler that this proc, which executes with duration=0 or 1 on the primary server, was consistently executing with durations in excess of 130. This essentially makes the secondary server useless for testing, because it's just too slow.

No other apps run on either of these two boxes, just SQL Server. And unlike the primary database server, the test server had only me accessing it.

I can't believe the difference in spec between these two machines explains this colossal difference in performance. Can anybody suggest any settings I may need to change?

Updates in answers to questions:

  1. The second server is 32-bit Windows.
  2. I'm inquiring now about the disk arrays and how comparable they are.
  3. On the primary server, the data and logs are on the same drive (!) and it works fine.
  4. Looking at Task Manager on the test server, the CPU is running at about 10%, with only one core showing any activity.
  5. Task Manager on the test server (4GB RAM) shows "PF Usage 2.01GB" with SQL Server running. On the primary server (8GB RAM) it shows "PF Usage 6.67GB". How would I make SQL Server on the test box use more of the RAM? Maybe that would make a difference.

Another update:

The primary server has a RAID-5 with 15,000 RPM drives. The test box has a RAID-5 with 10,000 RPM drives.

A: 

You either have a different plan being generated or some hardware differences. For hardware, you can check the Avg. Disk sec/Read and Avg. Disk sec/Write counters (you do this in Perfmon) and see if you have some massive differences from caching (e.g. a high-performance RAID controller).

For the plan difference, just compare the execution plans. Also turn on SET STATISTICS IO and see if you are getting physical reads instead of logical reads. Maybe the memory difference is keeping your dataset from fitting in memory on the secondary machine but not the primary.
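
For example, a minimal way to capture the plan as text on each server so the two can be compared side by side (the procedure name and parameter below are placeholders, not your actual proc):

    -- SHOWPLAN_TEXT returns the estimated plan instead of executing the batch,
    -- so the proc itself is not run while this is on.
    SET SHOWPLAN_TEXT ON;
    GO
    EXEC dbo.MyProc @SomeParam = 1;   -- placeholder for the real stored proc
    GO
    SET SHOWPLAN_TEXT OFF;
    GO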

I checked the execution plan for the proc on test, and it didn't leap out as having any table scans or anything like that. As a whole the thing just took too long. I'm wondering if it's memory as you suggest.
rc1
A: 

Was the data in the memory cache yet, or was it all read from disk?

KM
Also, what is the memory usage of the 8GB server? If SQL is using it all, then that could be your answer: it's caching more data.
jvanderh
Task manager on the test server (4GB RAM) shows "PF Usage 2.01GB" with SQL Server running. On the primary server (8GB RAM) it shows "PF Usage 6.67GB". How would I make SQL Server on the test box use more of the RAM? Maybe that would make a difference.
rc1
On a fresh install with a newly started database server, it wouldn't have much of anything in the cache yet. Run it a few times on the test machine and see if it gets faster. Also, if there is less memory for cache, then you will read from disk more, which is slow.
KM
+3  A: 

A 32-bit OS means a 2 GB virtual address space for your processes. A Standard edition OS means no AWE extensions either. So your test machine will be severely RAM-deprived compared with the production one. Your buffer pool will suffer from premature eviction of pages, your execution plans will not have the option to choose hash joins for a lot of queries, and so on and so forth. I doubt this explains the entire difference; I'm sure there must be something more at play. You say only 10% CPU usage during the query; is your MAXDOP setting 1 by any chance on the test server? Have you compared the output of sp_configure on the two machines? (Make sure you enable 'show advanced options' too.)
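
As a rough sketch of that comparison, run this on each server and diff the output:

    -- Enable advanced options so settings like 'max degree of parallelism'
    -- and the memory settings show up, then list everything.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure;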

Can you run the same problem query on the two machines, from an SSMS query window, with SET STATISTICS IO ON and SET STATISTICS TIME ON? Run it 2-3 times on each and write down the results. Does it show the same number of logical reads but a vastly different number of physical reads? That would point to the RAM being insufficient to cache the needed pages. Is the number of logical reads very different? That probably means you get a bad execution plan on the test server.
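
Something along these lines on each server, where dbo.MyProc and its parameter stand in for your actual procedure, lets you compare logical reads, physical reads, and elapsed time between the two boxes:

    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;

    EXEC dbo.MyProc @SomeParam = 1;   -- placeholder for the real proc and parameters

    SET STATISTICS IO OFF;
    SET STATISTICS TIME OFF;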

Is the query write-intensive by any chance? If so, did you pre-grow the test database, or is your execution blocked by log-growth and database-growth events?

There are plenty of places to look to narrow down the issue, like the SQL performance counters and sys.dm_os_wait_stats, and checking the wait_type and wait_resource columns in sys.dm_exec_requests.
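
For instance, a quick look at what the server is waiting on (both DMVs exist in SQL Server 2005) could be something like:

    -- Top waits accumulated since the instance started.
    SELECT TOP 10 wait_type, waiting_tasks_count, wait_time_ms
    FROM sys.dm_os_wait_stats
    ORDER BY wait_time_ms DESC;

    -- What currently executing requests are waiting on right now.
    SELECT session_id, status, command, wait_type, wait_resource
    FROM sys.dm_exec_requests
    WHERE session_id > 50;   -- skip system sessions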

Remus Rusanu
I bet that's it. The table being updated by the proc is 5,000,000 rows long. I think it's memory deprivation as you suggest. Many thanks.
rc1
MAXDOP is zero on both boxes by the way.
rc1
I would flatten the test box and reinstall a 64-bit OS and SQL no matter what result you find. 32-bit is a dead end. You'll be tuning your app for issues that simply don't exist in production.
Remus Rusanu
That would have been my choice if I'd set the box up myself to begin with. Thanks again!
rc1
A: 

Although you may not be able to use AWE on your 32-bit server, you can provide SQL Server with a little more memory by adding the /3GB switch to the boot.ini file. Check out Books Online; it should give you more information.
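
For reference, the switch just gets appended to the OS entry in boot.ini; an entry might look something like the following (the ARC path and description vary per machine, so treat this as an illustration only):

    [operating systems]
    multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Windows Server 2003" /fastdetect /3GB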