views:

600

answers:

4

in our sql server box(X64 machine with 8 core with 16G Ram), we found the performance is really bad after we a bunch of data been generated, some time we even cannot RDP to this box, there have several err msg on SQL error log as following:

*2009-06-26 12:11:09.92 spid63      Error: 14151, Severity: 18, State: 1.
2009-06-26 12:11:09.92 spid63      Replication-Replication Distribution Subsystem: agent ##job## failed. The process could not access database 'db_STR' on server 'Test01'.
2009-06-26 12:11:30.66 spid57      Replication-Replication Distribution Subsystem: agent ##job## scheduled for retry. The process could not connect to Subscriber 'Test01'.
2009-06-26 17:42:01.48 spid54      Replication-Replication Distribution Subsystem: agent ##Job## scheduled for retry. TCP Provider: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.
2009-06-26 17:50:05.96 spid69      Replication-Replication Distribution Subsystem: agent ##Job## scheduled for retry. TCP Provider: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.
2009-06-26 17:59:04.64 spid84      Replication-Replication Distribution Subsystem: agent ##Job## scheduled for retry. TCP Provider: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full.
2009-06-26 18:00:02.62 spid83      Error: 14151, Severity: 18, State: 1.
2009-06-26 18:00:02.62 spid83      Replication-Replication Distribution Subsystem: agent ##Job## failed. Not enough storage is available to process this command.  The step failed.
2009-06-26 18:00:02.76 Server      Error: 26040, Severity: 17, State: 1.
2009-06-26 18:00:02.76 Server      Server TCP provider has stopped listening on port [ 1433 ] due to a failure. Error: 0x2747, state: 2. The server will automatically attempt to reestablish listening.
2009-06-26 18:00:03.17 spid94      Error: 14151, Severity: 18, State: 1.*

I check all Harddrives and all have plenty of free spaces, the memory and cpu usage all looks fine, memory went up to 15G and cpu usage is normal at less 40%. I suspected it's page/non-page pool leak, however for x64 machine, the limit for non-page pool is > 6G and it used around 100M only, Is anyone have idea what's problem with the system?

Thanks

A: 

Looks like you're generating too many requests for the database to handle.

sangretu
that's one I'm suspect, however there are only 30-40 user connections and the peak connection reset/sec is 40
Allen
A: 

It might be the IO lag that could be the problem too.

Check the following pages:

Coentje
A: 

Do you have SQL Server configured with a RAM limit / i.e. Max Memory setting? If not, it may be taking more RAM from the OS than is helpful. I would try setting the max memory low enough to allow a few GB of RAM for the OS and other processes. Also, are you Enterprise with Lock Pages in Memory or Standard Edition? If standard, and SQL Server has grabbed more memory than it should, it might be paging, which is a perf killer.

If that's the issue, this is a great article:

http://blogs.msdn.com/slavao/archive/2006/11/13/q-a-does-sql-server-always-respond-to-memory-pressure.aspx

Also here:

http://www.johnsansom.com/index.php/2009/03/sql-server-memory-configuration-determining-memtoleave-settings/

onupdatecascade
A: 

It could be an I/O corruption issue. Please take a look at this MSDN forum post.

chitza