I am using an open source .NET library that uses MSMQ underneath. After about a week or two (a rough guess, not timed exactly), the service slows down. What appears to happen is that messages are read from MSMQ only once every 10 seconds, whereas normally they are read instantly. Reads occur at T+10 sec, T+20 sec, T+30 sec, etc., independent of when the message was sent (i.e. sometimes a message waits 3 seconds to be read, other times 9 seconds).

The only way I currently get it back to normal is by deleting and recreating the queues. So the question is: what can build up in MSMQ queues to cause this kind of slowdown? There are no messages in the queues when the slowdown occurs. Are there any advanced MSMQ analysis tools that give a deeper look at the queues than Computer Management does?

Oh, I forgot to mention, writing the messages to the queue still appears to be instantaneous. It is just reading the messages which shows this behavior.
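
One way to confirm the fixed 10-second read cycle described above is to log send and receive timestamps and check whether the receive times all land on the same phase of a 10-second period. The following is a minimal sketch, not part of the library in question; the function names and the sample timestamps are hypothetical, and it assumes you can record times in seconds on both sides.

```python
import cmath
import math


def grid_alignment(receive_times, period=10.0):
    """Return a score in [0, 1] for how tightly the receive times
    cluster on a single phase of a cycle lasting `period` seconds.

    1.0 means every read happened at the same offset within the cycle
    (consistent with a fixed polling interval); values near 0 mean the
    reads are spread evenly over the cycle (no periodic pattern).
    """
    if not receive_times:
        raise ValueError("need at least one timestamp")
    # Map each timestamp to a point on the unit circle, then average.
    vectors = [cmath.exp(2j * math.pi * (t / period)) for t in receive_times]
    return abs(sum(vectors) / len(vectors))


def latencies(send_times, receive_times):
    """Per-message delay between sending and reading, in seconds."""
    return [r - s for s, r in zip(send_times, receive_times)]


# Hypothetical measurements (seconds since the start of the test).
sent = [3.0, 11.0, 27.0, 34.5]
received = [10.0, 20.0, 30.0, 40.0]

print(round(grid_alignment(received), 3))  # 1.0
print(latencies(sent, received))           # [7.0, 9.0, 3.0, 5.5]
```

A score close to 1 combined with latencies scattered between 0 and 10 seconds would match the symptom: the reader wakes on a timer rather than on message arrival.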

EDIT: I posted a follow-up question here, which is a bit more detailed and more focused.

A: 

Does stopping and starting the service make a difference?

Igal Serban
Unfortunately not. Restarting the computer doesn't fix it either.
mrnye
+1  A: 

It has been a while since I did MSMQ work, so bear with my aging memory. Some questions come to mind:

Is the journal queue active? The journal is tied to its queue, and if the queue is removed, I believe (don't quote me) that the journal then starts from empty...

Is there a Purge process on the queue? Are these transactional queues?

Are all current service packs applied? I seem to recall a fix on a service pack for this in the distant past. It is probably platform dependent but you do not list your platform. I believe there were several service packs of this nature.

What is your disk space like? Are you running into a disk space issue or a fragmented disk issue? The message files (if I remember correctly) are stored under windows\system32\msmq. If there are not enough free blocks of the size needed, it might slow down - usually on store/receipt of the message; I am not sure about reads.

Are these public or private queues?

What does Performance Monitor say about paged pool etc.? I believe paged pool usage is about 70-80 bytes per message.
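
To put that per-message figure in perspective, here is a rough back-of-the-envelope sketch. The 70-80 bytes is my recollection rather than a documented constant, and the function name is just for illustration.

```python
def paged_pool_estimate(message_count, bytes_per_message=(70, 80)):
    """Return a (low, high) estimate in bytes of paged pool consumed by
    `message_count` queued messages, assuming the recalled (unverified)
    figure of 70-80 bytes per message."""
    low, high = bytes_per_message
    return message_count * low, message_count * high


low, high = paged_pool_estimate(100_000)
print(low, high)  # 7000000 8000000
```

So even 100,000 queued messages would account for only about 7-8 MB by this estimate, which is worth comparing against what Performance Monitor actually reports.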

EDIT 1:

From the MSDN document archives: "Private queues are registered on the local computer, not in the directory service, and their properties cannot be obtained by Message Queuing applications running on remote computers. Message Queuing registers private queues locally by storing a description of each queue in a separate file in the local queue storage (LQS) folder on the local computer."

Thus, if the directory service is down, public queues are not available. Private queues on the other hand are stored in the local file system.

This machine should have a LOT of memory to avoid disk swapping of the queue. Frankly, I have never tried to run MSMQ on a Windows XP box because of the performance issue - I have always had it running on a dedicated server box with a LOT of memory - but then, we were working with a large set of queued items, each of large size (many thousands of items, each near the size limit).

EDIT 2: File system and empty files:

"Empty message files are deleted once every six hours, by default. This time interval can be controlled in the registry by setting the MessageCleanupInterval REG_DWORD value under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSMQ\Parameters. Applications should not keep cursors opened unnecessarily for a long time. Cursors might point to messages that were already received (and removed from files). These pointers prevent cleanup and deletion of empty files."

NOTE: You should have a significant amount of free space on the disk this is running on to prevent fragmentation issues. You COULD TRY reducing this 6 hour time frame to see if that helps - so it does not have to keep track of empty files.
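
If you do experiment with a shorter interval, the sketch below works out the DWORD to enter. It assumes the MessageCleanupInterval value is expressed in milliseconds (the quoted text only gives the six-hour default, so verify the unit for your MSMQ version before changing anything), and the helper names are made up for illustration. It only formats the value; it does not touch the registry.

```python
MS_PER_HOUR = 60 * 60 * 1000


def cleanup_interval_dword(hours):
    """Convert an interval in hours to a REG_DWORD value,
    ASSUMING the registry value is measured in milliseconds."""
    ms = int(hours * MS_PER_HOUR)
    if not 0 < ms <= 0xFFFFFFFF:
        raise ValueError("interval does not fit in a positive 32-bit DWORD")
    return ms


def reg_file_line(hours):
    """Format the value the way a .reg file expects (8 hex digits)."""
    return f'"MessageCleanupInterval"=dword:{cleanup_interval_dword(hours):08x}'


print(cleanup_interval_dword(6))  # 21600000 (the six-hour default, in ms)
print(reg_file_line(1))           # "MessageCleanupInterval"=dword:0036ee80
```

The line printed by `reg_file_line` would go under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSMQ\Parameters key mentioned in the quote; back up the registry first.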

Try to make sure there is not any other activity on the machine which might fragment the drive - such as browsing the Internet.

On a public queue, this interval is sometimes extended to avoid the cost of Active Directory interaction. You need to add this registry value on both ends of a Message Queuing session; otherwise, the computer with the smallest value will stop the session prematurely. A common reason to set a large value is to keep sessions alive and avoid the overhead of creating Message Queuing sessions. In your case, you might need to shorten the value to help with file system management. (This is a hard call to make, and frankly it needs careful consideration.)

As for hot fixes: (probably more out there, and not specific to your problem that I can see)

The MSMQ 1.0 hotfix described in http://support.microsoft.com/kb/304212. The reason for this hotfix is that Windows XP Message Queuing 3.0 independent clients are built as robust RPC clients. Without this hotfix, calls from Windows XP Message Queuing 3.0 independent clients to MQLocateNext fail.

The RPC hotfix is described in http://support.microsoft.com/kb/823980. This hotfix is required to enable auditing on the client running Windows XP. This hotfix is also required to enable clients running Windows 2000 to complete Setup.

One thought: what does the Performance monitor say about your memory status issues? Is there a lot of disk swapping going on there?

Mark Schultheiss
mrnye
Sorry that last comment didn't format well... I've also found a tool called TMQ which seems to do some extra analysis. When the slowdown occurs again I will try debugging further.
mrnye
Yep, here is a link to an MS page detailing a lot of material on performance, memory, how it is managed, etc.: http://msdn.microsoft.com/en-us/library/ms811056.aspx More details on message size, message counts, etc. might be useful, as well as an examination of the file system where the stores are maintained.
Mark Schultheiss
Just an FYI, you can spend days on tedious examination and diagnosis with MSMQ observing different patterns and isolating issues. So many factors here to consider.
Mark Schultheiss
Thanks for the thorough answers. RE: EDIT 1, I don't think it's a memory issue as the problem occurs when the queues are empty. RE: EDIT 2, I will check the disk fragmentation. At the moment I am just waiting for it to occur again so I can try to detect anything unusual.
mrnye