It seems that heavy use of Critical Sections on Vista/Windows Server 2008 leads to the OS not fully reclaiming the memory. We found this problem with a Delphi application, and it is clearly caused by the CS API. (see this SO question)

Has anyone else seen it with applications developed with other languages (C++, ...)?

The sample code just initializes 10,000,000 critical sections, then deletes them. This works fine on XP/Win2003, but on Vista/Win2008 the peak memory is not fully released until the application has ended.
The more you use CS, the more memory your application retains for nothing.

A: 

You're seeing something else.

I just built & ran this test code. Every memory usage stat is constant - private bytes, working set, commit, and so on.

#include <windows.h>
#include <tchar.h>

int _tmain(int argc, _TCHAR* argv[])
{
    while (true)
    {
        CRITICAL_SECTION* cs = new CRITICAL_SECTION[1000000];
        for (int i = 0; i < 1000000; i++) InitializeCriticalSection(&cs[i]);
        for (int i = 0; i < 1000000; i++) DeleteCriticalSection(&cs[i]);
        delete [] cs;
    }

    return 0;
}
Michael
Thanks Michael. Did you monitor the memory of your application when it is started and idle, while it runs the CS test, and afterwards, while it is idle but not yet terminated? The memory is always fully recovered after the app has ended; the problem is that it is retained while the app is still alive after using a lot of CS. (hope it's clear)
François
Note the infinite loop (while(true)). I monitored it while the application was actively running creating and deleting critical sections. Memory usage was constant, as expected.
Michael
I would expect it to go up and down when you create and delete your critical sections showing nice saw teeth in Process Explorer (private bytes). BTW I did it with 10,000,000 if it makes a difference.
François
What if you insert a "readln from std input" after your /delete [] cs;/ to pause it between each cycle? Memory footprint should go down to starting value.
François
I'm not sure what you want me to prove . . . allocating and freeing billions of critical sections in this test app maintained a constant memory usage over time. If there were a leak in creating/deleting critical sections in the system, even as small as one byte per critical section, we would have seen resource usage go up dramatically.
Michael
Did you measure the initial memory footprint, before while(true), and just before returning?
+1  A: 

Your test is most probably not representative of the problem. Critical sections are considered "lightweight mutexes" because a real kernel mutex is not created when you initialize the critical section. This means your 10M critical sections are just structs with a few simple members. However, when two threads access a CS at the same time, in order to synchronize them a mutex is indeed created - and that's a different story.

I assume in your real app threads do collide, as opposed to your test app. Now, if you're really treating critical sections as lightweight mutexes and create a lot of them, your app might be allocating a large number of real kernel mutexes, which are way heavier than the light critical section object. And since mutexes are kernel objects, creating an excessive number of them can really hurt the OS.

If this is indeed the case, you should reduce the usage of critical sections where you expect a lot of collisions. This has nothing to do with the Windows version, so my guess might be wrong, but it's still something to consider. Try monitoring the OS handle count, and see how your app is doing.

eran
The main point is that the very same code (just calling the API) works well on XP/2003 but not on Vista/2008.
François
I have no idea how your code works, but at least for a test app that repetitively allocates and frees memory, it is possible that the memory manager is trying to cache free memory rather than return it to the OS, assuming it will be required soon. Vista and XP probably have different memory managers, hence the difference. Does the same happen when allocating some other arbitrary structure of the same size instead of a real CS? Do you actually see a lot of handles being created?
eran
A: 

Microsoft have indeed changed the way InitializeCriticalSection works on Vista, Windows Server 2008, and probably also Windows 7.
They added a "feature" that retains some memory used for debug information when you allocate a bunch of CS. The more you allocate, the more memory is retained. It might be asymptotic and eventually flatten out (I am not fully convinced of that).
To avoid this "feature", you have to use the new API InitializeCriticalSectionEx and pass the flag CRITICAL_SECTION_NO_DEBUG_INFO.
The advantage of this is that it might be faster as, very often, only the spin count will be used without having to actually wait.
The disadvantages are that your old applications can be incompatible, you need to change your code, and it is now platform dependent (you have to check the version to determine which API to use). You also lose the debug information if you ever need it.

Test kit to freeze a Windows Server 2008:
- build this C++ example as CSTest.exe

#include "stdafx.h"
#include <windows.h>
#include <tchar.h>
#include <iostream>

using namespace std; 

void TestCriticalSections() 
{ 
  const unsigned int CS_MAX = 5000000; 
  CRITICAL_SECTION* csArray = new CRITICAL_SECTION[CS_MAX];  

  for (unsigned int i = 0; i < CS_MAX; ++i)  
    InitializeCriticalSection(&csArray[i]);  

  for (unsigned int i = 0; i < CS_MAX; ++i)  
    EnterCriticalSection(&csArray[i]);  

  for (unsigned int i = 0; i < CS_MAX; ++i)  
    LeaveCriticalSection(&csArray[i]);  

  for (unsigned int i = 0; i < CS_MAX; ++i)  
    DeleteCriticalSection(&csArray[i]); 

  delete [] csArray; 
} 

int _tmain(int argc, _TCHAR* argv[]) 
{ 
  TestCriticalSections(); 

  cout << "just hanging around..."; 
  cin.get(); 

  return 0; 
}

- run this batch file (needs the sleep.exe from server SDK)

@rem you may adapt the sleep delay depending on speed and # of CPUs 
@rem sleep 2 on a dual-core with 4GB, sleep 1 on a quad-core with 8GB. 

@for /L %%i in (1,1,300) do @echo %%i & @start /min CSTest.exe & @sleep 1 
@echo still alive? 
@pause 
@taskkill /im cstest.* /f

- ...and see a quad-core Win2008 server with 8GB freeze before all 300 instances have been launched.
- ...repeat on a Windows 2003 server and see it handle the load like a charm.

François
