I understand that an application that creates too many threads isn't what you might call a "good neighbour" to other running processes, since CPU and memory resources are consumed even when those threads are in an efficient sleeping state.

What I'm interested in is this: how much memory (on the Win32 platform) is consumed by a sleeping thread?

Theoretically, I'd assume somewhere in the region of 1 MB (since this is the default stack size), but I'm pretty sure it's less than that, though I'm not sure why.

Any help on this will be appreciated.

(The reason I'm asking is that I'm considering introducing a thread pool, and I'd like to understand how much memory I can save by creating a pool of 5 threads, compared to 20 manually created threads.)

A: 

I think you'd have a hard time detecting any impact from making this kind of change to working code (20 threads down to 5). On top of that, you take on the added complexity (and overhead) of managing the thread pool. It may be worth considering on an embedded system, but on Win32?

And you can set the stack size to whatever you want.
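As a rough illustration of setting the stack size per thread, here is a minimal Python sketch. It uses the standard `threading.stack_size()` call; in native Win32 code the corresponding knob is the `dwStackSize` parameter of `CreateThread`. The helper name `run_with_stack_size` and the 256 KiB value are made up for this example.

```python
import threading

# 256 KiB is an arbitrary illustrative value, not a recommendation.
SMALL_STACK = 256 * 1024

def run_with_stack_size(target, stack_bytes):
    """Run target() on a new thread created with the given stack size."""
    previous = threading.stack_size(stack_bytes)  # returns the old setting
    try:
        result = []
        worker = threading.Thread(target=lambda: result.append(target()))
        worker.start()   # the stack size in effect now is used for this thread
        worker.join()
        return result[0]
    finally:
        threading.stack_size(previous)  # restore for threads created later

print(run_with_stack_size(lambda: sum(range(1000)), SMALL_STACK))
```

A smaller stack only lowers the reservation; a thread that recurses deeply may need more than this.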

le dorfier
A: 

This depends highly on the system.

Usually, though, each process is independent, and the system scheduler makes sure that each process gets equal access to the available processors. A multithreaded application's time is therefore multiplexed between its threads.

Memory allocated to a thread will affect the memory available to its own process but not the memory available to other processes. A good OS will page out unused stack space so it is not in physical memory. However, if your threads allocate enough memory while live, you could cause thrashing as each process's memory is paged to and from a secondary device.

I doubt a sleeping thread has more than a very small impact on the system:

  • It is not using any CPU
  • Any memory it is using can be paged out to a secondary device.
Martin York
+5  A: 

I have a server application that makes heavy use of threads. It uses a configurable thread pool that is set up by the customer, and in at least one site it has 1000+ threads, yet when started up it uses only 50 MB. The reason is that Windows reserves 1 MB for the stack (it maps its address space), but only a small part of that is necessarily allocated in physical memory. If the stack grows beyond that part, a page fault is generated and more physical memory is allocated. I don't know what the initial allocation is, but I would assume it's equal to the page granularity of the system (usually 64 KB). Of course, the thread also uses a little more memory for other things when created (TLS, TSS, etc.), but my guess for the total would be about 200 KB. Bear in mind that any memory that is not frequently used will be paged out by the virtual memory manager.
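To make the reserve-versus-commit point concrete, here is the arithmetic on the figures quoted above as a small Python sketch. The thread count is rounded down to 1000, and the 50 MB figure covers the whole process, so the per-thread number is only an upper bound on the average.

```python
MB = 1024 * 1024
KB = 1024

threads = 1000                 # "1000+ threads" from the answer, rounded down
reserved_per_thread = 1 * MB   # default stack reservation cited above
observed_total = 50 * MB       # whole-process footprint reported at startup

reserved_total = threads * reserved_per_thread
avg_per_thread = observed_total / threads

print(f"address space reserved for stacks: {reserved_total // MB} MB")
print(f"upper bound on average footprint per thread: {avg_per_thread / KB:.1f} KB")
```

If all 1 MB of each stack were really backed by physical memory, the process could not start at 50 MB.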

Fabio Ceconello
The page size for x86 and x64 is 4KB, for ia64 it is typically 8KB but is configurable.
Rob Walker
The allocation granularity (as returned from GetSystemInfo()) is 64 KB on x86 and x64. The VirtualAlloc() documentation seems to say that reservations are restricted by the allocation granularity, but pages in a block of reserved memory can be individually committed.
bk1e
Also, Raymond Chen blogged about thread stack sizes a few years ago: http://blogs.msdn.com/oldnewthing/archive/2005/07/29/444912.aspx
bk1e
+1  A: 

If you're using Vista or Win2k8, just use the native Win32 threadpool API and let it figure out the sizing. I'd also consider partitioning types of workloads (e.g. CPU-intensive vs. disk I/O) into different pools.

MSDN Threadpool API docs

http://msdn.microsoft.com/en-us/library/ms686766(VS.85).aspx

stephbu
A: 

I guess this can be measured quite easily.

  1. Get the amount of resources used by the system before creating a thread
  2. Create a thread with default system values (default stack size and others)
  3. Get the amount of resources used after creating the thread and take the difference from step 1.

Note that some threads need values other than the default ones.

You can try to find an average memory use by repeating step 2 with various numbers of threads.
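The steps above could be sketched roughly like this in Python. The `read_memory` argument is a hypothetical, caller-supplied function that returns the process's current memory use in bytes (on Windows it might wrap something like `GetProcessMemoryInfo`); it is not part of any library. Numbers measured from Python also include interpreter overhead per thread, so treat them as indicative only.

```python
import threading

def bytes_per_sleeping_thread(read_memory, count=100):
    """Estimate memory per idle thread as (after - before) / count.

    read_memory is a caller-supplied function returning the process's
    current memory use in bytes (hypothetical; platform-specific).
    """
    wake = threading.Event()
    workers = [threading.Thread(target=wake.wait) for _ in range(count)]

    before = read_memory()          # step 1: baseline
    for w in workers:               # step 2: create the threads
        w.start()
    after = read_memory()           # step 3: measure again

    wake.set()                      # let the sleeping threads exit
    for w in workers:
        w.join()
    return (after - before) / count
```

Averaging over many threads smooths out allocation noise that would dominate a single-thread measurement.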

The memory allocated by the OS when creating a thread consists of the thread's local data: TCB, TLS, ...

From wikipedia: "Threads do not own resources except for a stack, a copy of the registers including the program counter, and thread-local storage (if any)."

Iulian Şerbănoiu
+2  A: 

Adding to Fabio's comments:

Memory is your second concern, not your first. The purpose of a threadpool is usually to constrain the context switching overhead between threads that want to run concurrently, ideally to the number of CPU cores available.

A context switch is very expensive, often quoted at a few thousand to 10,000+ CPU cycles.
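A minimal sketch of that sizing policy in Python, using the standard `concurrent.futures.ThreadPoolExecutor`; the helper name `run_on_core_sized_pool` is made up for this example. It only illustrates capping workers at the core count; CPython's GIL means it is not a benchmark of CPU-bound throughput.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def run_on_core_sized_pool(jobs):
    """Run zero-argument callables on a pool with one worker per CPU core."""
    workers = os.cpu_count() or 1   # cpu_count() may return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [f.result() for f in futures]   # results in submission order

# 20 small jobs share the pool instead of getting 20 dedicated threads.
results = run_on_core_sized_pool([lambda n=n: n * n for n in range(20)])
print(results[:5])
```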

A little test on WinXP (32-bit) clocks in at about 15k private bytes per thread (999 threads created). This is the initial committed stack size, plus any other data managed by the OS.
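Plugging that figure into the original 20-versus-5 question gives a rough idea of the savings. This is only back-of-the-envelope arithmetic; it assumes "15k" means 15 KB and that the default 1 MB stack reservation applies.

```python
KB = 1024
MB = 1024 * KB

committed_per_thread = 15 * KB   # "about 15k" from the test above; KB assumed
reserved_per_thread = 1 * MB     # default stack reservation from the question

threads_saved = 20 - 5
committed_saved = threads_saved * committed_per_thread
reserved_saved = threads_saved * reserved_per_thread

print(f"committed memory saved: {committed_saved // KB} KB")
print(f"address space saved: {reserved_saved // MB} MB")
```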

peterchen