views:

321

answers:

3

When my task manager (top, ps, taskmgr.exe, or Finder) says that a process is using XXX KB of memory, what exactly is it counting, and how does it get updated?

In terms of memory allocation, does an application written in C++ "appear" different to an operating system from an application that runs as a virtual machine (managed code like .NET or Java)?

And finally, if memory is so transparent - why is garbage collection not a function-of or service-provided-by the operating system?


As it turns out, what I was really interested in asking is WHY the operating system could not do garbage collection and defrag memory space - which I see as a step above "simply" allocating address space to processes.

These answers help a lot! Thanks!

+4  A: 

This is a big topic that I can't hope to adequately answer in a single answer here. I recommend picking up a copy of Windows Internals, it's an invaluable resource. Eric Lippert had a recent blog post that is a good description of how you can view memory allocated by the OS.

Memory that a process is using is basically just address space that is reserved by the operating system that may be backed by physical memory, the page file, or a file. This is the same whether it is a managed application or a native application. When the process exits, the operating system deletes the memory that it had allocated for it - the virtual address space is simply deleted and the page file or physical memory backings are free for other processes to use. This is all the OS really maintains - mappings of address space to some physical resource. The mappings can shift as processes demand more memory or are idle - physical memory contents can be shifted to disk and vice versa by the OS to meet demand.

What a process is using according to those tools can actually mean one of several things - it can be total address space allocated, total memory allocated (page file + physical memory) or memory a process is actually using that is resident in memory. Task Manager has a separate column for each of these possibilities.

The OS can't do garbage collection since it has no insight into what that memory actually contains - it just sees allocated pages of memory, it doesn't see objects which may or may not be referenced.

Whereas the OS handles allocates at the virtual address level, in the process itself there are other memory managers which take these large, page-sized chunks and break them up into something useful for the application to use. Windows will return memory allocated in 64k boundaries, but then the heap manager breaks it up into smaller chunks for use by each individual allocation done by the program via new. In .Net applications, the CLR will hand off new objects off of the garbage collected heap and when that heap reaches its limits, will perform garbage collection.

Michael
Good links! Thanks for the explanation of how the work is divided between OS and memory managers. I never actually took an OS course in college, but I DO remember malloc and finding the "right" size to request for my FFT routines.
Jeff Meatball Yang
+2  A: 

I can't speak to your question about the differences in how the memory appears in C++ vs. virtual machines, etc, but I can say that applications are typically given a certain memory range to use upon initialization. Then, if the application ever requires more, it will request it from the operating system, and the operating system will (generally) grant it. There are many implementations of this - in some operating systems, other applications' memory is moved away so as to give yours a larger contiguous block, and in others, your application gets various chunks of memory. There may even be some virtual memory handling involved. Its all up to an abstracted implementation. In any case, the memory is still treated as contiguous from within your program - the operating system will handle that much at least.

With regard to garbage collection, the operating system knows the bounds of your memory, but not what is inside. Furthermore, even if it did look at the memory used by your application, it would have no idea what blocks are used by out-of-scope variables and such that the GC is looking for.

JoshJordan
+2  A: 

The primary difference is application management. Microsoft distinguishes this as Managed and Unmanaged. When objects are allocated in memory they are stored at a specific address. This is true for both managed and unmanaged applications. In the managed world the "address" is wrapped in an object handle or "reference".

When memory is garbage collected by a VM, it can safely suspend the application, move objects around in memory and update all the "references" to point to their new locations in memory.

In a Win32 style app, pointers are pointers. There's no way for the OS to know if it's an object, an arbitrary block of data, a plain-old 32-bit value, etc. So it can't make any inferences about pointers between objects, so it can't move them around in memory or update pointers between objects.

Because the way references are handled, the OS can't take over the GC process and instead it's left up to the VM to manage the memory used by the application. For that reason, VM applications appear exactly the same to the OS. They simply request blocks of memory for use and the OS gives it to them. When the VM performs GC and compacts it's memory it's able to free memory back to the OS for use by another app.

Paul Alexander
You read my mind! I couldn't figure out how a VM could manage memory for multiple apps, yet the OS couldn't.
Jeff Meatball Yang
Why the down vote? Is there something incorrect about the post?
Paul Alexander
I hate when people down-vote and don't leave a reason. That should be a requirement.
Jeff Meatball Yang
And just for the record - I didn't down-vote - in fact I upvoted you.
Jeff Meatball Yang