Before I start with the real question, let me just say that I might get some of the details here wrong. If so, please correct me on those as well as, or even instead of, answering my question.

My question is about DLLs and .NET, basically. We have an application that is using quite a bit of memory and we're trying to figure out how to measure this correctly, especially when the problem mainly occurs on clients' computers.

One thing that hit me is that we have some rather large .NET assemblies with generated ORM-code.

If I were using an unmanaged (Win32) DLL that had a unique base-address, multiple simultaneous processes on the same machine would load the DLL once into physical memory, and just map it into virtual memory for all the applications. Thus, physical memory would be used once for this DLL.

The question is what happens with a .NET assembly. This DLL contains IL, and though this portion of it might be shared between the applications, what about the JITted code that results from this IL? Is it shared? If not, how do I measure whether this is actually contributing to the problem? (Yes, I know, it will contribute, but I'm not going to spend much time on this until it is the biggest problem.)

Also, I know that we haven't looked at the base addresses of all the .NET assemblies in our solution. Is it necessary to set them for .NET assemblies? And if so, are there guidelines on how to determine these addresses?

Any insight into this area would be most welcome, even if it turns out that this is not a big problem, or not even a problem at all.


Edit: Just found this question: .NET assemblies and DLL rebasing which partially answers my question, but I'd still like to know how JITted code factors into all of this.

It appears from that question and its accepted answer that the JITted code is placed on a heap, which means that each process will load up the shared binary assembly image, and produce a private JITted copy of the code inside its own memory space.

Is there any way for us to measure this? If this turns out to produce a lot of code, we'd have to look at the generated code more to figure out if we need to adjust it.


Edit: Added a shorter list of questions here:

  1. Is there any point in making sure base addresses of .NET assemblies are unique and non-overlapping to avoid rebasing a dll that will mostly be used to just get IL code out of for JITting?
  2. How can I measure how much memory is used for JITted code to figure out if this is really a problem or not?

The answer by @Brian Rasmussen here indicates that JITting will produce per-process copies of JITted code, as I expected, but that rebasing the assemblies will actually have an effect in terms of reducing memory usage. I will have to dig into the WinDbg+SoS tools he mentions, something I've had on my list for a while, but now I suspect I can't put it off any longer :)


Edit: Some links I've found on the subject:

A: 

I think you're getting confused about shared assemblies and dlls and the process memory space.

Both .NET and standard Win32 DLLs share code among the different processes using them. In the case of .NET this is only true for DLLs with the same version signature, so two different versions of the same DLL can be loaded in memory at the same time.

The thing is, it looks like you're expecting the memory allocated by the library calls to be shared as well, and that (almost) never happens. When a function inside your library allocates memory, and I guess that happens a lot for an ORM DLL, that memory is allocated inside the memory space of the calling process, so each process has its own unique instances of the data.

So yes, the DLL code is loaded once and shared among the callers, but the instructions execute (and therefore the allocations take place) separately inside each calling process's address space.

Edit: Ok, Let's see how JIT works with .NET assemblies.

When we talk about JITting the code, the process is relatively simple. Internally there's a structure called the Virtual Method Table, which basically contains the virtual address that will be invoked during a call. In .NET, JIT works by initially filling that table so that every single call redirects to the JIT compiler. That way, the first time we call a method the JIT steps in and compiles the code to actual machine instructions (hence Just In Time). Once that has been done, the JIT goes back to the VMT and replaces the old entry that invoked it with a pointer to the generated low-level code. All subsequent calls are then redirected to the compiled code, so each method is compiled only once per process. For DLLs the process is likely to be the same (although I can't completely assure you it is).
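To make the slot-patching idea above concrete, here is a toy model in Python. It is only a sketch of the mechanism as described in this answer, not the CLR's real data structures: `MethodTable`, `compile_count`, and the use of `eval` as a stand-in for "emitting machine code" are all assumptions made for illustration.

```python
# Toy model of lazy JIT compilation via slot patching.
# Not the CLR's real layout; every name here is hypothetical.

class MethodTable:
    def __init__(self, il_bodies):
        # Count how often each method gets "compiled" in this table.
        self.compile_count = {name: 0 for name in il_bodies}
        # Every slot starts out pointing at a stub that triggers compilation.
        self.slots = {
            name: self._make_stub(name, body)
            for name, body in il_bodies.items()
        }

    def _make_stub(self, name, body):
        def stub(*args):
            compiled = self._jit_compile(name, body)
            self.slots[name] = compiled  # patch the slot: later calls skip the stub
            return compiled(*args)
        return stub

    def _jit_compile(self, name, body):
        self.compile_count[name] += 1
        # Stand-in for generating native code from IL.
        return eval(body)

    def call(self, name, *args):
        return self.slots[name](*args)


il = {"add": "lambda a, b: a + b"}

# Two tables play the role of two processes loading the same assembly.
process_a = MethodTable(il)
process_b = MethodTable(il)

first = process_a.call("add", 1, 2)   # goes through the stub, compiles
second = process_a.call("add", 3, 4)  # goes straight to the compiled code
other = process_b.call("add", 5, 6)   # the second "process" compiles its own copy
```

The shared input (`il`) corresponds to the IL in the mapped DLL image, while each `MethodTable` ends up with its own compiled function, which mirrors the point that JITted code is private to each process.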

Jorge Córdoba
But are they sharing code? I assume they will be sharing the IL instructions, as these will be loaded in as part of the DLL binary image, but that code has to be JITted before it is executed. Is the JITted code shared in any way? Forgive me if you answered this; if so, I'm not understanding your answer.
Lasse V. Karlsen
+2  A: 

This is for question 1)

The jitted code is placed on a special heap. You can inspect this heap using the !eeheap command in WinDbg + SoS. Thus every process will have its own copy of the jitted code. The command will also show you the total size of the code heap.

Let me know if you want additional details on getting this information from WinDbg.

This is for question 2)

According to the book Expert .NET 2.0 IL Assembly the .reloc part of a pure-IL PE file contains only one fixup entry for the CLR startup stub. So the amount of fixups needed for a managed DLL during rebasing is fairly limited.

However, if you list the modules loaded in any given managed process, you'll notice that Microsoft has rebased the bulk (or maybe all) of their managed DLLs. Whether that should be viewed as a reason for rebasing or not is up to you.

Brian Rasmussen
Ok, this looks promising, at least we can get some figures on this. Thanks!
Lasse V. Karlsen
I figure, based on the other answers here too, that I need to rebase, if for nothing else than to avoid private copies of the DLLs instead of shared mapped references. Thanks for the info!
Lasse V. Karlsen
Hope you notice this question Brian, do you know if there is an online site I can purchase a pdf-version of this book? Only found an APRESS site that delivers password-protected files, which sounds cumbersome, rather have one that is digitally signed to my adobe account...
Lasse V. Karlsen
I have a hard copy, but I believe there's a PDF version available as well. I recall seeing a "get the PDF for just ..." on the back. I'll check when I get home (don't have the book here at work).
Brian Rasmussen
Well, I should have guessed. The book is available electronically from Apress, so I guess you're already up to date on that. However, I found a press release saying that Apress books are available on Safari, so that may be another option.
Brian Rasmussen
+2  A: 

I'm not sure how accurate the following information is for newer versions of .NET and/or Windows. MS may have addressed some of the DLL loading/sharing issues since the early days of .NET. But I believe that much of the following still applies.

With .NET assemblies a lot of the benefit of page sharing between processes (and between Terminal Server sessions) disappears because the JIT needs to write the native code on the fly - there's no image file to back up the native code. So each process gets its own, separate memory pages for the jitted code.

This is similar to the issues that are caused by having DLLs improperly based - if the OS needs to perform fixups on a standard Win32 DLL when it's loaded, the memory pages for the fixed up portions cannot be shared.
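A rough way to picture the fixup cost is to count which pages a set of relocations would touch. The sketch below is a simplified model of the loader behaviour described here (a page that gets written during fixups becomes private to the process), not a measurement tool. The offsets, the base addresses, and the function name `private_pages` are all made up for illustration, and the 4 KB page size and 4-byte pointer size are assumptions.

```python
PAGE_SIZE = 4096  # assumed page size (typical for x86/x64 Windows)

def private_pages(fixup_offsets, preferred_base, actual_base, pointer_size=4):
    """Return the page indexes that would be written by fixups.

    Simplified model: if the image loads at its preferred base, nothing is
    patched and every page stays shareable. Otherwise each fixup writes
    pointer_size bytes, and every page it touches becomes private.
    """
    if actual_base == preferred_base:
        return set()
    pages = set()
    for offset in fixup_offsets:
        pages.add(offset // PAGE_SIZE)
        # A fixup near the end of a page can spill into the next one.
        pages.add((offset + pointer_size - 1) // PAGE_SIZE)
    return pages

# Hypothetical native DLL with fixups scattered over its code.
native_fixups = [0x10, 0x20, 0x1FFE, 0x5000]
# Hypothetical pure-IL DLL with a single fixup (the CLR startup stub).
managed_fixups = [0x2000]

native_private = private_pages(native_fixups, 0x10000000, 0x20000000)
managed_private = private_pages(managed_fixups, 0x10000000, 0x20000000)
at_preferred = private_pages(native_fixups, 0x10000000, 0x10000000)
```

In this model the native image loses sharing on four pages when relocated, the pure-IL image on one, and neither loses any when loaded at its preferred base.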

However, even if the jitted code cannot be shared, there is a benefit to rebasing .NET DLLs because the DLL is still loaded for the metadata (and IL) - and that stuff can be shared if no fixups are required.
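If you do decide to rebase, one simple scheme is to lay the assemblies out back to back in a reserved address range. The sketch below assumes a hypothetical starting address and hypothetical assembly names and sizes; the only hard rule it relies on is that a PE image base must be a multiple of 64 KB. The resulting addresses could then be passed to the compiler's base address option.

```python
ALIGNMENT = 0x10000  # PE image bases must be multiples of 64 KB

def round_up(value, alignment=ALIGNMENT):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def assign_bases(image_sizes, start=0x60000000):
    """Assign non-overlapping, 64 KB-aligned base addresses.

    image_sizes maps an assembly name to its image size in bytes.
    start is an arbitrary, hypothetical beginning of the reserved range.
    """
    bases = {}
    next_base = round_up(start)
    for name, size in image_sizes.items():
        bases[name] = next_base
        # The next image begins at the first aligned address past this one.
        next_base = round_up(next_base + size)
    return bases

# Hypothetical assemblies and sizes.
sizes = {"Orm.Generated.dll": 0x23000, "App.Core.dll": 0x8000}
bases = assign_bases(sizes)

for name, base in bases.items():
    print(f"{name}: 0x{base:08X}")
```

Leaving some slack between images (for example by padding each size before rounding) would let the assemblies grow without colliding; the sketch packs them tightly to keep the arithmetic easy to follow.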

It's possible to help share memory pages with a .NET assembly by using ngen, but that brings along its own set of issues.

See this old blog post by Jason Zander for some details:

http://blogs.msdn.com/jasonz/archive/2003/09/24/53574.aspx

Larry Osterman has a decent blog article on DLL page sharing and the effect of fixups:

http://blogs.msdn.com/larryosterman/archive/2004/07/06/174516.aspx

Michael Burr
I will look at those links, ngen was one option we put on the list of things to explore if this turns out to be a problem. Thanks!
Lasse V. Karlsen