views: 381
answers: 12
This question has been puzzling me for a long time now. I come from a heavy and long C++ background, and since I started programming in C# and dealing with garbage collection I always had the feeling that such 'magic' would come at a cost.

I recently started working on a big MMO project written in Java (server side). My main task is to optimize memory consumption and CPU usage. Hundreds of thousands of messages per second are being sent, and a similar number of objects are created. After a lot of profiling we discovered that the VM garbage collector was eating a lot of CPU time (due to constant collections), and we decided to try to minimize object creation, using pools where applicable and reusing everything we can. This has proven to be a really good optimization so far.

So, from what I've learned, having a garbage collector is awesome, but you can't just pretend it doesn't exist, and you still need to pay attention to object creation and what it implies (at least in Java and in a big application like this).

So, is this also true for .NET? If it is, to what extent?

I often write pairs of functions like these:

// Combines two envelopes and the result is stored in a new envelope.
public static Envelope Combine( Envelope a, Envelope b )
{
    var envelope = new Envelope( a.Length, 0, 1, 1 );
    Combine( a, b, envelope );
    return envelope;
}

// Combines two envelopes and the result is 'written' to the specified envelope
public static void Combine( Envelope a, Envelope b, Envelope result )
{
    result.Clear();
    ...
}

The second function is provided in case the caller already has an Envelope that can be reused to store the result, but I find this a little odd.

I also sometimes write structs when I'd rather use classes, just because I know there will be tens of thousands of instances being constantly created and discarded, and this feels really odd to me.
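For example, something like this hypothetical type (the names are made up, just to illustrate the kind of struct I mean; as a value type it's allocated inline on the stack or inside a containing array, so creating thousands of them adds no GC pressure):

using System;

public struct MessageHeader
{
    public int SenderId;
    public int Sequence;
    public long Timestamp;

    public MessageHeader(int senderId, int sequence, long timestamp)
    {
        SenderId = senderId;
        Sequence = sequence;
        Timestamp = timestamp;
    }
}

public static class MessageHeaderDemo
{
    public static void Main()
    {
        // One contiguous allocation for the whole batch; no per-element heap objects.
        var headers = new MessageHeader[10000];
        for (int i = 0; i < headers.Length; i++)
        {
            headers[i] = new MessageHeader(42, i, DateTime.UtcNow.Ticks);
        }
        Console.WriteLine(headers.Length);
    }
}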

I know that as a .NET developer I shouldn't have to worry about this kind of issue, but my experience with Java and common sense tell me that I should.

Any light and thoughts on this matter would be much appreciated. Thanks in advance.

+1  A: 

I have the same thought all the time.

The truth is, although we were taught to watch out for wasted CPU cycles and memory consumption, the cost of little imperfections in our code is usually negligible in practice.

If you are aware of that and keep an eye on it, I believe it's okay to write code that isn't perfect. If you started with .NET/Java and have no prior experience with low-level programming, chances are you will write very wasteful and inefficient code.

And anyway, as they say, premature optimization is the root of all evil. You can spend hours optimizing one little function, only to find that the bottleneck is in some other part of the code. Just keep a balance between keeping it simple and not doing anything stupid.

User
+7  A: 

Yes, it's true of .NET as well. Most of us have the luxury of ignoring the details of memory management, but in your case -- or in cases where high volume is causing memory congestion -- some optimization is called for.

One optimization you might consider for your case -- something I've been thinking about writing an article about, actually -- is the combination of structs and ref for real deterministic disposal.

Since you come from a C++ background, you know that in C++ you can instantiate an object either on the heap (using the new keyword and getting back a pointer) or on the stack (by declaring it like a primitive type, i.e. MyType myType;). You can pass stack-allocated items by reference to functions and methods by declaring the parameter as a reference (placing & before the parameter name in the declaration). Doing this keeps your stack-allocated object alive for as long as the method in which it was allocated remains on the stack; once it goes out of scope, the object is reclaimed, the destructor is called, ba-da-bing, ba-da-boom, Bob's yer uncle, and it's all done without pointers.

I used that trick to create some amazingly performant code in my C++ days -- at the expense of a larger stack and the risk of a stack overflow, naturally, but careful analysis managed to keep that risk very minimal.

My point is that you can do the same trick in C# using structs and ref parameters. The tradeoffs? In addition to the risk of a stack overflow if you're not careful or if you use large objects, you are limited to no inheritance, and you tightly couple your code, making it less testable and less maintainable. Additionally, you still run into these issues whenever you use core library calls.
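Here's a minimal sketch of the idea (the type and method names are made up for illustration): the struct lives on the caller's stack frame and is passed by ref, so the callee mutates it in place and nothing is ever allocated on the heap.

using System;

// Illustrative accumulator type; exists only to show the struct + ref pattern.
public struct Accumulator
{
    public double Sum;
    public int Count;
}

public static class StatsDemo
{
    // Passing by ref means no copy is made and no heap object is created;
    // the caller's stack slot is updated directly.
    public static void Add(ref Accumulator acc, double value)
    {
        acc.Sum += value;
        acc.Count++;
    }

    public static void Main()
    {
        Accumulator acc = new Accumulator();    // lives on Main's stack frame
        for (int i = 0; i < 1000; i++)
        {
            Add(ref acc, i);
        }
        Console.WriteLine(acc.Sum / acc.Count); // storage reclaimed when Main returns
    }
}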

Still, it might be worth a look-see in your case.

Randolpho
+5  A: 

This is one of those issues where it is really hard to pin down a definitive answer in a way that will help you. The .NET GC is very good at tuning itself to the memory needs of your application. Is it good enough that your application can be coded without you needing to worry about memory management? I don't know.

There are definitely some common-sense things you can do to ensure that you don't hammer the GC. Using value types is definitely one way of accomplishing this, but you need to be careful not to introduce other issues with poorly written structs.

For the most part, however, I would say that the GC will do a good job of managing all this for you.

Andrew Hare
+1  A: 

Although the garbage collector is there, bad code remains bad code. Therefore I would say that yes, as a .NET developer you should still care about how many objects you create and, more importantly, about writing optimized code.

I have seen a considerable number of projects rejected in code reviews within our environment for exactly this reason, so I strongly believe it is important.

Diago
+3  A: 

I've seen too many cases where people "optimize" the crap out of their code without much concern for how well it's written or even how well it works. I think the first effort should go toward making the code solve the business problem at hand. The code should be well crafted, easily maintainable, and properly tested.

After all of that, optimization should be considered, if testing indicates it's needed.

Terry Donaghe
+1  A: 

.NET memory management is very good, and it helps that you can programmatically tweak the GC if you need to.
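For example (assuming .NET 3.5 SP1 or later, where GCSettings.LatencyMode is available), a sketch of the kind of knobs you can set from code:

using System;
using System.Runtime;

public static class GcTuningDemo
{
    public static void Main()
    {
        // Ask the GC to avoid blocking gen-2 collections during a
        // latency-sensitive phase, then restore the previous mode.
        GCLatencyMode previous = GCSettings.LatencyMode;
        GCSettings.LatencyMode = GCLatencyMode.LowLatency;
        try
        {
            // ... latency-critical work here ...
        }
        finally
        {
            GCSettings.LatencyMode = previous;
        }

        // Tell the GC about unmanaged memory held alive by a small managed
        // wrapper, so collection pressure reflects the real footprint.
        GC.AddMemoryPressure(10 * 1024 * 1024);
        // ... use the unmanaged buffer ...
        GC.RemoveMemoryPressure(10 * 1024 * 1024);
    }
}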

I like the fact that you can give your classes their own Dispose methods by implementing IDisposable and tailoring the cleanup to your needs. This is great for making sure that connections to networks/files/databases are always cleaned up and never leak. There is also the opposite worry of cleaning up too early.
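A minimal sketch of that pattern, with an illustrative wrapper type (the real class would hold whatever connection your application actually uses):

using System;
using System.Net.Sockets;

public sealed class GameConnection : IDisposable
{
    private readonly TcpClient _client = new TcpClient();
    private bool _disposed;

    public void Dispose()
    {
        if (_disposed) return;
        _client.Close();   // release the socket now, not whenever the GC gets around to it
        _disposed = true;
    }
}

public static class ConnectionDemo
{
    public static void Main()
    {
        // 'using' guarantees Dispose is called even if an exception is thrown.
        using (var connection = new GameConnection())
        {
            // ... send and receive messages ...
        }
    }
}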

AutomatedTester
+1  A: 

You might consider writing a set of object caches. Instead of creating new instances, you could keep a list of available objects somewhere. It would help you avoid the GC.
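Something like this, perhaps (a deliberately simplified, single-threaded sketch with made-up names; a real server would need locking or a concurrent collection and a policy for trimming the free list):

using System.Collections.Generic;

public sealed class ObjectPool<T> where T : new()
{
    private readonly Stack<T> _free = new Stack<T>();

    public T Rent()
    {
        // Reuse a previously returned instance if one is available,
        // otherwise allocate a new one.
        return _free.Count > 0 ? _free.Pop() : new T();
    }

    public void Return(T item)
    {
        _free.Push(item);
    }
}

// Usage (assuming the pooled type has a parameterless constructor):
//   var pool = new ObjectPool<Envelope>();
//   var envelope = pool.Rent();
//   ... fill it, send it ...
//   pool.Return(envelope);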

Jake Pearson
+1  A: 

I agree with all points said above: the garbage collector is great, but it shouldn't be used as a crutch.

I've sat through many wasted hours in code reviews debating the finer points of the CLR. The best definitive answer is to develop a culture of performance in your organization and actively profile your application using a tool. Bottlenecks will appear, and you can address them as needed.

bryanbcook
+1  A: 

It depends on your application. Memory-agnostic programming is fine for people building simple GUIs, and many developers fall into the category of not wanting to be server developers and not needing a comprehensive understanding of these things (as long as they are surrounded by other programmers who do the more interesting work).

How many programmers are out there banging away at SAP screens or focusing on tabular displays of data for business applications that are glorified database editors? They are absolved of the problems of memory management, and for everyone concerned that's great.

However, any server with high throughput, a large amount of data to manage, etc. will require developers who keep an eye on memory management and performance.

Garbage collection doesn't absolve engineers of these responsibilities. It does make some problems go away for some developers some of the time, and true memory leaks now have to be engineered in rather than resulting from an act of forgetfulness or an inopportune exception.

polyglot
+3  A: 

Random advice:

Someone mentioned putting dead objects in a queue to be reused instead of letting the GC at them... but be careful, as this means the GC may have more crap to move around when it consolidates the heap, and may not actually help you. Also, the GC is possibly already using techniques like this. Also, I know of at least one project where the engineers tried pooling and it actually hurt performance. It's tough to get a deep intuition about the GC. I'd recommend having a pooled and unpooled setting so you can always measure the perf differences between the two.

Another technique you might use in C# is dropping down to native C++ for key parts that aren't performing well enough... and then using the Dispose pattern in C# or C++/CLI for managed objects that hold unmanaged resources.

Also, be sure when you use value types that you are not using them in ways that implicitly box them and put them on the heap, which might be easy to do coming from Java.
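For instance (a made-up snippet), non-generic collections and object-typed parameters will box value types behind your back:

using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    public static void Main()
    {
        // Non-generic collections store object, so every int added is boxed:
        var boxed = new ArrayList();
        boxed.Add(42);                     // one heap allocation per element

        // The generic equivalent stores the value type directly, no boxing:
        var unboxed = new List<int>();
        unboxed.Add(42);                   // no per-element heap allocation

        // Formatting through an object parameter also boxes:
        int count = 7;
        Console.WriteLine("count = {0}", count);            // boxes 'count'
        Console.WriteLine("count = " + count.ToString());   // avoids the box
    }
}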

Finally, be sure to find a good memory profiler.

Drew Hoskins
+1 for the interesting suggestion that pooling may actually hurt performance.
JulianR
+1  A: 

I think you answered your own question -- if it becomes a problem, then yes! I don't think this is a .Net vs. Java question, it's really a "should we go to exceptional lengths to avoid doing certain types of work" question. If you need better performance than you have, and after doing some profiling you find that object instantiation or garbage collection is taking tons of time, then that's the time to try some unusual approach (like the pooling you mentioned).

twon33
+1  A: 

I wish I were a "software legend" and could speak of this in my own voice and breath, but since I'm not, I rely upon SL's for such things.

I suggest that the following blog post by Andrew Hunter on the .NET GC would be helpful:

http://www.simple-talk.com/dotnet/.net-framework/understanding-garbage-collection-in-.net/

Cyberherbalist