views:

421

answers:

4

In my ASP.NET app, I have a dictionary that contains a large number of objects (say 1M, possibly even more later). The objects stored in the dictionary are of a reference type, not a struct type. The dictionary is meant to work as a cache for these objects (I have my reasons not to use the ASP.NET cache for this type of object, though I do use it for caching other things).

Can this cause problems with the GC? I keep hearing that long-lived objects affect the performance of the GC and cause collections to take more time. Is there any way to avoid this? As far as I understand, the dictionary (and the objects) should end up in Gen2, which the GC doesn't collect unless the system is low on memory (please correct me if I'm wrong). So if the system has a lot of memory, will the GC still collect Gen2 with the same frequency? There must be other applications that cache large amounts of data for a long time too; I wonder how they avoid problems with the GC.
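For concreteness, the setup is roughly the following (a minimal sketch; the class and member names are illustrative, not my real code):

```csharp
using System.Collections.Generic;

public class MyObject { /* reference type, well under 85K per instance */ }

// A long-lived, process-wide dictionary used as an in-memory cache.
public static class MyObjectCache
{
    private static readonly object Sync = new object();
    private static readonly Dictionary<string, MyObject> Items =
        new Dictionary<string, MyObject>();

    public static MyObject Get(string key)
    {
        lock (Sync)
        {
            MyObject value;
            Items.TryGetValue(key, out value);
            return value;
        }
    }

    public static void Put(string key, MyObject value)
    {
        lock (Sync) { Items[key] = value; }
    }
}
```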

Any suggestions are really appreciated ...

+1  A: 

Your understanding of the scenario is correct: as long as the system isn't starved for memory, Gen2 collections should be infrequent, so keeping many long-lived objects there shouldn't be a problem.

If you had to code carefully to avoid "problems" with the GC, the GC wouldn't be doing its job very well.

The only area that really needs attention is large allocations (a single allocation over 85KB); having lots of these coming and going may be a concern.

AnthonyWJones
A: 

I can't decide if it's wise, but if you don't want it to be collected, use the following code:

GC.KeepAlive(obj);
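A minimal example of where it matters (a sketch, assuming a timer whose only reference is a local variable):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // If nothing reads `t` after this line, the JIT may treat it as
        // unreachable, and the timer can be finalized before it ever fires.
        Timer t = new Timer(delegate { Console.WriteLine("tick"); },
                            null, 0, 500);
        Console.ReadLine();
        // Keeps `t` reachable (and the timer alive) until this point.
        GC.KeepAlive(t);
    }
}
```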
Stormenet
Dude, that's not what they're asking. Also, why is KeepAlive unwise? It has its place and its uses.
Binary Worrier
Thanks for your answer, but as I see in the documentation for GC.KeepAlive(), it should only be used in cases where you have no references to the object and the object is being used by unmanaged code (there are other scenarios too, but they also don't apply to my case).
Waleed Eissa
Ok ok, dudes, just trying to help.
Stormenet
A: 

You need to be careful that the items don't end up in the Large Object Heap (>85K). If they remain resident, it's not a problem, but if the data is swapped in and out, you can end up with memory fragmentation since the LOH isn't compacted.
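You can see the threshold in action with a quick check (a sketch; note that GC.GetGeneration reports LOH objects as generation 2):

```csharp
using System;

class Program
{
    static void Main()
    {
        byte[] small = new byte[80 * 1024]; // under the ~85,000-byte threshold
        byte[] large = new byte[90 * 1024]; // over it: allocated on the LOH

        Console.WriteLine(GC.GetGeneration(small)); // typically 0 (a fresh gen 0 object)
        Console.WriteLine(GC.GetGeneration(large)); // 2: LOH objects report as gen 2
    }
}
```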

Ryan Emerle
The items in the dictionary are much smaller than 85K, but what about the dictionary object itself? It will be larger than 85K for sure; will it end up in the LOH?
Waleed Eissa
+3  A: 

At circa 1M entries, the dictionary's underlying arrays will inevitably end up in the LOH. Given this, it may be a good idea to preallocate it at a size commensurate with the number of entries it will end up containing (this can backfire badly, and can be a pain in test mode).
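For example, something along these lines (the capacity figure is an assumption; use whatever your measurements suggest):

```csharp
using System.Collections.Generic;

class CachedItem { /* the cached reference type */ }

class CacheSetup
{
    static Dictionary<int, CachedItem> CreateCache()
    {
        // Preallocating at roughly the expected entry count means the
        // dictionary's internal bucket/entry arrays are allocated once
        // (straight onto the LOH at this size) instead of being rebuilt
        // and copied each time the dictionary doubles its capacity.
        return new Dictionary<int, CachedItem>(1000000);
    }
}
```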

The objects in the dictionary should provide a very good hash code implementation. This doesn't affect memory usage directly, just the efficiency of lookup; however, if you trade a higher load factor for less memory use, a bad hash may eventually degrade the dictionary's behaviour to the point where it is no longer O(1).
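A reasonable shape for such a key (a sketch with hypothetical fields):

```csharp
class CacheKey
{
    private readonly int id;
    private readonly string region;

    public CacheKey(int id, string region)
    {
        this.id = id;
        this.region = region;
    }

    public override bool Equals(object obj)
    {
        CacheKey other = obj as CacheKey;
        return other != null && other.id == id && other.region == region;
    }

    // Mix every identifying field so hash codes spread evenly across the
    // dictionary's buckets; a weak hash clusters entries into a few
    // buckets and lookups degrade towards O(n).
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + id;
            hash = hash * 31 + (region == null ? 0 : region.GetHashCode());
            return hash;
        }
    }
}
```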

The long lived items it contains should either:

  1. (Best case) contain no instance fields which are reference types.

    • The GC then has no need to traverse each of these objects in the array (the GC implementation may not perform this optimization, so your mileage may vary).
  2. Ensure your application does not drift into "mid-life crisis" style behaviour

    • Triggering more of the gen2 collections you wish to avoid.
  3. Any references they contain should not point into the Gen0 or Gen1 heaps

    • This would cause a large number of write barriers which would slow down your gen 0 / gen 1 collections.
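To illustrate point 1, an entry holding only value-type fields (the fields are hypothetical, obviously):

```csharp
// The GC never needs to trace into instances like this: every field is a
// value type, so a gen 2 heap full of them is cheap to scan.
class PriceEntry
{
    public int InstrumentId;
    public double Bid;
    public double Ask;
    public long TimestampTicks; // a DateTime stored as ticks (still a value type)
}
```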

Since you state this is a cache but do not mention your retention policy, I think this advice merits a mention too.

Some indication of the sort of objects you are caching may be relevant: 4/8MB of contiguous memory is an awful lot even with modern processor caches, so frequent requests to the cache may well perform worse than a tighter, smaller cache with a better retention policy.

ShuggyCoUk