In CLR 2.0, is there any way to view all of the strings that have been interned? I've looked into the CLR Profiler APIs and can't see any API calls to monitor when a string gets interned. Also, what is the scope of interned strings? Do interned strings get collected when the App Domain gets unloaded, or do they span App Domains?
A:
Strings do get interned by default in .NET 2.0; however, which strings get interned, and when, can be fairly complex. The following article might shed some light on the concept:
http://community.bartdesmet.net/blogs/bart/archive/2006/09/27/4472.aspx
Also, regarding your API calls: make sure you are testing with an optimized build. A Debug build may not enable string interning by default, which might be why you don't see it happening.
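To make the "which strings get interned" point concrete, here is a minimal sketch using `String.Intern` and `String.IsInterned`, both of which are part of the framework. The class and method names (`InternDemo`, `BuildAtRuntime`, `IsInPool`) are made up for illustration, and the exact behavior for literals can depend on how the assembly was compiled.

```csharp
using System;
using System.Text;

static class InternDemo
{
    // Builds a string at run time so that it is a fresh object,
    // not the compiler-emitted literal.
    public static string BuildAtRuntime(string left, string right)
    {
        return new StringBuilder().Append(left).Append(right).ToString();
    }

    // True if a string with this value is already in the intern pool.
    // Note: String.IsInterned compares by value, not by reference.
    public static bool IsInPool(string value)
    {
        return String.IsInterned(value) != null;
    }

    // True if both references point at the same object.
    public static bool SameInstance(string a, string b)
    {
        return Object.ReferenceEquals(a, b);
    }
}
```

For example, `InternDemo.SameInstance("hello world", InternDemo.BuildAtRuntime("hello ", "world"))` returns false because the built string is a separate object, whereas passing both values through `String.Intern` first yields the same pooled instance.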
jrista
2010-02-24 19:53:03
Thanks for the clarification. Somehow it seems I always manage to fsck up on this one. :-|
andras
2010-02-24 20:22:04
Thanks for the link; I was actually able to find that one on Google. I'm working on a hand-rolled ETL process that does a lot of string manipulation, and we're seeing a lot of memory consumption. I was speculating that an API I was using was doing string interning behind the scenes, and I was looking for something like a perfmon counter or a VS debug window where I could see the strings in the intern "pool" or monitor the growth rate of interned strings.
tferreira
2010-02-24 22:03:53
I don't believe there is anything that can monitor the interned strings. You might be able to write something that does that, but it would likely be a waste of time. There are a few key cases where strings get interned. Most commonly, it is constant string data: either strings that are compiled into your assembly, or strings that are for all intents and purposes static at runtime. Generally speaking, I wouldn't mess with .NET's automatic intern management.
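If you do want a rough check rather than a full monitor, one option is to probe values you already hold with `String.IsInterned`. This is only a sketch: `InternProbe` and `CountInterned` are hypothetical names, and the approach can only test strings you pass in; it cannot enumerate the pool.

```csharp
using System;
using System.Collections.Generic;

static class InternProbe
{
    // Counts how many strings in the sample have a value that is
    // currently present in the intern pool. String.IsInterned looks
    // the value up without adding it, so the probe does not grow the pool.
    public static int CountInterned(IEnumerable<string> sample)
    {
        int count = 0;
        foreach (string s in sample)
        {
            if (s != null && String.IsInterned(s) != null)
            {
                count++;
            }
        }
        return count;
    }
}
```

Running this over a sample of the values flowing through the ETL process before and after a batch would give a hint as to whether some library is interning them.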
jrista
2010-02-24 23:22:49
If you are seeing a lot of memory usage while doing string manipulation, I would be more worried about how you are manipulating your strings than about whether they are interned. Interning is a memory optimization: equal strings share one instance, so interned strings should use less memory, not more. However, strings that are built or modified at run time are very unlikely to be interned. A StringBuilder is usually the best choice, and creating your builders with a large enough initial capacity can help with memory usage by reducing (or eliminating) resizes of the internal buffer, each of which requires allocating a new buffer and copying the contents.
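As a rough illustration of the pre-sizing advice, the sketch below joins the fields of a record with a single `StringBuilder` whose capacity is computed up front. `RowFormatter` and `JoinFields` are invented names, and the capacity estimate is just one reasonable choice for this kind of ETL-style row building.

```csharp
using System;
using System.Text;

static class RowFormatter
{
    // Joins fields into one delimited line using a single pre-sized builder.
    public static string JoinFields(string[] fields, char delimiter)
    {
        // Estimate the final length so the internal buffer should not
        // need to grow: one slot per possible delimiter plus every field.
        int capacity = fields.Length;
        foreach (string f in fields)
        {
            capacity += (f == null ? 0 : f.Length);
        }

        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < fields.Length; i++)
        {
            if (i > 0)
            {
                sb.Append(delimiter);
            }
            sb.Append(fields[i]); // appending a null string adds nothing
        }
        return sb.ToString();
    }
}
```

Compared with `line += field` inside a loop, which allocates a new string on every iteration, this allocates one buffer and one result string per row.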
jrista
2010-02-24 23:24:38
All good suggestions, and we are currently looking into them. Our concern was that a lower-level API was calling String.Intern without our knowledge, and that the intern table kept growing as we processed more files, because the hash table used to store the interned strings is never collected (i.e. it is scoped to the CLR). We've put a profiler on the app to work through the bottlenecks. Hopefully we can find the source of the memory growth. Thanks for the feedback.
tferreira
2010-02-26 17:48:41
No problem. As for the intern table growing: that is very doubtful. String interning is a tightly controlled and optimized mechanism in .NET, with rules that govern which strings end up in the pool. There are many other basic and very common string operations that could be causing your memory growth to run away. Repeated concatenation of an arbitrary number of parts can easily consume vast amounts of memory. Improper use of a StringBuilder can also cause considerable growth, because its internal buffer is duplicated during expansion.
jrista
2010-02-26 21:19:48