When using objects that have a capacity, what are some guidelines you can follow to get the best efficiency when adding to collections? It also seems that the .NET Framework sets some of these capacities low. For example, I think StringBuilder has an initial capacity of 16. Does this mean that after 16 strings are inserted into the StringBuilder, the StringBuilder object is reallocated and doubled in size?
With StringBuilder, it isn't the number of strings, but the number of characters. In general, if you can predict the length, go ahead and tell it; but since it uses doubling, there isn't a huge overhead in reallocating occasionally if you just need to use Append, Add, etc.
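As a rough illustration of the point that capacity is counted in characters, here is a minimal sketch. The class and method names are made up for the example, and the exact capacity after growth can differ between runtime versions, so the code only prints it rather than assuming a value.

```csharp
using System;
using System.Text;

public static class StringBuilderCapacityDemo
{
    // Hypothetical helper: capacity of a builder created with
    // 'initialCapacity' after appending 'text'.
    public static int CapacityAfterAppend(int initialCapacity, string text)
    {
        var sb = new StringBuilder(initialCapacity);
        sb.Append(text);
        return sb.Capacity;
    }

    public static void Main()
    {
        // The default capacity is measured in characters, not in strings.
        var sb = new StringBuilder();
        Console.WriteLine($"Default capacity: {sb.Capacity}");

        // A single string longer than the capacity already forces growth.
        sb.Append("one string that is longer than sixteen characters");
        Console.WriteLine($"Length={sb.Length}, Capacity={sb.Capacity}");

        // If the length is predictable, pass it to the constructor;
        // appending 200 characters then fits without any growth.
        Console.WriteLine(CapacityAfterAppend(256, new string('x', 200)));
    }
}
```

Many short Append calls behave the same way as one long one: what matters is the running total of characters.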
In most cases, the difference will be trivial and a micro-optimisation. The biggest problem with not telling it the size is that unless the collection has a "trim" method, you might have nearly double the size you really needed (if you are very unlucky).
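For List<T>, the "trim" method is TrimExcess. A small sketch of the unlucky case, where the list has just grown past a boundary (the item count of 1025 and the names are only for illustration, and the capacity before trimming depends on the runtime's growth policy):

```csharp
using System;
using System.Collections.Generic;

public static class TrimExcessDemo
{
    // Hypothetical helper: fills a list one item at a time and reports
    // the capacity before and after TrimExcess.
    public static (int Before, int After, int Count) FillAndTrim(int itemCount)
    {
        var items = new List<int>();
        for (int i = 0; i < itemCount; i++)
        {
            items.Add(i);
        }

        int before = items.Capacity;

        // TrimExcess shrinks the backing array to Count, unless the list
        // is already using most of its capacity.
        items.TrimExcess();

        return (before, items.Capacity, items.Count);
    }

    public static void Main()
    {
        var result = FillAndTrim(1025);
        Console.WriteLine($"Count={result.Count}, before={result.Before}, after={result.After}");
    }
}
```

StringBuilder has no TrimExcess, but its Capacity property is settable, so it can be lowered to the current Length if the spare room really matters.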
If you know how large a collection or StringBuilder will be up front, it is good practice to pass that as the capacity to the constructor. That way, only one allocation will take place. If you don't know the precise number, even an approximation can be helpful.
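To see the effect, one can count how often the capacity changes while filling a list, with and without a capacity passed to the constructor. This is only a sketch: the helper name and the item count of 10000 are arbitrary, and a capacity change is used here as a stand-in for a reallocation of the backing array.

```csharp
using System;
using System.Collections.Generic;

public static class PresizeDemo
{
    // Hypothetical helper: adds 'itemCount' items and counts how many
    // times the list's capacity changed along the way.
    public static int CountCapacityChanges(List<int> list, int itemCount)
    {
        int changes = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < itemCount; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                changes++;
                lastCapacity = list.Capacity;
            }
        }

        return changes;
    }

    public static void Main()
    {
        const int itemCount = 10000;

        int grown = CountCapacityChanges(new List<int>(), itemCount);
        int presized = CountCapacityChanges(new List<int>(itemCount), itemCount);

        Console.WriteLine($"Default constructor: {grown} capacity changes");
        Console.WriteLine($"Presized constructor: {presized} capacity changes");
    }
}
```

The presized list never grows, while the default one grows a handful of times; because of doubling, that handful stays small even for large counts.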
There are only two circumstances where I ever explicitly set the capacity of a collection:
- I know the exact number of items that will appear in the collection and I'm using an Array or List<T>.
- I am PInvoking into a function which writes to a char[] and I'm using a StringBuilder to interop with that parameter. In this case you must set a capacity for the CLR to marshal to native code.
Interestingly, for #1 it is almost always done when I am copying data returned from a COM interface into a BCL collection class. So I guess you could say I only ever do this in interop scenarios :).
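For the PInvoke case, a minimal sketch might look like the following. It assumes Windows and uses the Win32 function GetWindowsDirectory purely as an example of a native call that writes into a caller-supplied character buffer; the class name and the MaxPath constant are choices made for the example.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Text;

public static class WindowsDirectoryDemo
{
    // Assumed buffer size for this example (the classic MAX_PATH value).
    private const int MaxPath = 260;

    // Native function that fills a caller-allocated buffer with characters.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern uint GetWindowsDirectory(StringBuilder lpBuffer, uint uSize);

    public static void Main()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            Console.WriteLine("This example only runs on Windows.");
            return;
        }

        // The capacity determines how much room the native code gets.
        var buffer = new StringBuilder(MaxPath);
        uint length = GetWindowsDirectory(buffer, (uint)buffer.Capacity);

        if (length == 0)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }

        Console.WriteLine(buffer.ToString());
    }
}
```

If the function returns a length larger than the size passed in, the buffer was too small and the call would need to be repeated with a larger capacity.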
Speaking of StringBuilder, I'd dare to use the worst-case size. StringBuilder requires a contiguous memory block, which is hard to allocate on a highly fragmented heap.
I'd go with an estimation for other collections, though.