tags:

views:

152

answers:

4

When using objects that have a capacity, what are some guidelines that you can use to ensure the best effeciency when using to collections? It also seems like .NET framework has set some of these capacities low. For example, I think StringBuilder has an intial capacity of 16. Does this mean that after 16 strings are inserted into the StringBuilder, the StringBuilder object is reallocated and doubled in size?

+2  A: 

With StringBuilder, it isn't the number of strings, but the number of characters. In general; if you can predict the length, go ahead and tell it - but since it uses doubling, there isn't a huge overhead in reallocating occasionally if you need to juts use Add etc.

In most cases, the difference will be trivial and a micro-optimisation. The biggest problem with not telling it the size is that unless the collection has a "trim" method, you might have nearly double the size you really needed (if you are very unlucky).

Marc Gravell
+3  A: 

If you know how large a collection or StringBuilder will be up front, it is good practice to pass that as the capacity to the constructor. That way, only one allocation will take place. If you don't know the precise number, even an approximation can be helpful.

Dustin Campbell
Approximate high and then leave some room.
Henk Holterman
+2  A: 

There are only two circumstances where I ever explicitly set the capacity of a collection

  1. I know the exact number of items that will appear in the collection and I'm using an Array or List<T>.
  2. I am PInvoking into a function which writes to a char[] and i'm using a StringBuilder to interop with parameter. In this case you must set a capacity for the CLR to marshal to native code.

Interestingly, for #1 it is almost always done when I am copying data returned from a COM interface into a BCL collection class. So I guess you could say I only ever do this in interop scenarios :).

JaredPar
It can be especially helpful for dictionaries since growing a dictionary is such a costly operation.
Dustin Campbell
But the problem with setting the initial capacity of dictionaries is that you have to take into account the load factor.
Jim Mischel
A: 

Speaking of StringBuilder, I'd dare to use the worst-case size. StringBuilder requires contigous memory block, which is hard to allocate on a highly fragmented heap.

I'd go with an estimation for other collections, though.

arul
StringBuilder isn't special. Every object requires contiguous memory. I dare say you'd have much bigger problems if your heap was so fragmented you couldn't allocate a StringBuilder, unless it's one HUGE string.
Jim Mischel
'Every object requires contiguous memory. ' Are you seriously implying that it's a common case to have objects like several mb's big? Try to allocate a 8mb big stringbuilder and fit 9mb of data into it, most likely it will fail, while 10mb stringbuilder might allocate just fine in the first place.
arul
"Are you seriously implying that it's a common case to have objects like several mb's big?" Why, yes, I am. People regularly work with collections that contain tens of millions of items. 10M references = 40 MB (80 MB on a 64-bit system).
Jim Mischel
The LinkedList<> does not require continuous memory. Correct?
Xaisoft
No, the LinkedList<> does not require contiguous memory.
Jim Mischel