tags:

views:

173

answers:

6

I have a List<MyStruct> that I'm initializing to be empty, and I will be populating this struct in a loop as I parse the data. I know that there is a maximum possible number of entries that will be inserted into this list. For now lets say 1000. However after my parsing of the 1000 entries I may end up only putting 2 into the List. So should I initialize the List with a capacity of 1000 or don't specify a capacity and just add the few entries. It could however end up adding all 1000. What's the best way performance wise?

+12  A: 

Doesn't really matter. Don't micro-optimize. Only set the capacity if you have a good idea it's roughly the amount you need. Under the hood, the list doubles every time it grows, so the number of growths is O(log(n)). It should be pretty efficient.

Mehrdad Afshari
A: 

You don't need to ask us this question. This is easy enough to test for yourself.

Randy Minder
Asking people who may have experience is not an unacceptable method of discovery :P
Joel Etherton
That's a comment, not an answer.
Jerry Coffin
A: 

Probably the best thing to do is compromise. Initialize the list to something like 256.

ChaosPandion
+7  A: 

If it can truly vary that widely, then you would want to not set the capacity. For most collections, the capacity doubles as it is met (with a default capacity of 16 I believe), so your capacity will very closely approach your maximum as you fill it up.

Nick
A: 

Considering that you're list is small to begin with, you're better off not initializing it. It'll make the code easier to read without any noticeable performance hit.

Dustin E
I would have to say it is debatable whether it makes the code more readable. There really isn't much difference. `new List<MyStruct>()` vs `new List<MyStruct>(64)`
ChaosPandion
@ChaosPandion: But that is less maintainable.
Jason
A: 

First, you should just implement it in the most natural, maintainable and readable way. In this case, that is to just create a new List<T> (accepting the default capacity) and add your objects to it. Then what you do if your application is not meeting your performance specifications is you profile it. If it turns out through profiling that this is a bottleneck in your application then you try to optimize it. If your application is meeting your performance specifications or if this specific part is not a bottleneck, you ignore it.

Second, sometimes implementation details are important and here's a case where it is. The way that List<T> is implemented is a dynamically growable array the starts with a certain capacity and doubles the size each time that regrowth is needed. What this means is that if you are adding n object into a newly created list there will be O(log n) regrowths and you'll waste at most O(n) space. Unless memory is tight on your system (maybe you're running .NET CF on a mobile phone) this isn't that big of a deal. And from a performance perspective, the parsing of your entries is likely to consume significantly more time than the regrowth. Thus, this is also not likely to be a factor.

Jason