I do the following:

void Foobar(string[] arr, Dictionary<string, string[]> dic)
{
   var t = arr.Intersect(dic.Keys).ToList(); // or .ToArray()?
   foreach(var item in t)
   {
      ..
   }

   var j = t.Count; // also I need this
}

Which method is preferred?

I could go without either, but I need to know the size and I don't want to call IEnumerable.Count(), since it seems to do more work than T[].Length or List<T>.Count. Am I right?

+10  A: 

The difference is probably so small that it is worth just using the method that fits your needs better. Smells of micro-optimization.

And in this case, since all you are doing is enumerating the set and counting the set (both of which you can do with an IEnumerable), why not just leave it as an IEnumerable<>?
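One thing to keep in mind when leaving it as an IEnumerable<>: the result of Intersect is evaluated lazily, so looping over it and then calling Count() runs the query twice. The sketch below illustrates this under stated assumptions: `DeferredDemo`, `CountSourcePasses` and the `Tracked` wrapper are hypothetical names introduced only to count how often the source array is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Returns how many times arr was enumerated when the lazy Intersect
    // result is both looped over and counted.
    public static int CountSourcePasses(string[] arr, Dictionary<string, string[]> dic)
    {
        int passes = 0;

        // Hypothetical wrapper: records each time an enumeration of arr starts.
        IEnumerable<string> Tracked()
        {
            passes++;
            foreach (var s in arr) yield return s;
        }

        var t = Tracked().Intersect(dic.Keys); // nothing is evaluated yet

        foreach (var item in t) { /* process item */ } // first pass
        var j = t.Count();                             // second pass

        return passes;
    }

    static void Main()
    {
        var arr = new[] { "a", "b", "c" };
        var dic = new Dictionary<string, string[]>
        {
            ["a"] = new string[0],
            ["c"] = new string[0]
        };
        Console.WriteLine(CountSourcePasses(arr, dic)); // prints 2
    }
}
```

For small inputs the second pass probably does not matter; if it does, materializing once with ToList() or counting inside the loop avoids it.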

Yaakov Ellis
I guess this leads naturally to another question: since performance is for most intents and purposes the same, which type should I use as my "default" collection type? I personally prefer `List<T>` because its length is not fixed, but I've had trouble in the past convincing others that it makes a better "default" collection type choice than `T[]`.
romkyns
+8  A: 

If you are really concerned about performance, you should loop over the IEnumerable and count it as you go. This avoids having to create a new collection altogether, and the intersection only has to be iterated once:

void Foobar(string[] arr, Dictionary<string, string[]> dic)
{
   var t = arr.Intersect(dic.Keys);
   int count = 0;
   foreach(var item in t)
   {
      count++;
      ..
   }

   var j = count;
}

But like someone else said: this smells of micro-optimization. If performance really matters in this situation, at least do performance profiling to find out which method is really the fastest for you.

Greg
But if you are that concerned about performance, then this means that you have to update the counter variable X times for an IEnumerable that has X items - compared to one lookup for Count, which may be more efficient for a big collection.
Yaakov Ellis
@Yaakov: Something will have to count the size of the collection. By counting yourself, you only have to iterate over the collection once. If the collection is converted into an array or list, the collection must be iterated at least twice (once for the conversion, and once for the `foreach` loop).
Greg
According to Reflector, List<>, ArrayList and Array each hold a length or size field within the object, which is what gets read when doing a straight count, so running Count on an Array or List would not cause another enumeration.
Yaakov Ellis
@Yaakov: No, the point is that creating the array or the list from the enumeration would cause one iteration for the creation. Then you would have to iterate again for whatever processing is being done on the created array or list. Instead, just enumerate over the collection, counting and processing as you go.
Jason
+5  A: 

Actually, the current MS implementation of Count(IEnumerable) has a shortcut: it checks whether the IEnumerable is an ICollection and, if so, calls Count on it. So the performance should be comparable for counting elements.

ToList and ToArray behave in a similar way. If the IEnumerable is an ICollection, the CopyTo method is called instead, which is a bit faster.

So, choose what makes your code the most readable, and benchmark for YOUR use case to have a definite answer.
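To see which sequences can benefit from the shortcut described above, a type check is enough. This is a minimal sketch that mirrors only the check the answer describes, not the actual library source; `CountShortcutDemo`, `LazyNumbers` and `CanUseCountShortcut` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CountShortcutDemo
{
    // A lazily evaluated sequence built with yield; a hypothetical stand-in
    // for a query result that is not backed by a collection.
    public static IEnumerable<int> LazyNumbers(int n)
    {
        for (int i = 1; i <= n; i++) yield return i;
    }

    // True when the sequence is an ICollection<T>, i.e. when its size
    // can be read directly instead of being counted item by item.
    public static bool CanUseCountShortcut<T>(IEnumerable<T> source)
    {
        return source is ICollection<T>;
    }

    static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        var array = new[] { 1, 2, 3 };
        var lazy = LazyNumbers(3);

        Console.WriteLine(CanUseCountShortcut(list));  // prints True
        Console.WriteLine(CanUseCountShortcut(array)); // prints True
        Console.WriteLine(CanUseCountShortcut(lazy));  // prints False

        // Count() gives the same answer either way; only the cost differs.
        Console.WriteLine(list.Count() == lazy.Count()); // prints True
    }
}
```

A sequence that fails this check, such as the yield-based one here, has to be walked element by element to be counted.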

Update: I did a naive benchmark.

Starting with an Array: var items = Enumerable.Range(1,1000).ToArray();

  • calling ToList() : 25 ms / 10000
  • calling ToArray() : 23 ms / 10000

Starting with an IEnumerable: var items = Enumerable.Range(1,1000);

  • calling ToList() : 168 ms / 10000
  • calling ToArray() : 171 ms / 10000

So basically you get comparable performance.
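A benchmark along these lines could be sketched as follows. This is not the code used for the numbers above, which was not posted; it assumes that "/ 10000" means 10000 repeated calls, and `NaiveBenchmark` and `Time` are hypothetical names. Timings will differ by machine and runtime, so no expected output is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

static class NaiveBenchmark
{
    // Runs action `iterations` times and returns the elapsed milliseconds.
    public static long Time(int iterations, Action action)
    {
        action(); // warm-up call so JIT compilation is not measured

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) action();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int iterations = 10000; // assumption: matches "/ 10000" above

        int[] array = Enumerable.Range(1, 1000).ToArray();
        IEnumerable<int> lazy = Enumerable.Range(1, 1000);

        Console.WriteLine("array.ToList():  {0} ms", Time(iterations, () => array.ToList()));
        Console.WriteLine("array.ToArray(): {0} ms", Time(iterations, () => array.ToArray()));
        Console.WriteLine("lazy.ToList():   {0} ms", Time(iterations, () => lazy.ToList()));
        Console.WriteLine("lazy.ToArray():  {0} ms", Time(iterations, () => lazy.ToArray()));
    }
}
```

A Stopwatch loop like this is only a rough guide; a dedicated benchmarking tool gives more trustworthy numbers when the difference actually matters.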

Yann Schwartz