Hello everyone,

I am using VSTS 2008 + .NET 3.5 + C#. I have heard that the performance of C# List.ToArray is bad, because it copies every element in memory to form a new array. Is that true?

Thanks in advance, George

+1  A: 

It creates a new array and copies the references into it, but that is the only thing that method could and should do.

You mean only the references are copied?
George2
+6  A: 

Reasons to call ToArray()

  • If the returned value is not meant to be modified, returning it as an array makes that fact a bit clearer.
  • If the caller is expected to perform many non-sequential accesses to the data, there can be a performance benefit to an array over a List<>.
  • If you know you will need to pass the returned value to a third-party function that expects an array.
  • Compatibility with calling functions that need to work with .NET version 1 or 1.1. These versions don't have the List<> type (or any generic types, for that matter).

Reasons not to call ToArray()

  • If the caller ever does need to add or remove elements, a List<> is absolutely required.
  • The performance benefits are not necessarily guaranteed, especially if the caller is accessing the data in a sequential fashion. There is also the additional step of converting from List<> to array, which takes processing time.
  • The caller can always convert the list to an array themselves.

taken from here

Sorantis
Good reference, but it is not a direct answer to my question. What is your answer to my question?
George2
It's the only answer we can give: correctness always trumps performance. You do the most performant thing you can that's still correct. Applied here, that means you don't call .ToArray() unless you have to anyway.
Joel Coehoorn
"...there can be a performance benefit to an array over a List<>." - any evidence for this? Sounds like a myth to me.
Joe
Returning an array doesn't indicate that it can't be modified. The BCL is full of methods that return arrays and the recipient is quite free to modify the array.
Daniel Earwicker
.NET framework prior to 2.0 had non-generic collections, as well as arrays.
Daniel Earwicker
I just want to confirm that only references are copied during ToArray. Are no objects re-constructed during ToArray?
George2
A: 

Why not download a copy of Reflector and take a look for yourself what it does?

Matt Howells
Reflector is good, but the code is complex, and I often have to jump between many references to understand a simple piece of code. :-)
George2
+11  A: 

No that's not true. Performance is good since all it does is memory copy all elements (*) to form a new array.

Of course it depends on what you define as "good" or "bad" performance.

(*) references for reference types, values for value types.

EDIT

In response to your comment, using Reflector is a good way to check the implementation (see below). Or just think for a couple of minutes about how you would implement it, and take it on trust that Microsoft's engineers won't come up with a worse solution.

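public T[] ToArray()
{
    // Allocate an array sized to the element count, not the internal capacity.
    T[] destinationArray = new T[this._size];
    // Copy the first _size slots of the backing array in one block.
    Array.Copy(this._items, 0, destinationArray, 0, this._size);
    return destinationArray;
}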

Of course, "good" or "bad" performance only has a meaning relative to some alternative. If in your specific case, there is an alternative technique to achieve your goal that is measurably faster, then you can consider performance to be "bad". If there is no such alternative, then performance is "good" (or "good enough").

EDIT 2

In response to the comment: "No re-construction of objects?" :

No reconstruction for reference types. For value types the values are copied, which could loosely be described as reconstruction.
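
To make the distinction concrete, here is a minimal sketch. The Person and Point types and the ToArrayCopyDemo class are hypothetical names invented for this illustration. It checks that the array holds the very same object instance as the list for a reference type, and that editing a struct in the array leaves the list's copy untouched.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types, used only for this illustration.
class Person { public string Name; }
struct Point { public int X; }

static class ToArrayCopyDemo
{
    // True if ToArray() placed the very same object instance in the array,
    // i.e. only the reference was copied and no new Person was constructed.
    public static bool SharesReferences()
    {
        List<Person> people = new List<Person>();
        people.Add(new Person { Name = "Ann" });
        Person[] copy = people.ToArray();
        return object.ReferenceEquals(people[0], copy[0]);
    }

    // Value types are copied by value, so changing the array element
    // does not change the element stored in the list.
    public static int ListValueAfterArrayEdit()
    {
        List<Point> points = new List<Point>();
        points.Add(new Point { X = 1 });
        Point[] copy = points.ToArray();
        copy[0].X = 99;
        return points[0].X;
    }

    static void Main()
    {
        Console.WriteLine(SharesReferences());        // True
        Console.WriteLine(ListValueAfterArrayEdit()); // 1
    }
}
```

So for reference types a change made through the array element is visible through the list as well, because both point at the same object.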

Joe
Thanks Joe, your answer is so cool! Do you have any related documents that discuss or prove the claim "all it does is memory copy all elements (*) to form a new array"?
George2
Thanks Joe. Does Array.Copy only copy references? No re-construction of objects?
George2
George. Go look it up! Or go use Reflector and find out. It wasn't so complex for ToArray, was it?
John Saunders
Thanks John and Joe! My question is answered.
George2
+1  A: 

Performance has to be understood in relative terms. Converting a List to an array involves copying its elements, and the cost of that will depend on the number of elements. But you have to compare that cost to the other things your program is doing. How did you obtain the information to put into the array in the first place? If it was by reading from the disk, or a network connection, or a database, then an array copy in memory is very unlikely to make a detectable difference to the time taken.
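
If you would rather measure than guess, a rough sketch along these lines can help. The ToArrayTiming class and its MeasureMilliseconds method are hypothetical names; a single Stopwatch run like this is only indicative, and the numbers will vary by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class ToArrayTiming
{
    // Builds a list of n ints, then times a single ToArray() call.
    // Returns the elapsed milliseconds; copiedLength reports the array size.
    public static double MeasureMilliseconds(int n, out int copiedLength)
    {
        List<int> list = new List<int>(n);
        for (int i = 0; i < n; i++) list.Add(i);

        Stopwatch sw = Stopwatch.StartNew();
        int[] copy = list.ToArray();
        sw.Stop();

        copiedLength = copy.Length;
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        int length;
        double ms = MeasureMilliseconds(1000000, out length);
        Console.WriteLine("Copied {0} elements in {1} ms", length, ms);
    }
}
```

Compare the figure you get with the time spent on the disk, network, or database work that produced the data in the first place.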

Daniel Earwicker
What does "put into the array in the first place" mean?
George2
Prior to copying the array, you must have obtained some information to store in the array, or else there would be no reason to make a copy of it.
Daniel Earwicker
+3  A: 

Yes, it's true that it does a memory copy of all elements. Is it a performance problem? That depends on your performance requirements.

A List contains an array internally to hold all the elements. The array grows if the capacity is no longer sufficient for the list. Any time that happens, the list will copy all elements into a new array. That happens all the time, and for most people that is no performance problem.

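E.g. a list created with the default constructor starts with capacity 0, and the first .Add() allocates room for 4 elements. After that the capacity doubles each time it runs out: when you .Add() the 5th element, it creates a new array of size 8, copies the 4 old values and adds the 5th.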

The size difference is also the reason why ToArray() returns a new array instance instead of passing the private reference.
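
A small sketch of that size difference, using a hypothetical CapacityDemo class. The exact capacities are an implementation detail and may differ between runtime versions, so the code only reports them; what should hold is that ToArray() returns exactly Count elements, while Capacity may be larger.

```csharp
using System;
using System.Collections.Generic;

static class CapacityDemo
{
    // Adds n items to an empty list and returns
    // { Capacity, Count, ToArray().Length }.
    public static int[] Sizes(int n)
    {
        List<int> list = new List<int>();
        for (int i = 0; i < n; i++) list.Add(i);
        return new int[] { list.Capacity, list.Count, list.ToArray().Length };
    }

    static void Main()
    {
        int[] s = Sizes(5);
        Console.WriteLine("Capacity={0}, Count={1}, ToArray().Length={2}",
                          s[0], s[1], s[2]);
    }
}
```

If you only need the backing array trimmed to size, List<T> also offers TrimExcess(), but that still does not hand out the private array.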

chris166
Thanks chris166, I just want to confirm that only references are copied during ToArray. Are no objects re-constructed during ToArray?
George2
Yes, only references are copied. The list doesn't know how to create a deep copy of your objects. The exceptions are value types (structs, ints, doubles, enums, etc.), whose values are copied.
chris166
A: 

Short answer: No, List<>.ToArray is never a CPU-performance problem.

In my experience, memory copies are rather cheap. If, for instance, you're doing any LINQ or IEnumerable<> stuff, you'll find the CPU cost of memory copies to be negligible in comparison.

On the other hand, copies do cost memory, which may be scarce.

As a rule of thumb, I entirely disregard the CPU cost of ToArray (even the more expensive LINQ variants). If you're going to be working with enough data to make ToArray's CPU cost relevant, then you're really doing heavy lifting, and the cost of using List in the first place is likely to completely swamp ToArray in that scenario.

Eamon Nerbonne