tags:

views:

1818

answers:

8

The title speaks for itself ....

Does choice of container affects the speed of the default std::sort algorithm somehow or not? For example, if I use list, does the sorting algorithm just switch the node pointers or does it switch the whole data in the nodes?

+2  A: 

It depends on the element type.

If you're just storing pointers (or POD) then vector will be fastest. If you're storing objects then list's sort will be faster as it will swap nodes and not physical elements.

Andrew Grant
I think the question was for std::sort() only.
wilhelmtell
it was, but only because I wasn't aware of list::sort. I like Stack Overflow, thanks you all :)
+1. But to be clear, std::list does store objects directly inside the node structures -- list::sort() just swaps the values of the node pointers *inside* these nodes.
j_random_hacker
Have you got any numbers to back that up? I don't but in my experience list is *significantly* slower than vector for sorting even large objects. Locality of reference plays a hand as does sorting algorithm and the fact that processors blit copy very fast. Depends on a number of factors though!
MattyT
Nope, just what I've observed over the years! There's likely to be a lot of variation based on what's in the vector - e.g. if the objects were allocated at similar times they memory access patterns will be a lot better! At the end of the day anything that matters should be tested and measured.
Andrew Grant
@MattyT: Well it seems the C++ standard doesn't mention the swapping of node pointers only, so you could be right -- but any decent implementation would do so. (E.g. SGI's STL implementation states that iterators will continue to point to their original elements after sort(), implying this.)
j_random_hacker
+10  A: 

I don't think std::sort works on lists as it requires a random access iterator which is not provided by a list<>. Note that list<> provides a sort method but it's completely separate from std::sort.

The choice of container does matter. STL's std::sort relies on iterators to abstract away the way a container stores data. It just uses the iterators you provide to move elements around. The faster those iterators work in terms of accessing and assigning an element, the faster the std::sort would work.

Mehrdad Afshari
Not sure about the position on the standard on the subject, but algorithms like quicksort don't actually need random access iterators. All they need is a pointer/iterator to the first and last item in the subrange to be sorted. Quicksort works just fine on a doubly-linked list.
Andy Ross
+1  A: 

The sort algorithm knows nothing about your container. All it knows about are random-access iterators. Thus you can sort things that aren't even in a STL container. So how fast it is going to be depends on the iterators you give it, and how fast it is to dereference and copy what they point to.

std::sort won't work on std::list, since sort requires random access iterators. You should use one of std::list's member function sorts for that case. Those member functions will efficiently swap around linked list pointers instead of copying elements.

Brian Neal
std::sort doesn't work with std::list (it requires random access). which is why std::list::sort exists.
I realized that after I typed it, it is now corrected. Thanks.
Brian Neal
+3  A: 

std::list is definitely not a good (valid) choice for std::sort(), because std::sort() requires random-access iterators. std::map and friends are also no good because an element's position cannot be enforced; that is, the position of an element in a map cannot be enforced by the user with insertion into a particular position or a swap. Among the standard containers we're down to std::vector and std::deque.

std::sort() is like other standard algorithms in that it only acts by swapping elements' values around (*t = *s). So even if list would magically support O(1) access the links wouldn't be reorganized but rather their values would be swapped.

Because std::sort() doesn't change the container's size it should make no difference in runtime performance whether you use std::vector or std::deque. Primitive arrays should be also fast to sort, probably even faster than the standard containers -- but I don't expect the difference in speed to be significant enough to justify using them.

wilhelmtell
since when are std::map and std::set not ordered?
MartinStettner
map and set are already ordered by definition.
Brian Neal
what i meant is that the order is specified by the container, not by the user.
wilhelmtell
The order is determined not by the container, but by operator< for the type in the map or set.
Brian Neal
.. or by the comp parameter. but you get my point. std::sort is meaningless over std::map's variants.
wilhelmtell
Agreed, you can't even call it on them.
Brian Neal
+1. I think the explanation is clear enough.
j_random_hacker
A: 

It surely does matter, just because different containers have different memory access patterns etc. which could play a role.

However, std::sort doesn't work on std::list<>::iterators as these are not RandomAccessIterators. Moreover, although it would be possible to implement a specialization for std::list<> that would shuffle the nodes' pointers, it would probably have strange and surprising semantic consequences - eg. if you have an iterator inside sorted range in a vector, its value will change after the sorting, which would not be true with this specialization.

jpalecek
+7  A: 

The choice does make a difference, but predicting which container will be the most efficient is very difficult. The best approach is to use the container that is easiest for your application to work with (probably std::vector), see if sorting is adequately fast with that container, and if so stick wth it. If not, do performance profiling on your sorting problem and choose different container based on the profile data.

As an ex-lecturer and ex-trainer, I sometimes feel personally responsible for the common idea that a linked list has mystical performance enhancing properties. Take it from one who knows: the only reason a linked list appear in so many text books and tutorials is because it is covenient for the people who wrote those books and tutorials to have a data structure that can illustrate pointers, dynamic memory mangement, recursion, searching and sorting all in one - it has nothing to do with efficiency.

anon
+1, good common sense argument. Yes, for 90% of cases you'll encounter in the real world, vectors will be more efficient than linked lists -- computers just *like* arrays, that's all. :)
j_random_hacker
A: 

std::sort requires random access iterators, so your only options to use that are vector or deque. It will swap the values, and at a guess vector will probably perform slightly faster than deque because it typically has a simpler underlying data structure. The difference is likely very marginal though.

If you use a std::list, there is a specialisation (std::list::sort) which should swap the pointers rather than the values. However because it's not random access it'll use mergesort instead of quicksort, which will probably mean that the algorithm itself is a little slower.

Anyway, I think the answer is normally vector. If you have large classes for each element so copying overhead dominates the sorting process, list might beat it. Or alternatively you could store pointers to them in a vector and supply a custom predicate to sort them appropriately.

Peter
Quicksort can be implemented efficiently using linked lists (briefly, each list element is moved into one of two new lists, which are later concatenated with the pivot appearing between them). Also, raw arrays can be sorted with std::sort().
j_random_hacker
A: 

Vector.

Always use vector as your default. It has the lowest space overheads and fastest access of any other container (among other advantages like C-compatible layout and random-access iterators).

Now, ask yourself - what else you doing with your container? Do you need strong exception guarantees? List, set and map are likely to be better options (though they all have their own sort routines). Do you need to regularly add elements to the front of your container? Consider deque. Does your container need to always be sorted? Set and map are likely to be a better fit.

Finally, figure out specifically what "best" is for you and then choose the most appropriate container and measure how it performs for your needs.

MattyT