ansaurus

Question

What makes this bucket sort function slow?

Answer 1

+1 A:

Linked lists are not arrays. They are substantially slower for operations like lookup. The STL sort may well have a specific version for lists that takes this into account and optimizes for it, but your function blindly ignores what container it's using. You should try using an STL vector as your array.

DeadMG 2010-10-17 15:08:54

The STL list is used only as a bucket, and only push_back, front and pop_front are used, which should take constant time. The only sort is insertionsort(A), where A is actually an array of doubles.

luoq 2010-10-17 15:20:30

@luoq: You are calling `size()` on the lists, which is O(N) rather than O(1).

Oli Charlesworth 2010-10-17 15:49:37

@Oli Charlesworth: It should be O(1) in a good implementation. Also, he has as many buckets as there are elements in the array, so each list should have length ~1 if the distribution is uniform.

Loïc Février 2010-10-17 15:57:37

@Oli Charlesworth: The complexity of size is "Constant (recommended). Linear in some implementations." So this may make the function slow. But even if it's O(n), the total extra time should be O(size of A). Anyway, I'll try not to use size.

luoq 2010-10-17 16:00:21

@luoq : you can use a `while(buckets[i].empty())` ;)

Loïc Février 2010-10-17 16:02:15

@Loïc Février: I'm working on it.

luoq 2010-10-17 16:09:39

@luoq: See my answer. It's of course `while(!buckets[i].empty())`...

Loïc Février 2010-10-17 16:12:18

Using while(!buckets[i].empty()) instead of size(), the time is roughly the same. It seems a little faster with large n.

luoq 2010-10-17 16:17:45

@Loic: O(1) `size()` conflicts with O(1) `splice()`, so maybe, maybe not. The GNU implementation is an example with O(N) time (it calls `std::distance()`).

Oli Charlesworth 2010-10-17 16:22:22

@Oli Charlesworth: I don't see any conflict. Splice works with iterators and should not need size(). Where is the conflict? Note that the third version of splice is not O(1). See http://www.cplusplus.com/reference/stl/list/splice/

Loïc Février 2010-10-17 16:32:05

@Loic: Precisely. Some implementers choose O(1) `size`, some choose O(1) `splice(iterator, x, first, last)`. MSVC went for the former, GNU for the latter.

Oli Charlesworth 2010-10-17 16:36:47

Alright. There was no indication that splice could be O(1) in some implementations.

Loïc Février 2010-10-17 16:43:02

@Loïc Février: check out the SGI docs, they document function complexity. If you look at the "new members" section, all versions of `splice`, including the "range" version are documented as "This function is constant time.". To be constant time, one must not iterate over that range! See http://www.sgi.com/tech/stl/List.html

André Caron 2010-10-17 16:57:38

@André Caron : Thanks !

Loïc Février 2010-10-17 17:11:55

Answer 2

+1 A:

With

iarray<List> buckets(numBuckets);

you are basically creating a LOT of lists, and that can cost you a lot, especially in memory access, which is theoretically linear but not in practice.

Try to reduce the number of buckets.

To verify my assertion, measure your code's speed with only the creation of the lists.

Also to iterate over the elements of the lists you should not use .size() but rather

//get back from buckets
for(size_t i=0,head=0;i!=numBuckets;i++)
  while(!buckets[i].empty())
  {
    A[head++] = buckets[i].front();
    buckets[i].pop_front();
  }

In some implementations .size() can be in O(n). Unlikely but...

After some research I found this page showing the code for std::_List_node_base::hook.

Seems it is only to insert an element at a given place in a list. Shouldn't cost a lot..

Loïc Février 2010-10-17 16:08:26

size() seems constant in my environment (GCC). I'll try your first idea.

luoq 2010-10-17 16:19:45

Even if constant, the "correct" way to take all elements from a list is with empty/front/pop.

Loïc Février 2010-10-17 16:33:12

That cannot explain that much time. I ran the function with just the bucket initialization (keeping only the first two lines), for n from 1000 to 4096000, and the runtime is 2%-5% of the original.

luoq 2010-10-17 16:35:47

OK. Try that plus "put in buckets", but not "get back", to see which lines the problem is in.

Loïc Février 2010-10-17 16:37:07

I will try using an array to store the size of each bucket and copying the array directly into a new one according to that size information. That way the linked list is avoided. I will check the run time then.

luoq 2010-10-17 16:39:52
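The list-free variant described in this comment could look roughly like the sketch below. It assumes, as stated elsewhere in the thread, that the values lie in [0,1). The names `bucketOf` and `bucketSortCounting`, and the final insertion-sort pass, are illustrative choices and not the asker's actual code.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: bucket index for x in [0, 1), clamped for safety.
static std::size_t bucketOf(double x, std::size_t n) {
    return std::min(static_cast<std::size_t>(n * x), n - 1);
}

// Hypothetical sketch: bucket sort with bucket sizes kept in an array.
// Assumes every value in A lies in [0, 1).
void bucketSortCounting(std::vector<double>& A) {
    const std::size_t n = A.size();
    if (n < 2) return;

    // Pass 1: count the values per bucket (start[b + 1] holds bucket b's count).
    std::vector<std::size_t> start(n + 1, 0);
    for (double x : A) ++start[bucketOf(x, n) + 1];

    // Prefix sums turn the counts into each bucket's starting offset.
    for (std::size_t b = 0; b < n; ++b) start[b + 1] += start[b];

    // Pass 2: copy every value directly into its bucket's region.
    std::vector<double> out(n);
    std::vector<std::size_t> next(start.begin(), start.end() - 1);
    for (double x : A) out[next[bucketOf(x, n)]++] = x;

    // Pass 3: insertion sort over the now nearly sorted array.
    for (std::size_t i = 1; i < n; ++i) {
        double key = out[i];
        std::size_t j = i;
        while (j > 0 && out[j - 1] > key) { out[j] = out[j - 1]; --j; }
        out[j] = key;
    }
    A.swap(out);
}
```

This version performs a fixed number of heap allocations regardless of n, which is the property the comment is after.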

@Loïc Février: Direct and good idea.

luoq 2010-10-17 16:41:08

@luoq: since we do not know what std_List_node_base_M_hook is, a simple thing to do would be to see where it is used.

Loïc Février 2010-10-17 16:44:15

That's it. After adding "put in buckets", the time is in fact longer than running the complete function, maybe due to some random factor.

luoq 2010-10-17 16:54:40

@luoq: OK, but that does not explain why. No caching in memory access, maybe? Have you tried push_front? Because pop() does not seem to take much time, and it should cost you as much as push_back.

Loïc Février 2010-10-17 17:09:51

It's std_List_node_base::_M_hook instead of std_List_node_base_M_hook, which is defined (or declared) at line 87 of /usr/include/c++/4.5.1/bits/stl_list.h with GCC 4.5.1. But I cannot understand that file.

luoq 2010-10-17 17:10:48

See the update to my answer. You have the code of the function. Basically you have a doubly linked list, and that function updates the next/prev pointers of the elements when a new one is inserted.

Loïc Février 2010-10-17 17:13:25
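As a rough illustration of what such a hook operation does, here is a simplified sketch of linking a node into a circular doubly linked list. The `Node` type and the `hookBefore` function are hypothetical names made up for this example; this is not the libstdc++ source.

```cpp
// Hypothetical, simplified node of a circular doubly linked list.
struct Node {
    Node* next;
    Node* prev;
};

// Link `node` into the list immediately before `pos`.
// Only four pointers change, so the cost is constant.
void hookBefore(Node* node, Node* pos) {
    node->next = pos;
    node->prev = pos->prev;
    pos->prev->next = node;
    pos->prev = node;
}
```

If the real function is similar in spirit, the pointer updates themselves are cheap, and time attributed to it in a profile may have more to do with touching freshly allocated, scattered nodes than with the arithmetic.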

push_front doesn't make much difference. The difference between push and pop may be that push requires memory allocation, which somehow does not take linear time. I will try using vector as the bucket and reserving some space for each.

luoq 2010-10-18 00:14:19

Using vector as buckets makes the code faster; it seems O(n) now.

luoq 2010-10-18 00:40:58
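A minimal sketch of the vector-as-bucket variant, assuming values in [0,1) and one bucket per element as in the question. The function name `bucketSortVector`, the `reserve(2)` capacity, and the per-bucket `std::sort` are assumptions for illustration; the asker's code uses its own `iarray` type and a final insertion sort instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: bucket sort with std::vector buckets.
// Assumes every value in A lies in [0, 1).
void bucketSortVector(std::vector<double>& A) {
    const std::size_t n = A.size();
    if (n < 2) return;

    // One bucket per element, each with a small capacity reserved up front.
    std::vector<std::vector<double>> buckets(n);
    for (auto& b : buckets) b.reserve(2);

    // Put in buckets; the index is clamped to stay in range.
    for (double x : A) {
        std::size_t i = std::min(static_cast<std::size_t>(n * x), n - 1);
        buckets[i].push_back(x);
    }

    // Get back from buckets, sorting each (usually tiny) bucket first.
    std::size_t head = 0;
    for (auto& b : buckets) {
        std::sort(b.begin(), b.end());
        for (double x : b) A[head++] = x;
    }
}
```

Each bucket's elements sit contiguously in memory, and a bucket only reallocates when it grows past its reserved capacity, which may be why this variant behaves better than one node allocation per element.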

@luoq: that's normal, `list` is the slowest container in the STL. It's only useful for its iterator invalidation property (and from time to time the `splice` operation). Don't use it if possible.

Matthieu M. 2010-10-18 07:23:28

Answer 3

+1 A:

I think perhaps the interesting question is, Why are you creating an inordinately large number of buckets?

Consider the input {1,2,3}, numBuckets = 3. The loop containing buckets[int(numBuckets*A[i])].push_back(A[i]); is going to unroll to

buckets[3].push_back(1);  
buckets[6].push_back(2);  
buckets[9].push_back(3);

Really? Nine buckets for three values...

Consider if you passed a permutation of the range 1..100. You'd create 10,000 buckets and only use 1% of them. ... and each of those unused buckets requires creating a List in it. ... and has to be iterated over and then discarded in the readout loop.

Even more exciting, sort the list 1..70000 and watch your heap manager explode trying to create 4.9 billion Lists.

Eric Towers 2010-10-18 02:10:05

The contents of the array are generated randomly in [0,1). So just numBuckets buckets are created and all elements can be put in.

luoq 2010-10-18 04:47:23
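The two positions can be reconciled with a small sketch of the index computation from the question. The wrapper name `bucketIndex` is hypothetical; only the expression `int(numBuckets*A[i])` comes from the thread.

```cpp
#include <cstddef>

// Hypothetical wrapper around the index expression used in the question.
// For x in [0, 1) the result is below numBuckets; for larger x it is not.
std::size_t bucketIndex(double x, std::size_t numBuckets) {
    return static_cast<std::size_t>(numBuckets * x);
}
```

With numBuckets = 3, inputs such as 0.0 and 0.5 map to buckets 0 and 1, while an input of 2.0 maps to index 6, which is out of range. So the function depends on the [0,1) precondition rather than enforcing it.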

Answer 4

+1 A:

In my opinion, the biggest bottleneck here is memory management functions (such as new and delete).

Quicksort (of which STL probably uses an optimized version) can sort an array in-place, meaning it requires absolutely no heap allocations. That is why it performs so well in practice.

Bucket sort relies on additional working space, which is assumed to be readily available in theory (i.e. memory allocation is assumed to take no time at all). In practice, memory allocation can take anywhere from (large) constant time to linear time in the size of memory requested (Windows, for example, will take time to zero the contents of pages when they are allocated). This means standard linked list implementations are going to suffer, and dominate the running time of your sort.

Try using a custom list implementation that pre-allocates memory for a large number of items, and you should see your sort running much faster.

casablanca 2010-10-18 03:00:44
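One way to read the pre-allocation suggestion is to keep the linked-list structure but store all nodes in arrays allocated once. The sketch below is an assumption about what such a custom list could look like, not code from the thread; `bucketSortPooled` is a made-up name, and values are assumed to lie in [0,1).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch: buckets as singly linked lists whose links live in
// preallocated arrays, so there is no heap allocation per element.
// Assumes every value in A lies in [0, 1).
void bucketSortPooled(std::vector<double>& A) {
    const std::size_t n = A.size();
    if (n < 2) return;
    const std::size_t none = n;  // sentinel meaning "no node"

    std::vector<std::size_t> head(n, none);  // first element of each bucket
    std::vector<std::size_t> next(n, none);  // next[i]: element after i

    // Push element i onto the front of its bucket's list.
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t b = std::min(static_cast<std::size_t>(n * A[i]), n - 1);
        next[i] = head[b];
        head[b] = i;
    }

    // Read the buckets back in order.
    std::vector<double> out;
    out.reserve(n);
    for (std::size_t b = 0; b < n; ++b)
        for (std::size_t i = head[b]; i != none; i = next[i])
            out.push_back(A[i]);

    // Finish with insertion sort over the nearly sorted array.
    for (std::size_t i = 1; i < n; ++i) {
        double key = out[i];
        std::size_t j = i;
        while (j > 0 && out[j - 1] > key) { out[j] = out[j - 1]; --j; }
        out[j] = key;
    }
    A.swap(out);
}
```

The whole sort makes three allocations (`head`, `next`, `out`) no matter how the values are distributed, so a single crowded bucket costs no extra memory.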

I have tried using vector as buckets (using push_back and pop_back, and reserving space for two doubles). The code runs faster than with list, but putting into buckets still consumes most of the time. The problem is that some buckets will hold more content. Pre-allocating that much for each bucket would waste a lot of memory and time. And for now I do not know the distribution of the size of the largest bucket.

luoq 2010-10-18 04:57:26

Those are exactly the conditions that bucket sort requires to perform well: it assumes you have enough extra space readily available.

casablanca 2010-10-18 05:26:52

And that is also the reason why bucket sort is impractical for large data sets.

casablanca 2010-10-18 05:33:51
