ansaurus

Question

What is the fastest possible way to sort an array of 7 integers?

Answer 1

+13 A:

For a very small set, insertion sort can usually beat quicksort because it has very low overhead.

WRT your edit, if you're already mostly in sort order (last 5 elements are already sorted), insertion sort is definitely the way to go. In an almost-sorted set of data, it'll beat quicksort every time, even for large sets. (Especially for large sets! This is insertion sort's best-case scenario and quicksort's worst case.)

Mason Wheeler 2009-09-12 13:57:39

Answer 2

+7 A:

Don't know how you are implementing this, but what you could do is have an array of 52 instead of 7, and just insert the card in its slot directly when you get it since there can never be duplicates, that way you never have to sort the array. This might be faster depending on how its used.

JRL 2009-09-12 14:02:40

In fact it could even be an array of bits, which would fit in single 64-bit integer and consume less memory.

Filip Navara 2009-09-12 14:07:30

Not really. Instead of trying to look through a 7-element array, he'll instead look through 52-element array. Same for bits within a single integer.

Pavel Shved 2009-09-12 14:11:42

@Pavel Shved: yes it would mean looking at a 52 element array - but this array would never have to be sorted, you would get that for free. That's why I said it *might* be faster depending on how the code is.

JRL 2009-09-12 14:25:21

Avoiding the sort may be the best way to speed up the program.

Nosredna 2009-09-12 15:01:03

(+1) Just to be fancy. Array implementation of binary tree that can be traversed in in-order can be used. http://en.wikipedia.org/wiki/Binary_tree

TheMachineCharmer 2009-09-12 15:59:10

@Pavel Shved, storing it as bits will not make the sort faster, but it may be better representation overall for the program. Bit operations are generally much faster than accessing the 7 element array and it can easily fit into processor registers. -- Actually on second thought, it may make the sort faster since "find first set" is natively implemented on almost any processor, so even creating the bit array and reconstructing the normal array could be VERY FAST.

Filip Navara 2009-09-12 16:03:35

Actually, if you do have to sort all 7 then this is the fastest way, particulary if you know how to use scatter-gather bit-masks to do the re-collection (reduces the scan order from 52 to 7+7). However, if 5 are already sorted then the fastest is to sort the other two (O(2.5)) and do a merge on the two lists (O(k*(5+2)) where k is usually around 4 or 5.

RBarryYoung 2009-09-12 20:58:07

Answer 3

A:

Take a look at this:

http://en.wikipedia.org/wiki/Sorting%5Falgorithm

You would need to pick one that will have a stable worst case cost...

Another option could be to keep the array sorted the whole time, so an addition of a card would keep the array sorted automatically, that way you could skip to sorting...

Heiko Hatzfeld 2009-09-12 14:04:24

Answer 4

+1 A:

For 7 elements, there are only few options. You can easily write a generator that produces method to sort all possible combinations of 7 elements. Something like this method for 3 elements:

if a[1] < a[2] {
    if a[2] < a[3] {
        // nothing to do, a[1] < a[2] < a[3]
    } else {
         if a[1] < a[3] {
             // correct order should be a[1], a[3], a[2]
             swap a[2], a[3]
         } else {
             // correct order should be a[3], a[1], a[2]
             swap a[2], a[3]
             swap a[1], a[3]
         }
    }
} else {
    // here we know that a[1] >= a[2]
    ...
}

Of course method for 7 elements will be bigger, but it's not that hard to generate.

Peter Štibraný 2009-09-12 14:07:07

"will be bigger". Hmm. 3! is 6, 7! is 5040.

Steve Jessop 2009-09-12 14:37:13

That's why I suggested generating it, and not writing manually :-) And it will always use minimum number of comparisons and swap operations.

Peter Štibraný 2009-09-12 14:52:43

...it will be very bad on processor cache.

Filip Navara 2009-09-12 16:05:07

I don't know much about processor instruction cache to argue :-), but as you can see, deeper you go, more local the jumps are (at least in pseudocode ;-)), so after few cache misses (few 'else' branches), rest should be cached already. But it depends on how code is compiled down to instructions.

Peter Štibraný 2009-09-12 20:07:24

5040 is smaller than 2^13, so you should ideally be able to find the right permutation with 13 binary branches. And 5040 elements should easily fit into a modern CPU's cache.

nikie 2009-09-12 20:56:11

Answer 5

A:

What JRL is referring to is a bucket sort. Since you have a finite discrete set of possible values, you can declare 52 buckets and just drop each element in a bucket in O(1) time. Hence bucket sort is O(n). Without the guarantee of a finite number of different elements, the fastest theoretical sort is O(n log n) which things like merge sort an quick sort are. It's just a balance of best and worst case scenarios then.

But long answer short, use bucket sort.

Tesserex 2009-09-12 14:07:44

As I already noted, bucket sort is not O(n), but O(n+m), where m is the number of values the elements in the array given can have. That's not applicable in this case.

Pavel Shved 2009-09-12 14:18:31

Asymptotic performance is utterly irrelevant to the question, which fixes n at 7.

Steve Jessop 2009-09-12 14:35:21

And if you represent your buckets using bits, you've the same sort as some of the other answers here.

Pete Kirkham 2009-09-12 21:16:47

Answer 6

+2 A:

Use min-sort. Search for minimal and maximal element at once and place them into resultant array. Repeat three times. (EDIT: No, I won't try to measure the speed theoretically :_))

var
cards,result: array[0..6] of integer;
i,min,max: integer;

begin
   n=0;
   while (n<3) do begin
      min:=-1;
      max:=52;
      for i from 0 to 6 do begin
          if cards[i]<min then min:=cards[i]
          else if cards[i]>max then max:=cards[i]
      end
      result[n]:=min;
      result[6-n]:=max;
      inc(n);
   end
   for i from 0 to 6 do 
       if (cards[i]<52) and (cards[i]>=0) then begin
           result[3] := cards[i];
           break;
       end
    { Result is sorted here! }
end

Pavel Shved 2009-09-12 14:17:01

Answer 7

A:

If you like the above mentioned suggestion to keep a 52 element array which always keeps your array sorted, then may be you could keep another list of 7 elements which would reference the 7 valid elements in the 52 element array. This way we can even avoid parsing the 52 element array.

I guess for this to be really efficient, we would need to have a linked list type of structure which be supports operations: InsertAtPosition() and DeleteAtPosition() and be efficient at that.

Trainee4Life 2009-09-12 14:53:21

Answer 8

+4 A:

There are only 5040 permutations of 7 elements. You can programmaticaly generate a program that finds the one represented by your input in a minimal number of comparisons. It will be a big tree of if-then-else instructions, each comparing a fixed pair of nodes, for example if (a[3]<=a[6]).

The tricky part is deciding which 2 elements to compare in a particular internal node. For this, you have to take into account the results of comparisons in the ancestor nodes from root to the particular node (for example a[0]<=a[1], not a[2]<=a[7], a[2]<=a[5]) and the set of possible permutations that satisfy the comparisons. Compare the pair of elements that splits the set into as equal parts as possible (minimize the size of the larger part).

Once you have the permutation, it is trivial to sort it in a minimal set of swaps.

Rafał Dowgird 2009-09-12 14:57:56

Answer 9

+1 A:

The code below is close to optimal. It could be made better by composing a list to be traversed while making the tree, but I'm out of time right now. Cheers!

object Sort7 {
  def left(i: Int) = i * 4
  def right(i: Int) = i * 4 + 1
  def up(i: Int) = i * 4 + 2
  def value(i: Int) = i * 4 + 3

  val a = new Array[Int](7 * 4)
  def reset = {
    0 until 7 foreach { 
      i => {
        a(left(i)) = -1
        a(right(i)) = -1
        a(up(i)) = -1
        a(value(i)) = scala.util.Random.nextInt(52)
      }
    }
  }

  def sortN(i : Int) {
    var index = 0
    def getNext = if (a(value(i)) < a(value(index))) left(index) else right(index)
    var next = getNext
    while(a(next) != -1) {
      index = a(next)
      next = getNext
    }
    a(next) = i
    a(up(i)) = index
  }

  def sort = 1 until 7 foreach (sortN(_))

  def print {
    traverse(0)
    def traverse(i: Int): Unit = {
      if (i != -1) {
        traverse(a(left(i)))
        println(a(value(i)))
        traverse(a(right(i)))
      }
    }
  }
}

Daniel 2009-09-12 15:51:06

Answer 10

+1 A:

In pseudo code:

int64 temp = 0;
int index, bit_position;

for index := 0 to 6 do
   temp |= 1 << cards[index];

for index := 0 to 6 do
begin
   bit_position = find_first_set(temp);
   temp &= ~(1 << bit_position);
   cards[index] = bit_position;
end;

It's an application of bucket sort, which should generally be faster than any of the comparison sorts that were suggested.

Note: The second part could also be implemented by iterating over bits in linear time, but in practice it may not be faster:

index = 0;
for bit_position := 0 to 51 do
begin
   if (temp & (1 << bit_position)) > 0 then
   begin
      cards[index] = bit_position;
      index++;
   end;
end;

Filip Navara 2009-09-12 16:10:30

Answer 11

A:

There are a lot of loops in the answers. Given his speed requirement and the tiny size of the data set I would not do ANY loops.

I have not tried it but I suspect the best answer is a fully unrolled bubble sort. It would also probably gain a fair amount of advantage from being done in assembly.

I wonder if this is the right approach, though. How are you going to analyze a 7 card hand?? I think you're going to end up converting it to some other representation for analysis anyway. Would not a 4x13 array be a more useful representation? (And it would render the sorting issue moot, anyway.)

Loren Pechtel 2009-09-12 16:32:01

Answer 12

A:

Considering that last 5 elements are always sorted:


for i := 0 to 1 do begin
  j := i;
  x := array[j];
  while (j+1 <= 6) and (array[j+1] < x) do begin
    array[j] := array[j+1];
    inc(j);
  end;
  array[j] := X;
end;

2009-09-12 19:09:04

Answer 13

A:

bubble sort is your friend. Other sorts have too many overhead codes and not suitable for small number of elements

Cheers

2009-09-12 20:04:48

Answer 14

+4 A:

Since the last 5 items are already sorted, the code can be written just to reposition the first 2 items. Since you're using Pascal, I've written and tested a sorting algorithm that can execute 2,118,760 times in about 62 milliseconds.

procedure SortT7Cards(var Cards: T7Cards);
const
  CardsLength = Length(Cards);
var
  I, J, V: Integer;
  V1, V2: Integer;
begin
  // Last 5 items will always be sorted, so we want to place the first two into
  // the right location.
  V1 := Cards[0];
  V2 := Cards[1];
  if V2 < V1 then
  begin
    I := V1;
    V1 := V2;
    V2 := I;
  end;

  J := 0;
  I := 2;
  while I < CardsLength do
  begin
    V := Cards[I];
    if V1 < V then
    begin
      Cards[J] := V1;
      Inc(J);
      Break;
    end;
    Cards[J] := V;
    Inc(J);
    Inc(I);
  end;
  while I < CardsLength do
  begin
    V := Cards[I];
    if V2 < V then
    begin
      Cards[J] := V2;
      Break;
    end;
    Cards[J] := V;
    Inc(J);
    Inc(I);
  end;
  if J = (CardsLength - 2) then
  begin
    Cards[J] := V1;
    Cards[J + 1] := V2;
  end
  else if J = (CardsLength - 1) then
  begin
    Cards[J] := V2;
  end;
end;

Jon Benedicto 2009-09-12 20:09:33

Answer 15

+2 A:

This is the fastest method: since the 5-card list is already sorted, sort the two-card list (a compare & swap), and then merge the two lists, which is O(k * (5+2). In this case (k) will normally be 5: the loop test(1), the compare(2), the copy(3), the input-list increment(4) and the output list increment(5). That's 35 + 2.5. Throw in loop initialization and you get 41.5 statements, total.

You could also unroll the loops which would save you maybe 8 statements or execution, but make the whole routine about 4-5 times longer which may mess with your instruction cache hit ratio.

Given P(0 to 2), C(0 to 5) and copying to H(0 to 6) with C() already sorted (ascending):

If P(0) > P(1) Then
    // Swap:
    T = P(0)
    P(0) = P(1)
    P(1) = T
    // 1stmt + (3stmt * 50%) = 2.5stmt
End

P(2), C(5) = 53    \\ Note these are end-of-list flags
k = 0     \\ P() index
J = 0     \\ H() index
i = 0     \\ C() index
// 4 stmt

Do While (j) < 7 
    If P(k) < C(I) then
        H(j) = P(k)
        k = k+1
    Else
        H(j) = C(i)
        j = j+1
    End if
    j = j+1
    // 5stmt * 7loops = 35stmt
Loop

And note that this is faster than the other algorithm that would be "fastest" if you had to truly sort all 7 cards: use a bit-mask(52) to map & bit-set all 7 cards into that range of all possible 52 cards (the bit-mask), and then scan the bit-mask in order looking for the 7 bits that are set. That takes 60-120 statements at best (but is still faster than any other sorting approach).

RBarryYoung 2009-09-12 20:10:00

Answer 16

+1 A:

Assuming that you need an array of cards at the end of it.

Map the original cards to bits in a 64 bit integer ( or any integer with >= 52 bits ).

If during the initial mapping the array is sorted, don't change it.

Partition the integer into nibbles - each will correspond to values 0x0 to 0xf.

Use the nibbles as indices to corresponding sorted sub-arrays. You'll need 13 sets of 16 sub-arrays ( or just 16 sub-arrays and use a second indirection, or do the bit ops rather than looking the answer up; what is faster will vary by platform ).

Concatenate the non-empty sub-arrays into the final array.

You could use larger than nibbles if you want; bytes would give 7 sets of 256 arrays and make it more likely that the non-empty arrays require concatenating.

This assumes that branches are expensive and cached array accesses cheap.

#include <stdio.h>
#include <stdbool.h>
#include <stdint.h>

// for general case of 7 from 52, rather than assuming last 5 sorted
uint32_t card_masks[16][5] = {
    { 0, 0, 0, 0, 0 },
    { 1, 0, 0, 0, 0 },
    { 2, 0, 0, 0, 0 },
    { 1, 2, 0, 0, 0 },
    { 3, 0, 0, 0, 0 },
    { 1, 3, 0, 0, 0 },
    { 2, 3, 0, 0, 0 },
    { 1, 2, 3, 0, 0 },
    { 4, 0, 0, 0, 0 },
    { 1, 4, 0, 0, 0 },
    { 2, 4, 0, 0, 0 },
    { 1, 2, 4, 0, 0 },
    { 3, 4, 0, 0, 0 },
    { 1, 3, 4, 0, 0 },
    { 2, 3, 4, 0, 0 },
    { 1, 2, 3, 4, 0 },
};

void sort7 ( uint32_t* cards) {
    uint64_t bitset = ( ( 1LL << cards[ 0 ] ) | ( 1LL << cards[ 1LL ] ) | ( 1LL << cards[ 2 ] ) | ( 1LL << cards[ 3 ] ) | ( 1LL << cards[ 4 ] ) | ( 1LL << cards[ 5 ] ) | ( 1LL << cards[ 6 ] ) ) >> 1;

    uint32_t*   p    = cards;
    uint32_t    base = 0;

    do {
        uint32_t* card_mask = card_masks[ bitset & 0xf ];

        // you might remove this test somehow, as well as unrolling the outer loop
        // having separate arrays for each nibble would save 7 additions and the increment of base
        while ( *card_mask )
            *(p++) = base + *(card_mask++);

        bitset >>= 4;
        base += 4;
    } while ( bitset );
}

void print_cards ( uint32_t* cards ) {
    printf ( "[ %d %d %d %d %d %d %d ]\n", cards[0], cards[1], cards[2], cards[3], cards[4], cards[5], cards[6] );
}

int main ( void ) {
    uint32_t cards[7] = { 3, 9, 23, 17, 2, 42, 52 };

    print_cards ( cards );
    sort7 ( cards );
    print_cards ( cards );

    return 0;
}

Pete Kirkham 2009-09-12 20:45:20

Answer 17

+4 A:

I don't know that much about Texas Hold'em: Does it matter what suit P1 and P2 are, or does it only matter if they are of the same suit or not? If only suit(P1)==suit(P2) matters, then you could separate the two cases, you have only 13x12/2 different possibilities for P1/P2, and you can easily precalculate a table for the two cases.

Otherwise, I would suggest something like this:

(* C1 < C2 < P1 *)
for C1:=0 to P1-2 do 
   for C2:=C1+1 to P1-1 do 
      Cards[0] = C1;
      Cards[1] = C2;
      Cards[2] = P1;
      (* generate C3...C7 *)

(* C1 < P1 < C2 *)
for C1:=0 to P1-1 do 
   for C2:=P1+1 to 51 do 
      Cards[0] = C1;
      Cards[1] = P1;
      Cards[2] = C2;
      (* generate C3...C7 *)

(* P1 < C1 < C2 *)
for C1:=P1+1 to 51 do 
   for C2:=C1+1 to 51 do 
      Cards[0] = P1;
      Cards[1] = C1;
      Cards[2] = C2;
      (* generate C3...C7 *)

(this is just a demonstration for one card P1, you would have to expand that for P2, but I think that's straightforward. Although it'll be a lot of typing...) That way, the sorting doesn't take any time at all. The generated permutations are already ordered.

nikie 2009-09-12 20:53:20

By far the most useful answer, even though it doesn't address the actual question.By implementing two tables (one for flush combos, on for other combos), each taking about 8Mb of memory, the procedure went fram about 16 seconds to 3,5. Way to go! And the best part is that the initialization of the tables was also very fast, since I could discard information about suit.(for the curious, the tables are gPokerCombosWithFlush : array[0..12,0..12,0..12,0..12,0..12] of TPokerCombo; gPokerCombosWithoutFlush : array[0..12,0..12,0..12,0..12,0..12] of TPokerCombo;)

Svein Bringsli 2009-09-12 23:27:46

Did you even try insertion sort?

FogleBird 2009-09-12 23:41:27

@FogleBird: If you create the combinations in ordered in the first place, you don't have to sort at all. Even insertion sort can't beat O(0) ;-)

nikie 2009-09-13 07:32:41

Answer 18

A:

Here is your basic O(n) sort. I'm not sure how it compares to the others. It uses unrolled loops.

char card[7]; // the original table of 7 numbers in range 0..51
char table[52]; // workspace

// clear the workspace
memset(table, 0, sizeof(table));

// set the 7 bits corresponding to the 7 cards
table[card[0]] = 1;
table[card[1]] = 1;
...
table[card[6]] = 1;

// read the cards back out
int j = 0;
if (table[0]) card[j++] = 0;
if (table[1]) card[j++] = 1;
...
if (table[51]) card[j++] = 51;

Mike Dunlavey 2009-09-12 21:13:03

Answer 19

A:

For seven numbers, the most efficient algorithm that exists with regards to the number of comparisons is Ford-Johnson's. In fact, wikipedia references a paper, easily found on google, that claims Ford-Johnson's the best for up to 47 numbers. Unfortunately, references to Ford-Johnson's aren't all that easy to found, and the algorithm uses some complex data structures.

It appears on The Art Of Computer Programming, Volume 3, by Donald Knuth, if you have access to that book.

There's a paper which describes FJ and a more memory efficient version here.

At any rate, because of the memory overhead of that algorithm, I doubt it would be worth your while for integers, as the cost of comparing two integers is rather cheap compared to the cost of allocating memory and manipulating pointers.

Now, you mentioned that 5 cards are already sorted, and you just need to insert two. You can do this with insertion sort most efficiently like this:

Order the two cards so that P1 > P2
Insert P1 going from the high end to the low end
(list) Insert P2 going from after P1 to the low end
(array) Insert P2 going from the low end to the high end

How you do that will depend on the data structure. With an array you'll be swapping each element, so place P1 at 1st, P2 and 7th (ordered high to low), and then swap P1 up, and then P2 down. With a list, you just need to fix the pointers as appropriate.

However once more, because of the particularity of your code, it really is best if you follow nikie suggestion and just generate the for loops apropriately for every variation in which P1 and P2 can appear in the list.

For example, sort P1 and P2 so that P1 < P2. Let's make Po1 and Po2 the position from 0 to 6, of P1 and P2 on the list. Then do this:

Loop Po1 from 0 to 5
Loop Po2 from Po1 + 1 to 6
If (Po2 == 1) C1start := P2 + 1; C1end := 51 - 4
If (Po1 == 0 && Po2 == 2) C1start := P1+1; C1end := P2 - 1
If (Po1 == 0 && Po2 > 2) C1start := P1+1; C1end := 51 - 5
If (Po1 > 0) C1start := 0; C1end := 51 - 6
for C1 := C1start to C1end
  // Repeat logic to compute C2start and C2end
  // C2 can begin at C1+1, P1+1 or P2+1
  // C2 can finish at P1-1, P2-1, 51 - 3, 51 - 4 or 51 -5
  etc

You then call a function passing Po1, Po2, P1, P2, C1, C2, C3, C4, C5, and have this function return all possible permutations based on Po1 and Po2 (that's 36 combinations).

Personally, I think that's the fastest you can get. You completely avoid having to order anything, because the data will be pre-ordered. You incur in some comparisons anyway to compute the starts and ends, but their cost is minimized as most of them will be on the outermost loops, so they won't be repeated much. And they can even be more optimized at the cost of more code duplication.

Daniel 2009-09-13 06:09:28

ansaurus

tags:

views:

answers:

What is the fastest possible way to sort an array of 7 integers?

related questions