views:

186

answers:

7

Premise

This problem has a known solution (shown below actually), I'm just wondering if anyone has a more elegant algorithm or any other ideas/suggestions on how to make this more readable, efficient, or robust.

Background

I have a list of sports competitions that I need to sort in an array. Due to the nature of this array's population, 95% of the time the list will be pre sorted, so I use an improved bubble sort algorithm to sort it (since it approaches O(n) with nearly sorted lists).

The bubble sort has a helper function called CompareCompetitions that compares two competitions and returns >0 if comp1 is greater, <0 if comp2 is greater, 0 if the two are equal. The competitions are compared first by a priority field, then by game start time, and then by Home Team Name.

The priority field is the trick to this problem. It is an int that holds a positve value or 0. They are sorted with 1 being first, 2 being second, and so on with the exception that 0 or invalid values are always last.
e.g. the list of priorities
0, 0, 0, 2, 3, 1, 3, 0
would be sorted as 1, 2, 3, 3, 0, 0, 0, 0

The other little quirk, and this is important to the question, is that 95% of the time, priority will be it's default 0, because it is only changed if the user wants to manually change the sort order, which is rarely. So the most frequent case in the compare function is that priorities are equal and 0.

The Code

This is my existing compare algorithm.

int CompareCompetitions(const SWI_COMPETITION &comp1,const SWI_COMPETITION &comp2)
{

    if(comp1.nPriority == comp2.nPriority)
    {
     //Priorities equal
     //Compare start time
     int ret = comp1.sStartTime24Hrs.CompareNoCase(comp2.sStartTime24Hrs);
     if(ret != 0)
     {
      return ret; //return compare result
     }else
     {
      //Equal so far
      //Compare Home team Name
      ret = comp1.sHLongName.CompareNoCase(comp2.sHLongName);
      return ret;//Home team name is last field to sort by, return that value
     }
    }
    else if(comp1.nPriority > comp2.nPriority)
    {
     if(comp2.nPriority <= 0)
      return -1;
     else
      return 1;//comp1 has lower priority
    }else /*(comp1.nPriority < comp2.nPriority)*/
    {
     if(comp1.nPriority <= 0)
      return 1;
     else
      return -1;//comp1 one has higher priority
    }
}

Question

How can this algorithm be improved?
And more importantly...
Is there a better way to force 0 to the back of the sort order?

I want to emphasize that this code seems to work just fine, but I am wondering if there is a more elegant or efficient algorithm that anyone can suggest. Remember that nPriority will almost always be 0, and the competitions will usually sort by start time or home team name, but priority must always override the other two.

A: 

If you can, it seems like modifying the priority scheme would be the most elegant, so that you could just sort normally. For example, instead of storing a default priority as 0, store it as 999, and cap user defined priorities at 998. Then you won't have to deal with the special case anymore, and your compare function can have a more straightforward structure, with no nesting of if's: (pseudocode)

if (priority1 < priority2) return -1;
if (priority1 > priority2) return 1;

if (startTime1 < startTime2) return -1;
if (startTime1 > startTime2) return 1;

if (teamName1 < teamName2) return -1;
if (teamName1 > teamName2) return -1;

return 0;  // exact match!
Peter Recore
I considered that, but there are database related issues with storing anything other than 0 as the default value, and still having maintainable code at least. I like the no nesting layout though.
CodeFusionMobile
+1  A: 

You can also reduce some of the code verbosity using the trinary operator like this:

int CompareCompetitions(const SWI_COMPETITION &comp1,const SWI_COMPETITION &comp2)
{

if(comp1.nPriority == comp2.nPriority)
{
    //Priorities equal
    //Compare start time
    int ret = comp1.sStartTime24Hrs.CompareNoCase(comp2.sStartTime24Hrs);

    return ret != 0 ? ret : comp1.sHLongName.CompareNoCase(comp2.sHLongName);
}
else if(comp1.nPriority > comp2.nPriority)
    return comp2.nPriority <= 0 ? -1 : 1;
else /*(comp1.nPriority < comp2.nPriority)*/
    return comp1.nPriority <= 0 ? 1 : -1;
}

See? This is much shorter and in my opinion easily read.
I know it's not what you asked for but it's also important.

the_drow
On the contrary, I did ask for easier to read as part of the OP. (re-read premise). This is the kind of suggestion I'm looking for.
CodeFusionMobile
+2  A: 

Isn't it just this?

if (a==b) return other_data_compare(a, b);
if (a==0) return 1;
if (b==0) return -1;
return a - b;
MSN
Not quite. You left out the home team name compare. This is also not quite as readable in my opinion but I like the idea.
CodeFusionMobile
Fixed!-ish. .
MSN
I think I'll use the last three lines of this solution at least. I'll wait to see other answers before I accept though.
CodeFusionMobile
A: 

I think the inelegance you feel about your solution comes from duplicate code for the zero priority exception. The Pragmatic Programmer explains that each piece of information in your source should be defined in "one true" place. To the naive programmer reading your function, you want the exception to stand-out, separate from the other logic, in one place, so that it is readily understandable. How about this?

    if(comp1.nPriority == comp2.nPriority)
    {
        // unchanged
    }
    else
    {
     int result, lowerPriority;
     if(comp1.nPriority > comp2.nPriority)
     {
      result = 1;
      lowerPriority = comp2.nPriority;
     }
     else
     {
      result = -1;
      lowerPriority = comp1.nPriority;
     }

     // zero is an exception: always goes last
     if(lowerPriority == 0)
      result = -result;

     return result;
    }
Evan Rogers
+1  A: 

Is it intended that if the case nPriority1 < 0 and nPriority2 < 0 but nPriority1 != nPriority2 the other data aren't compared?

If it isn't, I'd use something like

int nPriority1 = comp1.nPriority <= 0 ? INT_MAX : comp1.nPriority;
int nPriority2 = comp2.nPriority <= 0 ? INT_MAX : comp2.nPriority;

if (nPriority1 == nPriority2) {
   // current code
} else {
   return nPriority1 - nPriority2;
}

which will consider values less or equal to 0 the same as the maximum possible value.

(Note that optimizing for performance is probably not worthwhile if you consider that there are insensitive comparisons in the most common path.)

AProgrammer
+1 creating temporary priorities where 0==INT_MAX makes comparisons much more readable.
nilamo
A: 

I Java-ized it, but the approach will work fine in C++:

int CompareCompetitions(Competition comp1, Competition comp2) {
 int n = comparePriorities(comp1.nPriority, comp2.nPriority);
 if (n != 0)
  return n;
 n = comp1.sStartTime24Hrs.compareToIgnoreCase(comp2.sStartTime24Hrs);
 if (n != 0)
  return n;
 n = comp1.sHLongName.compareToIgnoreCase(comp2.sHLongName);
 return n;
}

private int comparePriorities(Integer a, Integer b) {
 if (a == b)
  return 0;
 if (a <= 0)
  return -1;
 if (b <= 0)
  return 1;
 return a - b;
}

Basically, just extract the special-handling-for-zero behavior into its own function, and iterate along the fields in sort-priority order, returning as soon as you have a nonzero.

Carl Manaster
A: 

As long as the highest priority is not larger than INT_MAX/2, you could do

#include <climits>

const int bound = INT_MAX/2;
int pri1 = (comp1.nPriority + bound) % (bound + 1);
int pri2 = (comp2.nPriority + bound) % (bound + 1);

This will turn priority 0 into bound and shift all other priorities down by 1. The advantage is that you avoid comparisons and make the remainder of the code look more natural.

In response to your comment, here is a complete solution that avoids the translation in the 95% case where priorities are equal. Note, however, that your concern over this is misplaced since this tiny overhead is negligible with respect to the overall complexity of this case, since the equal-priorities case involves at the very least a function call to the time comparison method and at worst an additional call to the name comparator, which is surely at least an order of magnitude slower than whatever you do to compare the priorities. If you are really concerned about efficiency, go ahead and experiment. I predict that the difference between the worst-performing and best-performing suggestions made in this thread won't be more than 2%.

#include <climits>

int CompareCompetitions(const SWI_COMPETITION &comp1,const SWI_COMPETITION &comp2)
{
    if(comp1.nPriority == comp2.nPriority)
       if(int ret = comp1.sStartTime24Hrs.CompareNoCase(comp2.sStartTime24Hrs))
          return ret;
       else
          return comp1.sHLongName.CompareNoCase(comp2.sHLongName);

    const int bound = INT_MAX/2;
    int pri1 = (comp1.nPriority + bound) % (bound + 1);
    int pri2 = (comp2.nPriority + bound) % (bound + 1);
    return pri1 > pri2 ? 1 : -1;
}

Depending on your compiler/hardware, you might be able to squeeze out a few more cycles by replacing the last line with

return (pri1 > pri2) * 2 - 1;

or

return (pri1-pri2 > 0) * 2 - 1;

or (assuming 2's complement)

return ((pri1-pri2) >> (CHAR_BIT*sizeof(int) - 1)) | 1;

Final comment: Do you really want CompareCompetitions to return 1,-1,0 ? If all you need it for is bubble sort, you would be better off with a function returning a bool (true if comp1 is ">=" comp2 and false otherwise). This would simplify (albeit slightly) the code of CompareCompetitions as well as the code of the bubble sorter. On the other hand, it would make CompareCompetitions less general-purpose.

Ari
The reason I don't do anything like this is because as I said 95% of the time priority is 0, which means I would be doing this every time. I didn't mention this in the OP, but I get the values from a database (which NEEDS to store 0 not something like INT_MAX by default) every time I am going to sort.
CodeFusionMobile
You can precede this with `if(comp1.nPriority == comp2.nPriority){ do the other comparisons }`. Then you would only be doing this in the 5% cases that it is actually needed.
Ari
And nobody said anything about storing INT_MAX in the database. It's just used for the comparison. You read in 0 and write out 0.
Ari