views:

715

answers:

9

I am writing a key record look up where the I have an index between the key and the rec number. This is sorted on the key. Is there away to do this better that what I have for speed optimization?

typedef struct
{
    char key[MAX_KEYLEN];
    int  rec;
} KeyRecPair;

typedef struct
{
    KeyRecPair *map;
    int         numRecs;
} KeyRecMap;

int GetRecFromKey(char *key, KeyRecMap *theMap)
{
    int cmpValue, bottom = 0;
    int half = theMap->numRecs / 2;
    int top = theMap->numRecs - 1;

    while (bottom != top)
    {
     cmpValue = strncmp(key, theMap->map[half].key, MAX_KEY_LEN); 

     if (cmpValue > 0)
     {
      /*top stays*/
      bottom = half + 1;
      half  = bottom + (top - bottom) / 2;
      continue;
     }
     if (cmpValue < 0)
     {
      /*bottom stays*/
      top = half - 1;
      half  = bottom + (top - bottom) / 2;
      continue;
     }
     return theMap->map[half].rec;
    }

    if (0 == strncmp(key, theMap->map[half].key, MAX_KEY_LEN))
     return theMap->map[half].rec;
    return 0;
}
+4  A: 

The bsearch library function performs a binary search over an array, given a suitable compare function you implement. Being a library function, it is well-optimised and (hopefully) bug-free.

Greg Hewgill
I think this function is as optimised as possible.
Georg
Note, though, that the cost of passing a function (etc.) might outweigh the gains from the optimized bsearch, in the same way that C's qsort is often slower than C++'s std::sort.
ShreevatsaR
A: 

The only potential optimization that I can think of is to use something similar to golden ratio in calculating half instead of dividing the remaining subset to two halves with equal number of elements, that is

        if (cmpValue > 0)
        {
                /*top stays*/
                bottom = half + 1;
                half = bottom + (top - bottom) * 3 / 5;
                continue;
        }
        if (cmpValue < 0)
        {
                /*bottom stays*/
                top = half - 1;
                half = bottom + (top - bottom) * 2 / 5;
                continue;
        }
dmityugov
Golden ratio will not divide equal half. (Bottom+top)/2 only will divide exactly half. Golden division is required when the set is of natural growth such as monthly Petal counts, bacteria cells etc.
lakshmanaraj
A: 

Instead of dividing by 2 U can make advantage of bit shift operator.

=> for /2 u can use >> 1

lakshmanaraj
Any half-decent compiler will do this automatically. You should only use shifting when you *mean* shifting, i.e. when working with the bits, as in masks etc. When you mean "make this number half as big", just divide and let the compiler sort it out.
unwind
General rule of such microoptimizations: if your compiler doesn't do them anyway, it's screwing up your performance to the point that microoptimizations won't help.
David Thornley
A: 

Since you're going to have to compute half once per loop, why not just do that once, just before it is used? That would cut two complex-looking (at least, relatively speaking) lines of code.

unwind
I believe modern compilers can merge such duplicate code automaticaly
dmityugov
+2  A: 

Instead of using a binary search to locate the item, a hash map might be more suitable because it has O(1) lookup characteristics. However that might be slow with load of collisions with a naive approach. However this paper describes a way to create a hashmap like tree that has O(log(n) / log(32)) access time which generally outperforms normal hashmap implementations. (The fixed aray + linked list implementation).

Jasper Bekkers
+3  A: 

A good chunk of your time will be spent in the strncmp.

I suggest forcing that to be inlined, or rewriting it inline, to avoid the function call over head.

If you are feeling brave it may be possible to unroll the loop once or twice and see a performance gain.

If your string was actually a fixed length of array of char, you could make the length a multiple of 4 and and compare 4 bytes at a time with an unsigned int compare, instead of 1 byte at a time.

If you don't have a profiler, you should get one. Profilers make it easy to see what the relative costs of various implementations are.

Another option would be to pick a different way to organize your data. Check out AVL trees for inspiration. Choosing some sort of Hashing function, like the others mentioned, may be a viable option

EvilTeach
+1  A: 

Any chance you could use a key that isn't a string? or at least shortest possible strings? (what is MAX_KEYLEN's value) that strcmp every iteration of the loop likely is one of the slower parts of the search.

Evan Teran
+1  A: 

Is there a reason for wanting to optimize this? Have you run the program with a profiler and determined that the lookup takes a significant part of the total runtime? Are you just curious about how fast you can get it? (Either are, in my opinion, good reasons.) If you are just randomly optimizing for the heck of it, don't.

Also, remember to benchmark. It's hard to tell which of two versions of a function are faster on a modern system (it was easier on my Z80). How many cache misses may or may not be more important than the number of branches wrongly predicted.

David Thornley
A: 

Although I would expect a decent optimizer to do this for you, I'd put theMap->map into a local so that it has half a chance of ending up in a register instead of dereferencing it on each access. Again, I'd expect the optimizer to do this for you, so you might also want to check the assembly output.

I looked at visual studio 2008's output in release and it does a pretty good job on the code. For example, the comparison code looks like this:

; 30   :         if (cmpValue > 0)
test eax, eax
jle SHORT $LN11@GetRecFrom
; 31   :         {
; omitted inner block for > case.
$LN11@GetRecFrom:
; 37   :         if (cmpValue < 0)
jge SHORT $LN2@GetRecFrom

basically, it's a branch to branch without retesting cmpValue. Nice touch.

There is a slight benefit to putting theMap->map into a local, but it's tiny. If MAX_KEY_LEN is not a nice multiple of 4 and structs aren't padded, you should definitely put the int first in your struct.

plinth