views:

100

answers:

6

Hi,

I've implemented this search algorithm for an ordered array of integers. It works fine for the first data set I feed it (500 integers), but fails on longer searches. However, all of the sets work perfectly with the other four search algorithms I've implemented for the assignment.

This is the function that returns a seg fault on line 178 (due to an unexpected negative m value). Any help would be greatly appreciated.

CODE:

155 /* perform Algortihm 'InterPolationSearch' on the set
156  * and if 'key' is found in the set return it's index
157  * otherwise return -1 */
158 int
159 interpolation_search(int *set, int len, int key)
160 {
161   int l = 0;
162   int r = len - 1;
163   int m;
164
165   while (set[l] < key &&  set[r] >= key)
166   {
167
168     printf ("m = l + ((key - set[l]) * (r - l)) / (set[r] - set[l])\n");
169
170     printf ("m = %d + ((%d - %d) * (%d - %d)) / (%d - %d);\n", l, key, set[l], r, l, set[r], set[l]);
171     m = l + ((key - set[l]) * (r - l)) / (set[r] - set[l]);
172     printf ("m = %d\n", m);
173
174 #ifdef COUNT_COMPARES
175     g_compares++;
176 #endif
177
178     if (set[m] < key)
179       l = m + 1;
180     else if (set[m] > key)
181       r = m - 1;
182     else
183       return m;
184   }
185
186   if (set[l] == key)
187     return l;
188   else
189     return -1;
190 }

OUTPUT:

m = l + ((key - set[l]) * (r - l)) / (set[r] - set[l])
m = 0 + ((68816 - 0) * (100000 - 0)) / (114836 - 0);
m = -14876

Thankyou!

Rhys

+1  A: 

Your arithmetic is probably overflowing the size of int on your platform.

You need to do one of two things. Either use a wider integer type (if available), or recast your calculation so that you don't need to create such a large intermediate value.

Charles Bailey
+5  A: 

68816 * 100000 is more than 2^31, which is very probably the limit of your machine's int data type. You're experiencing integer overflow.

If your compiler supports it, try changing to long long. You can check by running

#include <stdlib.h>
printf("the long long type is %u bits", (unsigned int) (CHAR_BIT * sizeof (long long)));

As Naveen pointed out, you will also need to make sure the actual calculations are made in this precision. This can be done by casting.

unwind
Thanks, but even when I make m a long long (it is 64 bits), the exact same thing happens. Do I have to make casts within the algebra?
Rhys
@Rhys: You need to change your statement to something like this: `m = l + ((long long)(key - set[l]) * (r - l)) / (set[r] - set[l]);` Notice that `long long` inside the first parenthesis.
Naveen
A: 

You will need to be careful of integer overflow. It would be better to perform the calculation of m as a type larger than int, and then cast back to int when you're finished.

Additionally, you may need to be careful if your set contains duplicates, as you may get a division by zero error at the same line (i.e. when set[r] == set[l]).

Simon Nickerson
On most 32-bit platforms `sizeof(long) == sizeof(int)`.
KennyTM
Yes, sorry, been using Java recently and my C is a little rusty.
Simon Nickerson
+1  A: 
m = 0 + ((68816 - 0) * (100000 - 0)) / (114836 - 0);

68816 * 100000 = 6881600000 = (binary)110011010001011001110001000000000

That's 33 bits. On virtually all platforms int are 32 bits or (in rarer cases) 16 bits.

You can try using long long which is guaranteed to be at least 64 bits (added in C99 but most compilers support it in C90 too).

Andreas Bonini
A: 

That is because the intermediate value of the calcualtion (68816 - 0) * (100000 - 0) exceeds the value that could be held inside an int. You can cast the intermediate value to a datatype which can hold the bigger number (for example a long long) to solve the problem

Naveen
Thanks, I didn't realise that casts could be made within algebra like that... I'll have to do more research into this. I'd already tried chaning m to a long long.Greatly appreciated.
Rhys
A: 

To avoid problems like data overflows, you can additionally use a big numbers library. A good example is: http://gmplib.org/ . Of course it will add a little more overhead, but the overall performance is very good.

Andrei Ciobanu