ansaurus

Question

Faster way of finding multiple of double

Answer 1

A:

Does it need to be double precision? Depending on how good your math library is, this ought to be faster:

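#include <math.h>
#include <stdbool.h>  /* bool is not a built-in type in C */

#define TOLERANCE 0.0001f

/* True when the remainder of x / mod is within TOLERANCE of zero.
   A remainder just below mod (e.g. 10.9999 against 5.5) is not caught. */
bool IsMultipleOf(float x, float mod)
{
    return fabsf(fmodf(x, mod)) < TOLERANCE;
}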

Paul R 2010-06-21 14:16:45

fmodf() is *slower* than fmod().

Hans Passant 2010-06-21 14:25:39

@Hans Passant: that will depend on what CPU and math library you are using - in general it *should* be faster.

Paul R 2010-06-21 15:04:20

@Hans Passant: I just tested this on a Core i7, compiling with Intel ICC 11, and the single precision version with `fmodf` is around 25% faster. **YMMV** of course.

Paul R 2010-06-21 15:23:08

Answer 2

A:

I presume modulo looks a little like this on the inside:

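/* Assumes x >= 0 and m > 0. */
double mod(double x, double m) {
  while (x >= m) {  /* >= so that an exact multiple reduces to 0, not m */
    x = x - m;
  }
  return x;
}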

I think that through some sort of search it could be optimised, e.g.:

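double fastmod(double x, double m) {
  double q = 1;

  if (x < m) {  /* already reduced; also keeps (q / 2) * m from exceeding x */
    return x;
  }
  while (m * q <= x) {  /* double q until m * q passes x */
    q = q * 2;
  }

  /* (q / 2) * m is now the largest power-of-two multiple of m not above x */
  return mod(x - (q / 2) * m, m);
}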

You might even replace the final call to mod with another call to fastmod, with the condition that it returns x when x < m acting as the base case.

thomasfedb 2010-06-21 14:17:06

I wrote all this in C-like pseudocode, by the way.

thomasfedb 2010-06-21 14:18:17

Answer 3

+3 A:

Do you really have to use modulo for this?

Wouldn't it be possible to just compute result = x / mod and then check whether the fractional part of result is close to 0? For instance:

11 / 5.4999 = 2.000036  ==> 0.000036 < TOLERANCE

Or something like that.

Martin Wickman 2010-06-21 14:24:22

Pretty sure that's exactly what `modulo` or `fmod` does (: only it may do it in a *smarter* way depending on the CPU (for example, a single instruction may calculate the division and modulus together).

drfrogsplat 2010-06-21 14:54:12

@drfrogsplat - My assumption was that fmod and division were going to have similar performance, but this is something I should obviously test.

Shane MacLaughlin 2010-06-21 15:21:56

Answer 4

+1 A:

Division (floating point or not, fmod in your case) is often an operation whose execution time varies a lot depending on the CPU and compiler:

gcc has a builtin replacement for that if you give it the right compile flags or if you use __builtin_fmod explicitly. This might then map the operation onto a small number of assembler instructions.
there may be special units like SSE on Intel processors where this operation is implemented more efficiently

With such tricks, depending on your environment (you didn't say which), the time may vary from a few clock cycles to a few hundred. I think it is best to look into the documentation of your compiler and CPU for that particular operation.

Jens Gustedt 2010-06-21 14:55:10

SSE would only help if you want to perform several divisions at once (actually I'm not even sure if SSE instructions include division). The second 'S' is for 'SIMD' = Single Instruction Multiple Data, e.g. perform 8 additions at once, so I think this'd only help if one were searching for valid 'mod' values by testing a large list of them (i.e. you could test 8 at once, perhaps)...

drfrogsplat 2010-06-21 15:00:03

No, the S in there is only for historical reasons. Some instructions are faster if you use them on just one operand. But the downside is really that there is no one simple rule for it. You have to dig through manuals, try out different implementations, and benchmark the whole thing to know what is possible.

Jens Gustedt 2010-06-21 15:15:08

Answer 5

+1 A:

The following is probably overkill and sub-optimal, but for what it is worth, here is one way to do it.

We know the format of the double ...

1 bit for the sign
11 bits for the biased exponent
52 fraction bits

Let ...

value = x / mod;
exp = exponent bits of value - BIAS;
lsb = least sig bit of value's fraction bits;

Once you have that ...

/*
 * If applying the exponent would eliminate the fraction bits
 * then for double precision resolution it is a multiple.
 * Note: lsb may require some massaging.
 */
if (exp > lsb)
    return (true);

if (exp < 0)
    return (false);

The only case remaining is the tolerance case. Build your double so that you get rid of all the digits to the left of the decimal point.

sign bit is zero (positive)
exponent is the BIAS (1023 for an IEEE 754 double)
shift the fraction bits as appropriate

Now compare it against your tolerance.
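The steps above can be sketched in C roughly as follows, assuming IEEE 754 binary64 doubles. All names are hypothetical, and the sketch departs from the outline in one place: when the exponent is negative it treats the whole value as the fraction instead of returning false, so that a quotient of exactly 0 still counts as a multiple.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DBL_BIAS 1023       /* IEEE 754 binary64 exponent bias */
#define DBL_FRAC_BITS 52    /* explicit fraction bits in a binary64 */

/* Hypothetical helper: fractional part of |v|, read from the bit pattern.
 * Returns -1.0 for infinities and NaNs. */
static double frac_from_bits(double v)
{
    uint64_t bits;
    memcpy(&bits, &v, sizeof bits);              /* well-defined type pun */

    int field = (int)((bits >> DBL_FRAC_BITS) & 0x7FF);
    int exp = field - DBL_BIAS;
    uint64_t frac = bits & ((UINT64_C(1) << DBL_FRAC_BITS) - 1);

    if (field == 0x7FF) return -1.0;             /* inf or NaN */
    if (exp >= DBL_FRAC_BITS) return 0.0;        /* no bits below the binary point */
    if (exp < 0) return fabs(v);                 /* |v| < 1: all of it is fraction */

    /* Keep only the bits below the binary point and scale them back down. */
    int below = DBL_FRAC_BITS - exp;             /* between 1 and 52 */
    uint64_t low = frac & ((UINT64_C(1) << below) - 1);
    return ldexp((double)low, -below);
}

/* Hypothetical check: the fraction of x / mod must be below tolerance. */
bool is_multiple_bits(double x, double mod, double tolerance)
{
    double f = frac_from_bits(x / mod);
    return f >= 0.0 && f < tolerance;
}
```

As in the outline, only fractions just above a whole number pass; a quotient such as 1.99999 is rejected unless a second comparison against 1 minus the fraction is added.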

Sparky 2010-06-21 15:21:06

Answer 6

+1 A:

I think you need to inspect the bowels of your C RTL fmod() function: x86 FPUs have FPREM/FPREM1 instructions, which compute remainders by repeated subtraction.

While floating point division is a single instruction, it seems you may need to call FPREM repeatedly to get the right answer for modulus, so your RTL may not use it.

Roddy 2010-06-21 15:45:48

Answer 7

A:

I have not tested this at all, but as I understand fmod, this should be equivalent to it when inlined, which might let the compiler optimize it better, though I would have thought that the compiler's math library (or builtins) would work just as well. (Also, I don't even know for sure if this is correct.)

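#include <math.h>

#define TOLERANCE 0.0001  /* assumed value; the answer never defines it */

int IsMultipleOf(double x, double mod) {
    long n = (long)(x / mod);  // You should probably test for /0 or NAN result here
    double new_x = mod * n;    // multiple of mod nearest to x on the side toward zero
    double delta = x - new_x;
    return fabs(delta) < TOLERANCE;  // and for NAN result from fabs
}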

nategoose 2010-06-21 21:10:30

Answer 8

+1 A:

Maybe you can get away with long long instead of double if your data are all of comparable scale. For example, long long would be enough for over 60 astronomical units at micrometer resolution.
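A minimal sketch of what that could look like, assuming every value has already been converted to a whole number of some fixed unit (micrometers, say). The function name is hypothetical, int64_t stands in for long long, and mod is assumed to be positive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-point check: x, mod and tolerance share one integer unit. */
bool is_multiple_fixed(int64_t x, int64_t mod, int64_t tolerance)
{
    if (mod <= 0) {
        return false;            /* this sketch only handles a positive mod */
    }
    int64_t r = x % mod;         /* in C99 the sign of r follows x */
    if (r < 0) {
        r += mod;                /* normalise to 0 <= r < mod */
    }
    return r <= tolerance || mod - r <= tolerance;
}
```

With a tolerance of 0 the test is exact, which is something the floating-point versions cannot offer.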

Tometzky 2010-06-22 12:51:49
