ansaurus

Question

Answer 1

+13 A:

This avoids looping:

int tmp = a % b;
if (tmp < 0) tmp += b;

Notice that both a and b need to be signed.

Tronic 2010-02-24 03:39:01

+1, the result of x = a % b, must be |x|<b.

Ninefingers 2010-02-24 03:42:53

`int tmp = a % b; return tmp + b * (tmp < 0);` Is faster.

LiraNuna 2010-02-24 03:48:21

There should be no difference when you use an optimizing compiler (but sometimes compilers miss things).

Tronic 2010-02-24 03:49:35

I'm testing this live with GCC 4.4.1 with -O3, there IS a difference. Don't give compilers too much credit, else I won't be here asking this question :)

LiraNuna 2010-02-24 03:50:46

@LiraNura: What processor? Can you give an assembly dump?

Anon. 2010-02-24 03:54:15

you might even try `return a % b + b * (a < 0);` since b is guaranteed to be unsigned (untested and unprofiled)

cobbal 2010-02-24 03:59:36

This doesn't give the same result as the OP's original. Eg with `a = -10` and `b = 3`, the original gives 2 but this gives 0.

caf 2010-02-24 04:04:25

(-10) % 3 gives -1, add 3 and you have 2. How did you get 0 instead?

Tronic 2010-02-24 04:08:54

Using multiplication might be slower than a (single!) conditional statement. It depends on the processor. I've been developing for the arm7TDMI lately - conditional statements (Which can be done without branching) use 1 extra clock cycle each, where as multiplication uses 3.

Wallacoloo 2010-02-24 04:10:23

`(-10) % 3U` does not give -1.

caf 2010-02-24 04:10:43

Good point, you obviously need to use signed modulus there to avoid wrapping around. Added a note about that.

Tronic 2010-02-24 04:13:49

updated version `umod(-10, 3) == 2`: `return a % (int)b + b * (a < 0);` although this will lose half of the range of b.

cobbal 2010-02-24 04:16:12

+1 for mentioning that this will lose half range of b, which will defy the purpose of declaring b _unsigned_.

legends2k 2010-02-24 06:00:22

Some anonymous coward downvoting without comment :(

Tronic 2010-02-25 07:38:28

Answer 2

+1 A:

In your original function, you could have returned after the while loop finished for negative numbers, thus skipping the mod. This is in the same spirit, replacing the loop with a multiply - although it could be made to have fewer characters...

unsigned int umod2(int a, unsigned int b)
{
    return (a < 0) ? a + ((-a/b)+1)*b : a % b;
}

Here's the loop version:

unsigned int umod2_works(int a, unsigned int b)
{
    if (a < 0)
    {
        while (a < 0)
            a += b;
        return a;
    } else {
        return a % b;
    }
}

Both have been tested to match the OP's original function.

mtrw 2010-02-24 03:56:04

Sorry, I meant to say 'replacing the loop with a *divide and* multiply.' I'm not sure which is faster, but this is at least correct.

mtrw 2010-02-24 03:57:38

Doesn't make it go faster, you're adding another branch, thus slowing the routine down. I got even distribution of numbers as input.

LiraNuna 2010-02-24 04:11:53

I'm surprised that another branch is slower than a mod. Anyway, Alok's answer looks like the nicest so far.

mtrw 2010-02-24 04:17:30

It's not another branch, if it's written properly.

Stephen Canon 2010-02-24 05:12:11

Answer 3

A:

If a and b are both much smaller than an int, then you can just add a sufficiently large multiple of b to every value before you mod.

unsigned umod(int a, unsigned b)
{
    return (unsigned)(a + (int)(b * 256)) % b;
}

Of course, this trick doesn't work if a + (b * 256) can overflow, but for a lot of the uses I can see for this code, you can be certain that it never will.

John Knoeller 2010-02-24 04:00:01

Answer 4

+1 A:

In a % b, if any of the operands is unsigned both are converted to unsigned. This means that if a is negative, you get a modulo UINT_MAX + 1 value instead of a. If UINT_MAX+1 is evenly divisible by b, then things are fine, and you can just return a % b. If not, you have do do the modulo in int type.

unsigned int umod(int a, unsigned int b)
{
    int ret;
    if (a >= 0) return a % b;
    if (b > INT_MAX) return a + b;
    ret = a % (int)b;
    if (ret < 0) ret += b;
    return ret;
}

Edit: Updated, but you should use caf's answer as that's simpler (or maybe not?!). This is here for the record.

Alok 2010-02-24 04:07:54

What if `b` is outside of the range of `int`?

caf 2010-02-24 04:13:12

Good point. In that case you want `a + b` I think.

Alok 2010-02-24 04:15:09

Answer 5

+9 A:

This should do it:

unsigned umod(int a, unsigned b)
{
    if (a < 0)
    {
        unsigned r = (-a % b);
        if (r)
            return b - r;
        else
            return 0;
    }
    else
        return a % b;
}

Tested to match original. Limitation is that a > INT_MIN on 2s complement machines.

caf 2010-02-24 04:09:33

So obvious. +1.

Alok 2010-02-24 04:16:01

How can `a` be < `INT_MIN`?

Alok 2010-02-24 04:18:49

that's a greater than sign...

Wallacoloo 2010-02-24 04:19:44

+1. Good answer even if somewhat cryptic :)

Tronic 2010-02-24 04:22:31

@wallacoloo: He changed it from `>= INT_MIN` to `> INT_MIN`. Now it's correct.

Alok 2010-02-24 04:22:46

Dammit, missed an edge case - updated to fix but it's nowhere near as clean anymore.

caf 2010-02-24 04:29:14

That's pretty weird - since for the `a >= 0` case it should be executing exactly the same instructions at least. How are you getting gprof to profile the inline functions, btw?

caf 2010-02-24 05:02:01

FWIW, in my testing (gcc 4.3.2, -O3) my version runs about 6 to 7 times faster than the loop version, with an even spread of `a` values between -1073741823 and 1073741824. With the `a` values all positive, it runs the same.

caf 2010-02-24 05:08:02

In my case, it's okay for `0 == b`, so I have changed this to be `return b - (-a % b);` for one less branch.

LiraNuna 2010-02-26 00:09:53

Answer 6

+6 A:

Using the ~ :)

unsigned umod(int a, unsigned b)
{
    if (a<0) return b-1-~a%b;
    return a%b;
}

The % has higher precedence than -

If it's ok to return b instead of 0 when -a is a multiple of b you can save some ops

unsigned umod(int a, unsigned b)
{
    if (a<0) return b - (-a % b);
    return a%b;
}

slightly golfed version :)

unsigned umod(int a, unsigned b)
{
return(a<0)?b-(-a%b):a%b;
}

Here is the resulting assembly

1    .globl umod3
2       .type   umod3, @function
3    umod3:
4    .LFB3:
5       .cfi_startproc
6       testl   %edi, %edi
7       js      .L18
8       movl    %edi, %eax
9       xorl    %edx, %edx
10      divl    %esi
11      movl    %edx, %eax
12      ret
13      .p2align 4,,10
14      .p2align 3
15   .L18:
16      movl    %edi, %eax
17      xorl    %edx, %edx
18      negl    %eax
19      divl    %esi
20      subl    %edx, %esi
21      movl    %esi, %edx
22      movl    %edx, %eax
23      ret

gnibbler 2010-02-24 04:32:14

Hm you seem very pegged on this golf think.. get it.. pegged.. ;)

Filip Ekberg 2010-02-24 15:40:18

This has less ops than caf's answer, especially if the negataion of `a` is a simple -a ;) (resulting in `return b - (-a % b);`)

LiraNuna 2010-02-25 23:19:02

Yeah that was my original answer too, before I added the extra conditional to make it precisely match the behaviour of the code in the question. By the way, it is interesting that `gcc` creates an apparently useless `movl` instruction at the end of the function - I was seeing that too (it could just do `movl %esi, %eax`). I wonder if there is some subtle architectural reason for that?

caf 2010-02-26 00:48:39

@caf, apparently the divl puts the result of the mod into edx (unless it is zero). The calling convention requires it to be shifted to eax

gnibbler 2010-02-26 01:13:28

It does, but I am talking about the last two mnemonics before `ret` (lines `21` and `22`). Those lines move the final result from `%esi` to `%edx`, then from `%edx` to `%eax` - that could be done with just one move from `%esi` to `%eax`

caf 2010-02-26 03:15:50

Answer 7

A:

Other than the while loop, Not sure whether the % operation can be optimized as a whole, but the optimization can happen on the pattern of the values for a & b.

If in those 21495808 times the operation is executed.

If the chances of passing a value for a which is less than b ( a < b ) is atleast half of the that. Adding the following statement will definetly improve the overall performance of the function.

if ( abs(a) < b ) // not specifically the abs function, can be your own implementation.
    return 0;
else
    return a%b;

If b is a power of 2 for atleast 80% of the cases, we can use bitwise operators as in

return ( abs(a) & (b-1) );

If the numbers are expected to be anything less than that, it would degrade the performance, as we need to verify whether b is power of 2 [ even after using bitwise operators for the same ] for everything.

Even the functionality to achieve abs(a) can be optimized using bitwise operators, with their own limitations, but is faster than verifying whether a is < 0.

n = (a ^ (a >> 31)) - (a >> 31); // instead of n = a < 0 ? -a : a;

There would be more such things, if you can explore.

Narendra N 2010-02-24 05:22:31

% is cheap compared to the while loop which is non-deterministic

Clifford 2010-02-24 05:49:55

@Clifford, you are right, Did I say anything converse of that?

Narendra N 2010-02-24 06:22:28

Not really, but the first sentence makes it look like you are targetting the % for optimisation rather than the loop.

Clifford 2010-02-24 12:35:38

By the time I saw this one, there were other answers handling that loop part and no answer was accepted, hence I thought I will take a look at the % part. I will change the answer mentioning it.

Narendra N 2010-02-24 14:05:48

Answer 8

+1 A:

int temp;

temp= (a > 0)? ( a % b ) :   b -( (-a) % b ) ;

code below:

int main()
{
int a;
unsigned b;
int temp;
printf("please enter an int and a unsigned number\n");
scanf("%d",&a);
scanf("%u",&b);
modulus(a,b);
temp= (a > 0)? ( a % b ) :   b -( (-a) % b ) ;
printf("\n temp is %d", temp);
return 0;
}
void modulus(int x,unsigned y)
{
int c;
if(x>0)
{
c=x%y;
printf("\n%d\n",c);}
else
{
while(x<0)
x+=y;
printf("\n%d\n",x);}
}


./a.out
please enter an int and a unsigned number
-8 3

1

 temp is 1

Vijay Sarathi 2010-02-24 05:22:47

+1 for good manners

gnibbler 2010-02-24 08:37:58

I did not understand...could you please tell the good manners i showed over here?

Vijay Sarathi 2010-02-24 10:30:59

The "please" in printfs

Nicolás 2010-02-24 16:17:05

Answer 9

+2 A:

Portable edition, still with only one division, no branching, and no multiplication:

unsigned umod(int a, unsigned b) {
    int rem = a % (int) b;
    return rem + (-(rem < 0) & b);
}

Chris Jester-Young 2010-02-24 05:28:30

How is this better than doing a divide and a conditional in C code? Also, might this need clobber args? And maybe `__asm__ __volatile__`?

asveikau 2010-02-24 05:36:19

@asveikau: No, no clobbering because of the constraints. Also, no volatile needed, as the compiler's ordering will guarantee the right results. You can do conditional in C code, if you can ensure no branching. (My portable version does that.)

Chris Jester-Young 2010-02-24 05:40:45

I was thinking you might need to clobber `cc` because of the `FLAGS` register. This has bit me sometimes while doing atomic ops with inline asm. At any rate I like the C version better.

asveikau 2010-02-24 05:52:32

surely you can #ifdef the asm version though

gnibbler 2010-02-24 08:25:04

@gnibbler: You can, but why would you? The C version generates code (under gcc) that's even better than the assembly version I wrote.

Chris Jester-Young 2010-02-24 14:34:25

oh +1 then!!!!!111

gnibbler 2010-02-24 19:22:07

Answer 10

+4 A:

Since the looping version seems to be quite fast, lets try eliminating the division :)

unsigned umod(int a, unsigned b){
    while(a>0)a-=b;
    while(a<0)a+=b;
    return a;
}

gnibbler 2010-02-24 06:12:52

That's a nice optimization for the case when `a` and `b` don't differ much. For the case when, say, `a = MAX_INT`, `b = 2` the code would be terribly slow.Anyway, +1 for non-standard approach.

Vlad 2010-02-24 08:50:16

@Vlad, Since LiraNuna says the original looping version does quite well in the real world tests, I wondered if this might be worth trying also, since we don't know the distribution of a and b

gnibbler 2010-02-24 08:57:40

simply because you don't understand division?

GregS 2010-02-25 01:15:47

@GregS, are you serious?

gnibbler 2010-02-25 02:08:55

Answer 11

+1 A:

Here's one that works over the whole range of unsigned without branching, but it uses multiplications and 2 divisions

unsigned umod(int a, unsigned b)
{
    return (a>0)*a%b+(a<0)*(b-1-~a%b);
}

gnibbler 2010-02-24 10:44:35

Answer 12

A:

My preferred solution is to mod twice. I haven't tried this in C/C++ or with unsigned, but my test cases work in Java:

((a % b) + b) % b

The benefit is no branching and simplicity. The downside is the double mod. I haven't compared performance, but my understanding is that it is branching that hurts performance these days.

JodaStephen 2010-05-12 14:24:19

I'm surprised no one else came out with it... I will have to test it out!

LiraNuna 2010-05-12 18:58:52

ansaurus

tags:

views:

answers:

Unsigned modulos: alternative approach?

related questions