
I'm representing an infinitely precise integer as an array of unsigned ints for processing on a GPU. For debugging purposes I'd like to print the base 10 representation of one of these numbers, but am having difficulty wrapping my head around it. Here's what I'd like to do:

//the number 4*(2^32)^2+5*(2^32)^1+6*(2^32)^0
unsigned int aNumber[3] = {4,5,6};
char base10TextRepresentation[50];
convertBase2To32ToBase10Text(aNumber,base10TextRepresentation);

Any suggestions on how to approach this problem?

Edit: Here's a complete implementation thanks to drhirsch

#include <string.h>
#include <stdio.h>
#include <stdint.h>

#define SIZE 4

uint32_t divideBy10(uint32_t * number) {
  uint32_t r = 0;
  uint32_t d;
  for (int i=0; i<SIZE; ++i) {
    /* number[i] + r*2^32 is evaluated in 64-bit arithmetic, because the
       constant 0x100000000 does not fit in 32 bits; r is always < 10,
       so both the quotient d and the remainder r fit back into 32 bits. */
    d = (number[i] + r*0x100000000) / 10;
    r = (number[i] + r*0x100000000) % 10;
    number[i] = d;
  }
  return r;
}

int zero(uint32_t* number) {
  for (int i=0; i<SIZE; ++i) {
    if (number[i] != 0) {
      return 0;
    }
  }
  return 1;
}

void swap(char *a, char *b) {
  char tmp = *a;
  *a = *b;
  *b = tmp;
}

void reverse(char *str) {
  int x = strlen(str);
  for (int y = 0; y < x/2; y++) {
    swap(&str[y],&str[x-y-1]);
  }
}

void convertTo10Text(uint32_t* number, char* buf) {
  int n = 0;
  do {
    int digit = divideBy10(number);
    buf[n++] = digit + '0';
  } while(!zero(number));
  buf[n] = '\0';
  reverse(buf);
}

int main(int argc, char** argv) {
  uint32_t aNumber[SIZE] = {0,0xFFFFFFFF,0xFFFFFFFF,0xFFFFFFFF};  /* 2^96 - 1 */
  uint32_t bNumber[SIZE] = {1,0,0,0};                             /* 2^96 */

  char base10TextRepresentation[50];

  convertTo10Text(aNumber, base10TextRepresentation);
  printf("%s\n",base10TextRepresentation);
  convertTo10Text(bNumber, base10TextRepresentation);
  printf("%s\n",base10TextRepresentation);
}
+1  A: 

If you are using .NET, take a look at this implementation of a BigInteger class.

Konamiman
Thanks, this looks good. I'll check it in detail.
Rich
+3  A: 

If you have access to 64-bit arithmetic, it is easier. I would do something along the lines of:

uint32_t divideBy10(uint32_t* number) {
    uint32_t r = 0;
    uint32_t d;
    for (int i=0; i<SIZE; ++i) {
        d = (number[i] + r*0x100000000) / 10;
        r = (number[i] + r*0x100000000) % 10;
        number[i] = d;
    }
    return r;
}

void convertTo10Text(uint32_t* number, char* buf) {
    char* start = buf;
    do {
        uint32_t digit = divideBy10(number);
        *buf++ = digit + '0';
    } while (!isEqual(number, zero));
    *buf = '\0';
    reverse(start);
}

isEqual() and reverse() left to be implemented. divideBy10 divides by 10 and returns the remainder.
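
For completeness, a minimal sketch of those two helpers is below, assuming zero is a caller-supplied SIZE-word array of all zeros (the complete listing in the question's edit does the same job with its zero() and reverse()):

#include <string.h>  /* strlen */

/* Sketch only: word-by-word comparison of two SIZE-word numbers. */
int isEqual(uint32_t* a, uint32_t* b) {
    for (int i = 0; i < SIZE; ++i) {
        if (a[i] != b[i]) return 0;
    }
    return 1;
}

/* Sketch only: reverse the digit string in place. */
void reverse(char* str) {
    int len = strlen(str);
    for (int i = 0; i < len/2; ++i) {
        char tmp = str[i];
        str[i] = str[len-1-i];
        str[len-1-i] = tmp;
    }
}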

drhirsch
Thanks, this worked great!
Rich
+1  A: 

Fundamentally you need classic decimal printing: produce digits by repeatedly dividing your number (in its base 2^32 representation) by ten and using the remainders as the digits. You may not have a divide routine at all, let alone one for dividing by 10, which is probably the key source of your problem.
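
To illustrate the idea, here is a minimal sketch of the per-word division step (not anyone's posted code; it assumes 64-bit arithmetic via uint64_t). Each full pass over the words, most significant first, divides the whole base 2^32 number by 10 and leaves one decimal digit in the carry:

#include <stdint.h>

/* One step of schoolbook short division in base 2^32: divide the two-word
   value carry:word by 10, return the quotient word, and carry the
   remainder (0..9) into the next, less significant, word. */
static uint32_t divmod10_step(uint32_t word, uint32_t *carry) {
  uint64_t t = ((uint64_t)*carry << 32) | word;
  *carry = (uint32_t)(t % 10);
  return (uint32_t)(t / 10);
}

Repeating the full pass until the number reaches zero yields the decimal digits from least to most significant, which is exactly what the listings above do.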

If you are working in C or C++, you can get a complete infinite precision arithmetic package from GMP, the GNU multiple precision arithmetic library. Most other widely used languages have similar packages available.
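
For instance, a minimal sketch using GMP for exactly the debugging task in the question (assuming, as there, that the most significant word comes first): mpz_import loads the base 2^32 words and gmp_printf prints the decimal form.

#include <stdint.h>
#include <gmp.h>

int main(void) {
  /* the number 4*(2^32)^2 + 5*(2^32)^1 + 6*(2^32)^0, most significant word first */
  uint32_t aNumber[3] = {4, 5, 6};

  mpz_t z;
  mpz_init(z);
  /* 3 words, order 1 = most significant word first, 4-byte words,
     native endianness within each word, 0 nail bits */
  mpz_import(z, 3, 1, sizeof(uint32_t), 0, 0, aNumber);
  gmp_printf("%Zd\n", z);   /* prints the base 10 representation */
  mpz_clear(z);
  return 0;
}

Compile and link with -lgmp.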

Of course, if you have too much free time, you can always implement multiprecision division yourself. You're already borrowing terminology from Knuth; he also supplies the multiprecision algorithms in Seminumerical Algorithms.

Ira Baxter
That is the problem. Unfortunately I do have to implement all the multi-precision stuff myself as I want to offload it to the GPU. Implementing division seems like a good next step.
Rich
Involving the GPU in infinite precision division just to print numbers is a poor use of the GPU; how many numbers is it going to need to print in decimal per second under normal circumstances? Best to leave this to conventional software.
Ira Baxter
You imply you are implementing infinite precision arithmetic on the GPU. Assuming it isn't just for the decimal conversion, why are you doing this? What does the GPU bring to that problem, and why do you expect it to scale well?
Ira Baxter
Sorry, that's not what I meant to imply. Printing the number is only for debugging off the GPU. On the GPU I am implementing a primality tester for arbitrary precision integers, which is an embarrassingly parallel problem.
Rich
Aha. So are you testing one number at a time in parallel, or lots of numbers in parallel? I'm not a GPU expert, but I didn't think they handled operations with widely varying execution times well, because you effectively have a barrier synch at the "end" of the current set of operations, and Amdahl's law means inefficient speedups in such a case. I'd expect infinite-precision arithmetic ops to have wide variance in timing for multiply/divide/modulo. What am I missing?
Ira Baxter
I'm checking one number at a time, with ranges of possible divisors in parallel. In preliminary tests with 64-bit integers I was getting an ~11x speed-up compared to a single CPU core, so I think it's promising. I was able to program the GPU kernel so it didn't need a synch on the 64-bit side. You're right though, that's not always an option.
Rich
A: 

How about using long doubles? On x86 that gives you an 80-bit extended format with a 64-bit mantissa, but I guess the accuracy is lost when using floating point numbers.
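
For integers of this size it is: a 64-bit mantissa cannot represent every integer above 2^64 exactly. A quick sketch (assuming x87 extended precision; on a platform where long double is IEEE quad this prints 0 instead):

#include <math.h>
#include <stdio.h>

int main(void) {
  long double x = ldexpl(1.0L, 80);  /* 2^80, exactly representable (a power of two) */
  long double y = x + 1.0L;          /* 2^80 + 1 would need 81 significant bits */
  printf("%d\n", x == y);            /* prints 1: the +1 was rounded away */
  return 0;
}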

Fred