views:

74

answers:

3

Hello all,

I've got a bit of a problem. In order to grow in my knowledge of C, I've decided to try to implement a basic bigint library.

The core of the bigint structure will be an array of 32 bit integers, chosen because they will fit in a register. This will allow me to do operations between digits that will overflow in a 64 bit integer (which will also fit in a register, as I'm on x86-64), and I can bitshift out each part of the result. I've implemented basic addition, and to test that it is working, I have to print the array. For my own testing purposes, it's fine if I use printf() and output each digit in hex. I can read that just fine.

However, most people can't read hex. As the number is stored in (essentially) base 2^32, printing is a bit of a problem. What would be a good way to convert to base 10?

EDIT:

This deals not with knowing how to convert from base to base, but about a good way to implement this. I was thinking along the lines of making another bigint with another base with conversion for printing.

A: 

The normal way of repeatedly dividing by 10 is obviously going to be painfully slow.

An obvious quick way is to have precomputed arrays of bigints corresponding to the value of each digit in each position. You can then do binary search and relatively cheap comparisons/subtractions to find the ms digit and then each digit in turn.

You could revert to division by 10 when you get down to the last 32 (or 64) bits.

Dipstick
You're proposing to cache each bigint? There are bazillions of them.
recursive
@recursive- not every possible bigint. Just every possible bigint where all digits are zero except for one. So, if the largest bigint would result in a 28-digit number in base 10, you would be caching 280 values.
bta
@crhisharris: For most typical bigint uses, the cost of repeatedly dividing by 10 is trivial compared with more complex operations such as multiplying two bignums (or worse, multiplying a matrix of them), or testing them for primality.
Gilles
I think I understand your method but I'm not sure it would be faster then dividing by 10 at each step. Remember, a comparison of two bigints that occupy k 32-bit word takes k steps.
GregS
@GregS - comparison doesn't take k steps - you are looking to see if a number is less than another so you can stop on the first step where they are different. Obviously subtraction needs k steps but this is only done once per non-zero digit.
Dipstick
+4  A: 

First of all, you can't do I/O in a sensible way without the basic operations(e.g. division and modulus). To provide efficient implementation of converting the bigint to base-10 string, I am researching two possible optimizations:

First, you can divide by some power of ten instead of ten exactly. What that means, you will get four base-10 digits every time you divide the number by 10000 for example.

Second, how would you choose which power of ten to divide by? 10, 100, 1000, 10000, etc...
There seems to be a good choice which is the maximum power of ten that can fit in your word(32-bit). Fortunately, you can implement division/modulus by one word much more efficiently than when it comes to two "bigint"s.

I haven't given an implementation because I am still researching the problem in my spare time because I have implemented the basic operations in my library and I/O is the next step hopefully ;)

AraK
+1, If I recall my Knuth correctly, this is the recommended method.
GregS
+1 and accepted. This is pretty much what I'm looking for. I know IO is for later, but it is a problem that needs to be solved down the line, so I figured I may as well ask.
ZachS
A: 

Dividing by the largest power of 10 that will fit in your base type is the best way to start. In your case, this would be dividing by 10^9. This code should be general purpose since you will be able to reuse it for part of your general division/modulo code.

The running time will be O(n^2) (i.e. if your number is twice as big, the conversion will talk four times longer) but it should be fast enough for moderate sized numbers.

For very large values, you will want to cache large powers of 10, say 10^1000, 10^2000, 10^4000, 10^8000, ...., and then divide by the power of 10 that is greater than or equal to 1/2 of the number you are trying to convert. Repeat this process until the numbers are small enough to convert quickly using division by 10^9. Depending on how efficient your division algorithm is, this approach may not be faster until you encounter numbers in excess of a million digits or more.

If you are writing an interactive calculator where every number will be displayed, then using base 10^9 will be faster for display (it will be O(n), i.e. if your number is twice as big, the conversion will only take twice as long).

casevh