views:

574

answers:

13

The problem is to derive a formula for determining number of digits a given decimal number could have in a given base.

For example: The decimal number 100006 can be represented by 17,11,9,8,7,6,8 digits in bases 2,3,4,5,6,7,8 respectively.

Well the formula I derived so far is like this : (log10(num) /log10(base)) + 1.

in C/C++ I used this formula to compute the above given results.

long long int size = ((double)log10(num) / (double)log10(base)) + 1.0;

But sadly the formula is not giving correct answer is some cases,like these :

Number 8 in  base 2 : 1,0,0,0
Number of digits: 4
Formula returned: 3

Number 64 in  base 2 : 1,0,0,0,0,0,0
Number of digits: 7
Formula returned: 6

Number 64 in  base 4 : 1,0,0,0
Number of digits: 4
Formula returned: 3

Number 125 in  base 5 : 1,0,0,0
Number of digits: 4
Formula returned: 3

Number 128 in  base 2 : 1,0,0,0,0,0,0,0
Number of digits: 8
Formula returned: 7

Number 216 in  base 6 : 1,0,0,0
Number of digits: 4
Formula returned: 3

Number 243 in  base 3 : 1,0,0,0,0,0
Number of digits: 6
Formula returned: 5

Number 343 in  base 7 : 1,0,0,0
Number of digits: 4
Formula returned: 3

So the error is by 1 digit.I just want somebody to help me to correct the formula so that it work for every possible cases.

Edit : As per the input specification I have to deal with cases like 10000000000, i.e 10^10,I don't think log10() in either C/C++ can handle such cases ? So any other procedure/formula for this problem will be highly appreciated.

+5  A: 

Number of digits of a numeral in a given base

Daniel Daranas
+1 for the link.
nthrgeek
+3  A: 

Since your formula is correct (I just tried it), I would think that it's a rounding error in your division, causing the number to be just slightly less than the integer value it should be. So when you truncate to an integer, you lose 1. Try adding an additional 0.5 to your final value (so that truncating is actually a round operation).

Nick Lewis
Do you mean this : `size = ((((double)log10(num) / (double)log10(base))) + 1.0)+ 0.5;` ? well then it will not work.
nthrgeek
If you would just add another 0.5, you'll get different answers in other border cases:Number 15 in base 2 : 1,1,1,1(log10(15) / log10(2)) + 1.5 = 5.407.., so the answer would be 5, not 4.
catchmeifyoutry
@kigurai : `size = ceil((log10(num) / log10(base)) + 1.0);` won't work !!
nthrgeek
@catchmeifyoutry: Then that implies his formula is actually incorrect. Oops.
Nick Lewis
The correct formula is `floor(log10(num)/log10(base)+1.0)`. However, with rounding errors in your initial division, that formula doesn't necessarily work in practice. So you need to round if and only if you're within a certain epsilon of an integer, and then floor, as some divisions will correctly leave you with a non-integer answer.
Nick Lewis
@kigurai: `ceil((log(num)/log(base))+1)` doesn't work on number 15 in base 2, as catchmeifyoutry pointed out. The floor is correct (and is given implicitly in the truncation to integer), but there is a rounding error. However, it is not safe to ALWAYS round. Only round within a safe epsilon, though I don't know what that would be here.
Nick Lewis
@debanjan, yes I noticed that it does not work for n=7 and base=2.
kigurai
A: 

It may be beneficial to wrap a rounding function (e.g. + 0.5) into your code somewhere: it's quite likely that the division is producing (e.g.) 2.99989787, to which 1.0 is added, giving 3.99989787 and when that's converted to an int, it gives 3.

Al
+7  A: 

Either of the following will work:

>>> from math import *
>>> def digits(n, b=10):
...     return int(1 + floor(log(n, b))) if n else 1
...
>>> def digits(n, b=10):
...     return int(ceil(log(n + 1, b))) if n else 1
...

The first version is explained at mathpath.org. In the second version the + 1 is necessary to yield the correct answer for any number n that is the smallest number with d digits in base b. That is, those numbers which are written 10...0 in base b. Observe that input 0 must be treated as a special case.

Decimal examples:

>>> digits(1)
1
>>> digits(9)
1
>>> digits(10)
2
>>> digits(99)
2
>>> digits(100)
3

Binary:

>>> digits(1, 2)
1
>>> digits(2, 2)
2
>>> digits(3, 2)
2
>>> digits(4, 2)
3
>>> digits(1027, 2)
11

Edit: The OP states that the log solution may not work for large inputs. I don't know about that, but if so, the following code should not break down, because it uses integer arithmetic only (this time in C):

unsigned int 
digits(unsigned long long n, unsigned long long b)
{
  unsigned int d = 0;
  while (d++, n /= b);
  return d;
}

This code will probably be less efficient. And yes, it was written for maximum obscurity points. It simply uses the observation that every number has at least one digit, and that every divison by b which does not yield 0 implies the existence of an additional digit. A more readable version is the following:

unsigned int 
digits(unsigned long long n, unsigned long long b)
{
  unsigned int d = 1;
  while (n /= b) {
    d++;
  }
  return d;
}
Stephan202
This fails at larger numbers-- digits(1027, 2) for me, but it's probably implementation-dependent.
Beta
@Beta: interesting. Here `digits(1027, 2)` yields `11`, which is correct (can be tested with e.g. `len('{0:b}'.format(1027))`).
Stephan202
What is the precision of your log function?
Beta
@Beta: in Python `log` appears to accept arbitrarily large numbers (well, I just punched in a lot of digits and it kept working).
Stephan202
But how many digits does it return? I don't know a thing about Python, but I'll just bet we can find two numbers, b^p and (b^p)-1, that have the same log as far as Python is concerned.
Beta
@Beta: I cannot answer your question, but if you are right that such numbers exist, then the second solution is clearly to be preferred (since it relies solely on integer arithmetic).
Stephan202
Python automatically switches to arbitrary-precision arithmetic; `log` will always be correct. But `floor` and `ceil` are not: on my computer at least, printing out digits(n) for the first 20 powers of 10 shows mistakes. (Two 3s in the first version and two 15s in the second.) For the C version: why the trickery in the for loop with comma-separated statements and an empty body? You could move one of them out and improve clarity. Or, if clarity is not the goal, it would be shorter to write: "while(n/=b) d++;" :-)
ShreevatsaR
I haven't dissected your second solution entirely but just a rough look, seems to me that you are just computing the entire base representation of the number,that's isn't much efficient :)
nthrgeek
@ShreevatsaR: thanks, updated the answer with more obscurity ;). @Debanjan: you are right. When in doubt, just time both solutions. Regardless, you may have to choose between speed and accuracy...
Stephan202
A: 

Looks like the formula is right to me:

Number 8 in  base 2 : 1,0,0,0
Number of digits: 4
Formula returned: 3

log10(8) = 0.903089
log10(2) = 0.301029

Division => 3

+1 => 4

So it's definitely just a rounding error.

jvenema
+2  A: 

What you want is ceiling ( = smallest integer not greater than) logb (n+1), rather than what you're calculating right now, floor(1+logb(n)).

You might try:

int digits = (int) ceil( log((double)(n+1)) / log((double)base) );
Managu
Of course, this breaks down if n<=1. Presumably you could handle n=0,1 as special cases, if you wanted to be thorough.
Managu
This is not enough, as it will still yield `2` for `n = 100` and `base = 10`.
Stephan202
Very good. Fixed it. Still breaks if n=0, though.
Managu
It does **not work** if there are fast floating operations in the setting of the compiler and n=7, base=2. `log10(8)/log10(2)` is about `3` (`2.9999` or `3.000001`). and ceil(3.00001) may be 4 sometimes.
Alexey Malistov
+1  A: 

Using your formula,

log(8)/log(2) + 1 = 4

the problem is in the precision of the logarithm calculation. Using

ceil(log(n+1)/log(b))

ought to resolve that problem. This isn't quite the same as

ceil(log(n)/log(b))

because this gives the answer 3 for n=8 b=2, nor is it the same as

log(n+1)/log(b) + 1

because this gives the answer 4 for n=7 b=2 (when calculated to full precision).

I actually get some curious resulting implementing and compiling the first form with g++:

double n = double(atoi(argv[1]));
double b = double(atoi(argv[2]));
int i = int(std::log(n)/std::log(b) + 1.0);

fails (IE gives the answer 3), while,

double v = std::log(n)/std::log(b) + 1.0;
int i = int(v);

succeeds (gives the answer 4). Looking at it some more I think a third form

ceil(log(n+0.5)/log(b))

would be more stable, because it avoids the "critical" case when n (or n+1 for the second form) is an integer power of b (for integer values of n).

Adam Bowen
What's the precision of your log function?
Beta
I was using a JavaScript calculator which gives the right answer for the first form, I assume that the reason sixlettervariables gets the wrong answer is because of a problem representing the final results. In particular log(8)/log(2) is exactly 3 (2^3 = 8), so log(8)/log(2)+1 has the expected value of 4.
Adam Bowen
A: 

Floating point rounding issues.

log10(216) / log10(6) =  2.9999999999999996

But you cannot add 0.5 as suggested, because it would not work for the following

log10(1295) = log10(6) = 3.9995691928566091   //    5, 5, 5, 5
log10(1296) = log10(6) = 4.0                  // 1, 0, 0, 0, 0

Maybe using the log(value, base) function would avoid these rounding errors.

Didier Trosset
+7  A: 

There are fast floating operations in your compiler settings. You need precise floation operations. The thing is that log10(8)/log10(2) is always 3 in math. But may be your result is 2.99999, for expample. It is bad. You must add small additive, but not 0.5. It should be about .00001 or something like that.

Almost true formula:

int size = static_cast<int>((log10((double)num) / log10((double)base)) + 1.00000001);

Really true solution

You should check the result of your formula. Compexity is O(log log n) or O(log result)!

int fast_power(int base, int s)
{
    int res = 1;
    while (s) {
     if (s%2) {
      res*=base;
      s--;
     } else {
      s/=2;
      base*=base;
     }
    }
    return res;
}

int digits_size(int n, int base)
{
    int s = int(log10(1.0*n)/log10(1.0*base)) + 1;
    return fast_power(base, s) > n ? s : s+1;
}

This check is better than Brute-force test with base multiplications.

Alexey Malistov
Thanks, but what if num = 10000000000l ? I think log10() can't handle such cases, any other solution ?
nthrgeek
Added new solution
Alexey Malistov
+1,I haven't checked your solution using a compiler but I think it will work fine.I liked your approach.Although I am aware of the exponentiation at O(logn) but not aware of it's use here, So thanks :)
nthrgeek
I am accepting your solution as it solved both problems, before edit and after edit.
nthrgeek
+1  A: 
Beta
A: 

I think that the only way to get the rounding error eliminated without producing other errors is to use or implement integer logarithms.

Svante
A: 

Here is a solution in bash:

% digits() { echo $1 $2  opq | dc | sed 's/ .//g;s/.//' | wc -c; }


% digits 10000000000 42
7
mouviciel
A: 
static int numInBase(int num, int theBase)
{
   if(num == 0) return 0;
   if (num == theBase) return 1;
   return 1 + numInBase(num/theBase,theBase);
}
Lumpy