Today I came across a rather strange problem. I needed to calculate the string length of a number, so I came up with this solution:

// say the number is 1000
(int)(log(1000)/log(10)) + 1

This is based on the mathematical change-of-base formula

log_10(x) = log_n(x) / log_n(10) (explained here)

But I found out that, in C,

(int)(log(1000)/log(10)) + 1

is NOT equal to

(int) log10(1000) + 1

but it should be.

I even tried the same thing in Java with this code

(int) (Math.log(1000) / Math.log(10)) + 1
(int) Math.log10(1000) + 1

but it behaves the same wrong way.

The story continues. After executing this code

for (int i = 10; i < 10000000; i *= 10) {
   System.out.println(((int) (Math.log10(i)) + 1) + 
                " " + ((int) (Math.log(i) / Math.log(10)) + 1));
}

I get

2 2
3 3
4 3  // here second method produces wrong result for 1000
5 5
6 6
7 6  // here again

So the bug seems to occur on every power of 1000 (1000 and 1000000 in this output).

I showed this to my C teacher, and he said that it might be caused by some type conversion error during log division, but he didn't know why.

So my questions are

  • Why isn't (int) (Math.log(1000) / Math.log(10)) + 1 equal to (int) Math.log10(1000) + 1, when according to the math it should be?
  • Why is it wrong only for powers of 1000?

edit: It is not a rounding error, because

Math.floor(Math.log10(i)) + 1
Math.floor(Math.log(i) / Math.log(10)) + 1

produce the same wrong output

2 2
3 3
4 3
5 5
6 6
7 6

edit2: I have to round down, because I want to know the number of digits.

log10(999) + 1 = 3.9995654882259823
log10(1000) + 1 = 4.0

If I just round, I get the same result (4) for both, which is wrong for 999, because it has 3 digits.

+2  A: 

Updated: it's due to precision and rounding errors

Mitch Wheat
It does not matter, since it uses formula to convert from one base to another
Darth
@Darth: thx I must be tried...
Mitch Wheat
I mean tired! ...
Mitch Wheat
The offence is not serious enough to warrant a trial...
Artelius
A: 

If you want your result as an integer, you should probably round rather than just cut off the part after the decimal point.

You are probably getting something like 6.999999 and rounding it down to 6.

tliff
I actually cannot round, because with log10(100) and log10(99) the second one is slightly below 2, so rounding would give me the same result as for 100, which is wrong. I could use the floor() function to round down, though.
Darth
A: 

This sounds like rounding errors.

knittl
A: 

With (int) casting, you're cutting off the necessary decimal part. Try printing them as doubles without casting (why are you casting anyway?), and you'll be fine.

Michael Foukarakis
see last edit for explanation
Darth
+22  A: 

You provided the code snippet

for (int i = 10; i < 10000000; i *= 10) {
   System.out.println(((int) (Math.log10(i)) + 1) + 
                " " + ((int) (Math.log(i) / Math.log(10)) + 1));
}

to illustrate your question. Just remove the casts to int and run the loop again. You will receive

2.0 2.0
3.0 3.0
4.0 3.9999999999999996
5.0 5.0
6.0 6.0
7.0 6.999999999999999

which immediately answers your question. As tliff already argued, the casts cut off the decimals instead of rounding properly.

EDIT: You updated your question to use floor(), but, like casting, floor() rounds down and therefore drops the decimals!

janko
If I don't cast down and execute Math.log10(9) + 1, I get something like 1.954, which should count as 1 digit, because 2 would be for Math.log10(10) + 1
Darth
see last edit for better explanation
Darth
gs: Err, no, 4 has an exact floating point representation. The error arises from inexact values prior to the division.
caf
A: 

Print out the intermediate results, i.e. log(1000), log(10), log(1000)/log(10) and log10(1000). This should give better hints than guessing.
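
As a minimal sketch of that suggestion, the Java snippet below prints each intermediate value for 1000. The class name LogIntermediates and the helper quotient are made up for illustration. On the JVM used in the question the quotient came out just below 3, which is why truncating it gives 2.

```java
final class LogIntermediates {
    // Hypothetical helper: base-10 logarithm via the change-of-base quotient.
    static double quotient(double x) {
        return Math.log(x) / Math.log(10);
    }

    public static void main(String[] args) {
        double x = 1000;
        // Natural logs of the numerator and denominator, each already rounded to a double.
        System.out.println("Math.log(1000)   = " + Math.log(x));
        System.out.println("Math.log(10)     = " + Math.log(10));
        // The quotient of two rounded values; compare it with 3.0.
        System.out.println("quotient         = " + quotient(x));
        // The dedicated base-10 function.
        System.out.println("Math.log10(1000) = " + Math.log10(x));
        // What truncation does to each of the two results.
        System.out.println("(int) quotient   = " + (int) quotient(x));
        System.out.println("(int) log10      = " + (int) Math.log10(x));
    }
}
```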

Secure
+5  A: 

This is due to precision and rounding issues. Math.log(1000) / Math.log(10) is not precisely equal to 3.

If you need exact precision, don't use floating point arithmetic - and give up on logarithms in general. Floating point numbers are inherently fuzzy. For a precise result, use integer arithmetic.

I really suggest you don't go down this path in general, but it sounds like you're taking the logarithm of whole numbers to determine some order of magnitude. If that's the case, then (int)(Math.log(x+0.5) / Math.log(10)) will be more stable. Realize, though, that doubles have only 53 bits of precision, so somewhere around 10 to the 15th power doubles can no longer represent x+0.5 exactly, and this trick won't work then.
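
One way to read "use integer arithmetic" for the digit-counting case is repeated division by 10. This is only a sketch; the class name DigitCount and the method digits are made up for illustration, and it assumes non-negative input.

```java
final class DigitCount {
    // Counts the decimal digits of a non-negative long using only integer division.
    static int digits(long n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        int count = 1;          // every number, including 0, has at least one digit
        while (n >= 10) {
            n /= 10;            // drop the last decimal digit
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Same loop bounds as in the question.
        for (long i = 10; i < 10_000_000L; i *= 10) {
            System.out.println(i + " " + digits(i));
        }
    }
}
```

No logarithm is evaluated, so there is no floating point result to truncate; the loop runs at most 18 times for a long.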

Eamon Nerbonne
+6  A: 

The log operation is a transcendental function. The best a computer can do to evaluate the result is to use an algebraic function which approximates the required operation. The accuracy of the result depends on the algorithm the computer uses (this could be the microcode in the FPU). On the Intel FPU, there are settings that affect the precision of the various transcendental functions (trig functions are also transcendental), and the FPU specifications detail the level of accuracy of the various algorithms used.

So, in addition to the rounding errors mentioned above, there is also an accuracy issue, since the computed log(x) may not be equal to the actual log(x).

Skizz
+4  A: 

Add a very small value to the numerator to bypass the accuracy issue pointed out by Skizz.

// say the number is 1000
(int)((log(1000)+1E-14)/log(10)) + 1

1E-14 should be enough to nudge the accuracy back on track.

changed the small value from 1E-15, which would give incorrect results for some inputs

I tested with 1E-14 for a random sample of unsigned long longs and all my numbers passed.
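
For comparison with the Java loop in the question, here is a rough Java translation of the same nudge. The class name NudgedDigits, the method digitsWithNudge, and the constant NUDGE are made up for illustration, and the sketch assumes IEEE 64-bit doubles and positive input. It has not been checked over the full range of long.

```java
final class NudgedDigits {
    // Small constant added to the numerator, as in the C expression above.
    static final double NUDGE = 1E-14;

    // Digit count via logarithms, with the numerator nudged upward before dividing.
    static int digitsWithNudge(long n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        return (int) ((Math.log((double) n) + NUDGE) / Math.log(10)) + 1;
    }

    public static void main(String[] args) {
        // Same loop bounds as in the question, plus the value just below each power of ten.
        for (long i = 10; i < 10_000_000L; i *= 10) {
            System.out.println((i - 1) + " " + digitsWithNudge(i - 1));
            System.out.println(i + " " + digitsWithNudge(i));
        }
    }
}
```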

pmg
1E-15 is too small, it doesn't work for "1000"... better to use 0.5 as suggested above.
Carlos Heuberger
You're right ... but 0.5 suggests you're trying to solve a rounding issue. 1E-14 (with IEEE 64-bit doubles) works for 1000 and, at least, for the few values I tested between 0 and 2^64-1.
pmg