ansaurus

Question

Answer 1

+2 A:

Take a look at Decimal from the stdlib.

from decimal import Decimal, getcontext

getcontext().prec = 320  # number of significant digits to keep

Decimal(1) / Decimal(7)

I am not posting the result here, as it is quite long.
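To tie this back to the question, here is a minimal sketch comparing a plain float product with a Decimal product. The probability list and the precision of 50 digits are made up for illustration, not taken from the original problem.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50  # significant digits; an arbitrary choice for this sketch

# Hypothetical per-feature probabilities (assumed values, not real data)
probs = [3.14e-05] * 100

float_product = 1.0
decimal_product = Decimal(1)
for p in probs:
    float_product *= p
    # Go through str() so the Decimal gets the short decimal form of the float
    decimal_product *= Decimal(str(p))

print(float_product)    # the float product underflows to 0.0
print(decimal_product)  # the Decimal product is still nonzero
```

Decimal arithmetic is noticeably slower than float arithmetic, so this is probably best suited to cases where the number of multiplications is modest.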

ikanobori 2010-09-13 21:35:55

Answer 2

+6 A:

Floating-point numbers have finite precision and range, which is why you saw the numbers turn to 0 (they underflowed). Could you multiply all the probabilities by a large scalar, so that your numbers stay in a higher range? If you're only worried about the max and not the magnitude, you don't even need to bother dividing through at the end. Alternatively, you could use an arbitrary-precision decimal, like ikanobori suggests.
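A rough sketch of the scaling idea, assuming two hypothetical classes with the same number of features. The scale factor 1e4 and the probability values are invented for the example; in practice the factor has to be tuned, since one that is too large will overflow to inf instead.

```python
SCALE = 1e4  # hypothetical constant, picked to suit the made-up values below

def scaled_product(probs, scale=SCALE):
    """Multiply the probabilities together, scaling each factor first."""
    result = 1.0
    for p in probs:
        result *= p * scale
    return result

# Assumed per-feature probabilities for two classes (same length on purpose)
class_a = [3.14e-05] * 100
class_b = [2.71e-05] * 100

scores = {"a": scaled_product(class_a), "b": scaled_product(class_b)}
best = max(scores, key=scores.get)
print(best, scores)
```

Because every class is scaled by the same factor the same number of times, the ordering of the scores is unchanged, which is all that matters if you only want the maximum.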

I82Much 2010-09-13 21:39:09

Answer 3

+14 A:

Would it be possible to do your work in a logarithmic space? (For example, instead of storing 1e-320, just store -320, and use addition instead of multiplication)
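As a small illustration of what that could look like, this sketch keeps base-10 logarithms and adds them. The probability values are assumed for the example.

```python
import math

# Hypothetical probabilities; their plain float product would underflow to 0.0
probs = [3.14e-05] * 100

# Multiplying probabilities corresponds to adding their logarithms
log10_product = sum(math.log10(p) for p in probs)

# If a human-readable form is needed, split into mantissa and exponent
exponent = math.floor(log10_product)
mantissa = 10 ** (log10_product - exponent)

print(log10_product)
print(f"{mantissa:.3f}e{exponent}")
```

The product itself is never materialised as a float, so there is nothing to underflow; only the sum of logarithms is stored.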

recursive 2010-09-13 21:43:01

Hey! Your solution seems great. It's very straightforward and seems quite easy to try. Thanks! I will try it.

Pavel 2010-09-14 02:11:21

Answer 4

+6 A:

What you describe is a standard problem with the naive Bayes classifier. You can search for "naive Bayes underflow" to find the answer, or see here.

The short answer is it is standard to express all that in terms of logarithms. So rather than multiplying probabilities, you sum their logarithms.
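A minimal sketch of that for picking the most likely class, assuming made-up priors and per-feature likelihoods. The class names, the numbers, and the helper `log_score` are all hypothetical, and every probability is assumed to be strictly positive (`math.log(0)` raises an error).

```python
import math

def log_score(prior, likelihoods):
    """Return log(prior) + sum of log(likelihood) for one class."""
    return math.log(prior) + sum(math.log(p) for p in likelihoods)

# Hypothetical model: class name -> (prior, per-feature likelihoods)
classes = {
    "spam": (0.4, [3.14e-05] * 100),
    "ham": (0.6, [2.71e-05] * 100),
}

log_scores = {
    name: log_score(prior, likelihoods)
    for name, (prior, likelihoods) in classes.items()
}

# log is monotonic, so the class with the largest log score
# is also the class with the largest probability product
predicted = max(log_scores, key=log_scores.get)
print(predicted, log_scores)
```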

You might want to look at other algorithms as well for classification.

Muhammad Alkarouri 2010-09-13 21:50:57

erm, you sum their logarithms, not their algorithms.

Adriano Varoli Piazza 2010-09-13 21:53:40

@Adriano: oops! @recursive: thanks!

Muhammad Alkarouri 2010-09-13 22:00:32

Hey! Thanks a lot for the answer, I will look into that, as it addresses my problem exactly. I was thinking that it should be common, since I am multiplying probabilities like 3.14e-05 multiple times, so they reach e-300 levels (for example) pretty fast, even more so when I have a lot of features in my classifier.

Pavel 2010-09-14 02:03:41

Yeah, as recursive also mentioned, this is tackled by taking the logarithms of the probabilities and adding them. It's all explained in the link provided by Muhammad. Thanks everyone for your answers!

Pavel 2010-09-14 02:31:49


In Python small floats tending to zero
