ansaurus

Question

Arithmetic precision with doubles in Matlab

Answer 1

+5 A:

64-bit IEEE-754 floating-point numbers have enough precision (a 53-bit mantissa) to represent about 16 significant decimal digits. In your example, telling (1+a) = 1.00...000122 apart from 1.000 would take roughly 45 significant decimal digits.
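The digit count can be checked directly. The following is a minimal sketch, not part of the original answer; the variable names are made up for illustration, and `a` and `b` are the values from the question.

```matlab
a = 1.22e-45;
b = 1;

% A 53-bit significand corresponds to about 16 decimal digits.
decimalDigits = 53 * log10(2);     % roughly 15.95

% A perturbation above the ~1e-16 relative threshold survives the addition...
survives = (b + 1e-15) ~= b;       % true

% ...while anything well below it is rounded away.
lostSmall = (b + 1e-17) == b;      % true
lostA     = (b + a) == b;          % true
```

The comparison with `1e-17` shows that the loss has nothing to do with `a` being exotic: any addend much smaller than about `1e-16` times the larger operand disappears.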

Jim Lewis 2010-09-13 00:52:32

I think the mantissa has 52 bits (11 bits for exponent, plus 1 bit for the sign making a total of 64-bit). An excellent article by *Cleve Moler* (author of the 1st version of MATLAB) explains all the details of floating point numbers: [PDF link] http://www.mathworks.com/company/newsletters/news_notes/pdf/Fall96Cleve.pdf

Amro 2010-09-13 01:25:16

@Amro: There's an implied leading "1" bit, unless the number is denormalized. So Jim's right, in most cases (and certainly for these numbers).

Drew Hall 2010-09-13 01:51:42

Thank you. I still have a ways to go in understanding number representations in computers.

Planeman 2010-09-13 02:00:48

@Drew Hall: you're absolutely right, the normalized representation has the form `±(1+f)*2^e`... my bad :)
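One way to see the implied bit for yourself is to look at the raw bit pattern. This is only an illustrative sketch (the variable names are hypothetical); it relies on `num2hex`, which prints the 64 bits of a double as 16 hex digits.

```matlab
% Layout of the 16 hex digits: the first 3 hold sign + exponent (12 bits),
% the remaining 13 hold the stored fraction f (52 bits).
h1 = num2hex(1);             % '3ff0000000000000': f = 0, the leading 1 is not stored
h2 = num2hex(1 + eps(1));    % '3ff0000000000001': lowest fraction bit set

% The implied bit gives 53 bits of precision in total:
% 2^52 + 1 is still exact, but 2^53 + 1 rounds back to 2^53.
exactAt52 = (2^52 + 1) ~= 2^52;   % true
lostAt53  = (2^53 + 1) == 2^53;   % true
```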

Amro 2010-09-13 02:12:26

@Amro--easy to forget--after all, it's not really there! :) (Plus, in my answer, I was remembering it as 48 bits for some reason. The old brain's not what it used to be...:))

Drew Hall 2010-09-13 02:25:14

Answer 2

+4 A:

"Floating" point means just that--the precision is relative to the scale of the number itself.

In the specific example you gave, 1.22e-45 can be represented on its own because the exponent can be adjusted to a scale of 10^-45, or approximately 2^-150.

On the other hand, 1.0 is represented in binary with scale 2^0 (i.e., 1).

To add these two values, you need to align their decimal points (er...binary points), meaning that all of the precision of 1.22e-45 is shifted 150-odd bits to the right.

Of course, IEEE double precision floating point values only have 53 bits of mantissa (precision), meaning that at the scale of 1.0, 1.22e-45 is effectively zero.
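The size of that shift can be measured with the two-output form of `log2`, which splits a number into a fraction and a power of two. This is a rough sketch with made-up variable names, using the values from the question.

```matlab
a = 1.22e-45;
b = 1;

% [f, e] = log2(x) returns f and e such that x = f * 2^e with 0.5 <= f < 1.
[fa, ea] = log2(a);
[fb, eb] = log2(b);

shift = eb - ea;                % bit positions a must move right to line up with b
fits  = shift < 53;             % false: a lands entirely outside b's 53-bit window
sumUnchanged = (a + b) == b;    % true
```

Because `shift` is far larger than the 53 bits available, none of the bits of `a` overlap with the bits of `b`, and the sum rounds back to `b`.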

Drew Hall 2010-09-13 00:54:30

Thank you very much!!

Planeman 2010-09-13 01:59:37

Answer 3

+2 A:

To add to what the other answers have said, you can use the MATLAB function EPS to visualize the precision issue you are running into. For a given double-precision floating-point number, the function EPS returns the distance from it to the next larger representable floating-point number:

>> a = 1.22e-45;
>> b = 1;
>> eps(b)

ans =

  2.2204e-016

Note that the next floating-point number larger than 1 is 1.00000000000000022204..., and the value of a doesn't even come close to half the distance between the two numbers. Hence a+b is rounded back to 1.
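The "half the distance" threshold can be probed directly. This is a small sketch assuming the default round-to-nearest behavior of IEEE double arithmetic; the variable names are illustrative only.

```matlab
b = 1;
halfGap = eps(b) / 2;                            % half the gap between 1 and the next double

tieStays   = (b + halfGap) == b;                 % true: an exact tie rounds to the even neighbor, 1
aboveMoves = (b + 0.75*eps(b)) == (b + eps(b));  % true: past the halfway point, the sum rounds up
aStays     = (b + 1.22e-45) == b;                % true: a is far below half the gap
```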

Incidentally, you can also see why a is considered non-zero even though it is so small by looking at the smallest positive normalized double-precision floating-point value, which the function REALMIN returns:

>> realmin

ans =

  2.2251e-308  %# MUCH smaller than a!
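For completeness, here is a short sketch of how `a` compares to the bottom of the double-precision range. It is not from the original answer, the variable names are hypothetical, and it assumes denormal (subnormal) numbers are not being flushed to zero.

```matlab
a = 1.22e-45;

aIsNormal = a > realmin;         % true: a is an ordinary normalized double
tiny      = eps(0);              % smallest positive denormal, about 4.9407e-324
belowMin  = (realmin / 2) > 0;   % true: values below realmin still exist as denormals
underflow = (tiny / 2) == 0;     % true: below the smallest denormal the result rounds to 0
```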

gnovice 2010-09-13 14:16:50
