Hi,

I'm puzzled by some behaviour I'm seeing when copying a float array member into another variable - please help!

For example:

data_entry[1] = 9.6850069951

new_value = data_entry[1]

# print both

9.6850069951

9.6850663300

I'm aware of the problem of binary storage of floats, but I thought that a direct copy of memory would give us the same value.

Any ideas? I need better precision than this! Thanks in advance, Stuart

A: 

I couldn't reproduce this with Python 2.6.2 on Linux:

>>> data_entry = [1, 2]
>>> data_entry[1] = 9.6850069951
>>> new_value = data_entry[1]
>>> print data_entry[1]
--> print(data_entry[1])
9.6850069951
>>> print new_value
--> print(new_value)
9.6850069951

One option would be to switch to using Decimal objects:

>>> from decimal import Decimal
>>> data_entry[1] = Decimal('9.6850069951')
>>> new_value = data_entry[1]
>>> print data_entry[1]
--> print(data_entry[1])
9.6850069951
>>> print new_value
--> print(new_value)
9.6850069951

If you're losing precision somehow this might help.

samtregar
@samtregar How would the use of the 'decimal' module affect memory use? (considering I use very large lists or arrays?)
Morlock
@Morlock, using `decimal.Decimal` is more memory intensive and much slower than using `float`. For the class of problems `float` is good for (representing things like physical measurements), it is almost always the right choice. `decimal.Decimal`'s main use is representing money and performing calculations involving money with the right precision and rounding rules. Like `float`, it has representation and roundoff errors, though the precision can be modified to be extremely high. I have yet to see real software where `Decimal` was chosen over `float` because the latter was not precise enough.
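As a rough illustration of the memory and roundoff points above, the sketch below compares per-object sizes on CPython and shows that `Decimal` also rounds. The exact byte counts vary by interpreter and platform, and the variable names are chosen only for this example.

```python
import sys
from array import array
from decimal import Decimal

as_float = 9.6850069951
as_decimal = Decimal('9.6850069951')

# Per-object size in bytes as reported by the interpreter (platform dependent).
float_size = sys.getsizeof(as_float)
decimal_size = sys.getsizeof(as_decimal)

# A typed array stores raw 8-byte doubles with no per-element object overhead.
packed = array('d', [as_float] * 1000)

print(float_size, decimal_size, packed.itemsize)

# Decimal has roundoff too: 1/3 is cut off at the context precision
# (28 significant digits by default), so multiplying back does not give 1.
third = Decimal(1) / Decimal(3)
print(third * 3 == Decimal(1))
```

For very large collections of measurements, a typed array of doubles is usually the most compact of the three options shown.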
Mike Graham
@samtregar Thanks, very clear.
Morlock
A: 

You've left some code out.

>>> data_entry=[0,0]
>>> data_entry[1] = 9.6850069951
>>> 
>>> new_value = data_entry[1]
>>> print data_entry
[0, 9.6850069951000002]
>>> print new_value
9.6850069951
>>> print data_entry[1]
9.6850069951

The repr and the str of this floating-point number are producing different results. My guess is that the code you posted omitted mentioning this difference.
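The difference can be reproduced on any Python version with explicit format specifications. This is a minimal sketch, assuming the Python 2.6 behaviour where `str` of a float used 12 significant digits and `repr` used 17; in recent Python versions both return the shortest string that round-trips, so they agree.

```python
x = 9.6850069951

# 12 significant digits: what str() of a float produced in Python 2.
short_form = format(x, '.12g')

# 17 significant digits: what repr() of a float produced in Python 2.6.
long_form = format(x, '.17g')

print(short_form)
print(long_form)

# Both strings describe the same stored double, so both convert back to x.
print(float(short_form) == x, float(long_form) == x)
```

The stored value is identical in both cases; only the number of digits displayed changes.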

S.Lott
+4  A: 

After an assignment, the variable new_value is not a copy of the float; it's just another reference to the exact same object. Therefore it cannot possibly have a different printed representation. So there's definitely some detail omitted in the original question.

Stuart - can you please try the following and post the result, or tell us how your actual code varies? Note below that `new_value is data_entry[1]` evaluates to True, i.e. both names refer to the same object.

>>> data_entry = [0,0]
>>> data_entry[1] = 9.6850069951
>>> new_value = data_entry[1]
>>> new_value is data_entry[1]
True
>>> print data_entry[1], new_value
9.6850069951 9.6850069951
joefis
+3  A: 

If you're really using the array module (or numpy's arrays) the precision loss is easy to explain, e.g.:

>>> dataentry = array.array('f', [9.6850069951])
>>> dataentry[0]
9.6850070953369141

Here, the 'f' first argument to array.array says we're using 32-bit floats, so only about 7 significant digits "survive". But it's easy to use 64-bit floats (once upon a time those were known as "double precision"!-):

>>> dataentry = array.array('d', [9.6850069951])
>>> dataentry[0]
9.6850069951000002

As you see, this way more than a dozen significant digits "survive" (you can typically rely on about 14+, unless you do arithmetic "oops"s such as taking the difference of numbers very close to each other, which of course devours your precision;-).
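A small sketch of both effects, using only the standard library: the rounding error introduced by each typecode, and the loss of precision when subtracting nearly equal numbers. The value `1e-15` is just an illustrative choice for the cancellation case.

```python
from array import array

x = 9.6850069951

# Store the same value as a 32-bit and as a 64-bit float, then read it back.
single = array('f', [x])[0]
double = array('d', [x])[0]

single_error = abs(single - x)
double_error = abs(double - x)

print('%.16f' % single, single_error)
print('%.16f' % double, double_error)

# Cancellation: the stored sum 1.0 + 1e-15 keeps only a few bits of the
# small term, so subtracting 1.0 does not give back exactly 1e-15.
a = 1.0 + 1e-15
diff = a - 1.0
relative_error = abs(diff - 1e-15) / 1e-15

print(diff, relative_error)
```

The 'd' round trip returns the original Python float unchanged, because Python floats are already 64-bit doubles.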

Alex Martelli
Thanks to all for your comments and advice. Using Alex's suggestion, I seem to have solved the problem by using an array.array('d',x) expression, so it seems the original float array did not have enough precision. I'll post some more code in the next comment as there's no space here.
SJA
old_code:data = []for data_entry in data: if (data_entry[1] != 0): value = data_entry[1] modlog(logging.INFO,'raw value = %.12f',data_entry[1]) modlog(logging.INFO,'value_in = %.12f', value)output::INFO:raw value = 2.334650748292:INFO:value_in = 2.334685585881new code:data = array.array('d') if (data[index] != 0): test_data = data[index] modlog(logging.INFO,'raw data = %.12f', data[(index)]) modlog(logging.INFO,'test_data = %.12f', test_data)output::INFO:raw data = 2.333840588874:INFO:test_data= 2.333840588874
SJA
@SJA, code in comments is totally unreadable. Please edit your question instead, to add the code in question, so it can be properly formatted as code.
Alex Martelli
@Alex, I noticed the problem and replied to my own question 2 posts below so I could format the code. I'll know the next time to just edit the original question.
SJA
A: 

Here's the code again, properly formatted:

old code:
data = []
for data_entry in data:
    if (data_entry[1] != 0):
        value = data_entry[1]
        modlog(logging.INFO,'raw value = %.12f',data_entry[1])
        modlog(logging.INFO,'value_in = %.12f', value)
output:
:INFO:raw value = 2.334650748292
:INFO:value_in  = 2.334685585881

new code:
data = array.array('d') 
if (data[index] != 0):
    test_data = data[index]
    modlog(logging.INFO,'raw data = %.12f', data[(index)])
    modlog(logging.INFO,'test_data = %.12f', test_data)
output:
:INFO:raw data = 2.333840588874
:INFO:test_data= 2.333840588874
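
For anyone who wants to try the new version, here is a self-contained sketch. It is not the actual program: the sample readings are hypothetical, since the real data source isn't shown, and the standard `logging` module stands in for `modlog`.

```python
import array
import logging

logging.basicConfig(level=logging.INFO, format=':%(levelname)s:%(message)s')

# Hypothetical sample readings; the real data source is not shown here.
data = array.array('d', [0.0, 2.333840588874, 9.6850069951])

formatted = []
for index in range(len(data)):
    if data[index] != 0:
        test_data = data[index]
        # Format the stored element and the local copy with the same spec.
        raw_text = '%.12f' % data[index]
        copy_text = '%.12f' % test_data
        logging.info('raw data  = %s', raw_text)
        logging.info('test_data = %s', copy_text)
        formatted.append((raw_text, copy_text))
```

With 64-bit storage, each element and its local copy format to the same 12 decimal places.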
SJA