views:

672

answers:

9

I have been asked to test a library provided by a 3rd party. The library is known to be accurate to n significant figures. Any less significant errors can safely be ignored. I want to write a function to help me compare the results:

def nearlyequal( a, b, sigfig=5 ):

The purpose of this function is to determine if two floating-point numbers (a and b) are approximately equal. The function will return True if a==b (exact match) or if a and b have the same value when rounded to sigfig significant-figures when written decimal.

Can anybody suggest a good implementation, I've written a mini unit-test. Unless you can see a bug in my tests then a good implementation should be pass the following:

assert nearlyequal(1, 1, 5) 
assert nearlyequal(1.0, 1.0, 5) 
assert nearlyequal(1.0, 1.0, 5) 
assert nearlyequal(-1e-9, 1e-9, 5) 
assert nearlyequal(1e9, 1e9 + 1 , 5) 
assert not nearlyequal( 1e4, 1e4 + 1, 5) 
assert nearlyequal( 0.0, 1e-15, 5 ) 
assert not nearlyequal( 0.0, 1e-4, 6 )

Additional notes:

  1. Values a and b might be of type int, float or numpy.float64. Values a and b will always be of the same type. It's vital that conversion does not introduce additional error into the function.

  2. Lets keep this numerical, so functions that convert to strings or use non-mathematical tricks are not ideal. This program will be audited by somebody who is a mathematician who will want to be able to prove that the function does what it is supposed to do.

  3. Speed... I've got to compare a lot of numbers so the faster the better.

  4. I've got numpy, scypy and the standard-library. Anything else will be hard for me to get, especially for such a small part of the project.

Thanks!

+4  A: 

Here's a take.

def nearly_equal(a,b,sig_fig=5):
    return ( a==b or 
             int(a*10**sig_fig) == int(b*10**sig_fig)
           )
Triptych
+1: Beat me by 30 seconds!
S.Lott
Haha @S.Lott - I've definitely been on the other end of that.
Triptych
you don't really need that if statement. you can just return result of comparison: it's either True or False, it's shorter and surely more pythonic
SilentGhost
will this work for the _nearlyequal((1e9,1+1e9, 5)_?
nlucaroni
good point @silentghost. Made the change.
Triptych
This fails nearlyequal(1e9, 1e9 + 1 , 5) - it does not work for big numbers.
Salim Fadhley
The issue is that "significant figures" depends on whether the number is >1 or <1. >1 you have to use sig_fig-math.log10(n) to shift "to the right". If <1 you simple use sig_fig to shift "to the left".
S.Lott
For the purposes of the problem, the numbers can be any number which can be represented by floating-point.
Salim Fadhley
+3  A: 

"Significant figures" in decimal is a matter of adjusting the decimal point and truncating to an integer.

>>> int(3.1415926 * 10**3)
3141
>>> int(1234567 * 10**-3)
1234
>>>
S.Lott
Can you provide an algorithm to do this?
Salim Fadhley
Different from the code I already posted? What more do you need to know?
S.Lott
+8  A: 

There is a function assert_approx_equal in numpy.testing (source here) which may be a good starting point.

>>> help(assert_approx_equal) Help on function assert_approx_equal in module numpy.testing.utils:

assert_approx_equal(actual, desired, significant=7, err_msg='', verbose=True)
    Raise an assertion if two items are not
    equal.  I think this should be part of unittest.py
    Approximately equal is defined as the number of significant digits
    correct
dF
Great! This will do exactly what I need. Thank you.
Salim Fadhley
+2  A: 

I believe your question is not enough well defined, and the unit-tests you present prove it:

If by 'round to N sig-fig decimal places' you mean 'N decimal places to the right of the decimal point', then the test "assert nearlyequal(1e9, 1e9 + 1 , 5)" should fail because even when you round 1000000000 and 1000000001 to 0.00001 accuracy, they are still different.

And if y 'round to N sig-fig decimal places' you mean 'The N most significant digits, regardless of the decimal point', then the test "assert nearlyequal(-1e-9, 1e-9, 5)" should fail, because 0.000000001 and -0.000000001 are totally different when viewed this way.

If you meant the first definition, then the first answer on this page (by Triptych) is good. If you meant the second definition, please say it, I promise to think about it :-)

Oren Shemesh
Actually, you are right about "assert nearlyequal(-1e-9, 1e-9, 5)" - it breaks the rules! +1
Salim Fadhley
+1  A: 

I would actually say:

def nearlyequal( a, b, sigfig=5 ):
    return round(a, sigfig) == round(b, sigfig)

as the methods above using int() always round down (ie. int(0.9) goes to 0).

EDIT: This is horribly wrong. I just wanted to make the point that int rounds down.

Mike Boers
Totaly wrong, but +1 for being honest!
Salim Fadhley
+1  A: 

This is a fairly common issue with floating point numbers. I solve it based on the discussion in Section 1.5 of Demmel[1]. (1) Calculate the roundoff error. (2) Check that the roundoff error is less than some epsilon. I haven't used python in some time and only have version 2.4.3, but I'll try to get this correct.

Step 1. Roundoff error

def roundoff_error(exact, approximate):
    return abs(approximate/exact - 1.0)

Step 2. Floating point equality

def float_equal(float1, float2, epsilon=2.0e-9):
    return (roundoff_error(float1, float2) < epsilon)

There are a couple obvious deficiencies with this code.

  1. Division by zero error if the exact value is Zero.
  2. Does not verify that the arguments are floating point values.

Revision 1.

def roundoff_error(exact, approximate):
    if (exact == 0.0 or approximate == 0.0):
        return abs(exact + approximate)
    else:
        return abs(approximate/exact - 1.0)

def float_equal(float1, float2, epsilon=2.0e-9):
    if not isinstance(float1,float):
        raise TypeError,"First argument is not a float."
    elif not isinstance(float2,float):
        raise TypeError,"Second argument is not a float."
    else:
        return (roundoff_error(float1, float2) < epsilon)

That's a little better. If either the exact or the approximate value is zero, than the error is equal to the value of the other. If something besides a floating point value is provided, a TypeError is raised.

At this point, the only difficult thing is setting the correct value for epsilon. I noticed in the documentation for version 2.6.1 that there is an epsilon attribute in sys.float_info, so I would use twice that value as the default epsilon. But the correct value depends on both your application and your algorithm.

[1] James W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.

tmh
+1  A: 

Oren Shemesh got part of the problem with the problem as stated but there's more:

assert nearlyequal( 0.0, 1e-15, 5 )

also fails the second definition (and that's the definition I learned in school.)

No matter how many digits you are looking at, 0 will not equal a not-zero. This could prove to be a headache for such tests if you have a case whose correct answer is zero.

Loren Pechtel
+1  A: 

There are already plenty of great answers, but here's a think:

def closeness(a, b):
  """Returns measure of equality (for two floats), in unit
     of decimal significant figures."""
  if a == b:
    return float("infinity")
  difference = abs(a - b)
  avg = (a + b)/2
  return math.log10( avg / difference )


if closeness(1000, 1000.1) > 3:
  print "Joy!"
TokenMacGuy
A: 

There is a interesting solution to this by B. Dawson (with C++ code) at "Comparing Floating Point Numbers". His approach relies on strict IEEE representation of two numbers and the enforced lexicographical ordering when said numbers are represented as unsigned integers.

Dan M.