ansaurus

Question

Checking if a string can be converted to float in Python

Answer 1

+15 A:

I would just use..

try:
    float(element)
except ValueError:
    print "Not a float"

..it's simple, and it works

Another option would be a regular expression:

import re
if re.match(r"^\d+?\.\d+?$", element) is None:
    print "Not float"

dbr 2009-04-09 21:55:39
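To see how the two options above differ in practice, here is a minimal sketch (written for Python 3; `is_float_try` and `is_float_simple_re` are illustrative names, not from the answer) that wraps each check in a function and runs a few sample strings through both.

```python
import re

def is_float_try(element):
    """Option 1: let float() decide."""
    try:
        float(element)
        return True
    except ValueError:
        return False

def is_float_simple_re(element):
    """Option 2: the answer's regex, which requires digits on both sides of a dot."""
    return re.match(r"^\d+?\.\d+?$", element) is not None

if __name__ == "__main__":
    # Columns: sample, result of option 1, result of option 2.
    for sample in ["12.5", "12", "-1.5", "1e5", "nan", "12.5x"]:
        print(sample, is_float_try(sample), is_float_simple_re(sample))
```

Of these samples, only "12.5" passes both checks. The regex rejects integers, signs, exponents and "nan", all of which float() accepts.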

+1 for option 1: much, much faster than fooling around with a regex (if most strings turn out to be floats).

S.Lott 2009-04-09 22:03:11

+1 for option 1, too.

lothar 2009-04-09 22:10:43

@S.Lott: Most of the strings this is applied to will turn out to be ints or floats.

Chris Upchurch 2009-04-09 22:15:12

Is there not a tryfloat(element) or otherwise equivalent function, like C#'s float.TryParse(element)? Typically excepting is not very performant. Not that this is something that will happen very often, but if it's in a tight loop, it could be an issue.

dustyburwell 2009-04-09 22:18:55
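Python has no built-in counterpart to float.TryParse. A minimal sketch of one, assuming a hypothetical helper named `try_float` that returns a (success, value) pair:

```python
def try_float(element):
    """Return (True, value) if element parses as a float, else (False, None).

    try_float is a hypothetical helper, not part of the standard library.
    """
    try:
        return True, float(element)
    except (TypeError, ValueError):
        # ValueError: malformed string; TypeError: e.g. None was passed in.
        return False, None

ok, value = try_float("3.25")
if ok:
    print(value)
```

Internally this is still the try/except approach; it only packages the result the way TryParse does.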

Otherwise, +1 for option #1 over option #2, since option #2 will not catch entries that overflow the storage of int or float.

dustyburwell 2009-04-09 22:20:37

Your regex is not optimal. "^\d+\.\d+$" will fail a match at the same speed as above, but will succeed faster. Also, a more correct way would be: "^[+-]?\d+(?:\.\d+)?$". However, that still doesn't match numbers like +1.0e-10.

John Gietzen 2009-04-09 22:25:46

@ascalonx: int() won't overflow in Python. If a number is too large to get stored as a regular int it will use the long integer representation which can grow until you hit your virtual memory limit.

Chris Upchurch 2009-04-09 22:29:14
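The overflow behaviour discussed in the comments above can be checked directly. A small sketch (Python 3, where int is always arbitrary-precision; the variable names are illustrative): neither conversion raises on very large input, so the try approach does not reject such strings either.

```python
import math

big_int = int("9" * 50)      # 50-digit integer; parses without overflow
big_float = float("1e400")   # too large for a double; becomes infinity

print(big_int > 2 ** 64)       # True
print(math.isinf(big_float))   # True
```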

Except that you forgot to name your function "will_it_float".

bvmou 2009-04-10 01:07:20

@ascalonx: "Typically excepting is not very performant" False in Python. Exceptions in Python are often the cheapest way to do things -- try it and see.

S.Lott 2009-04-10 11:19:26

Not only is the first one simple, but it's the Pythonic way of doing such a thing

Jeremy Cantrell 2009-04-10 16:58:21

Answer 2

+2 A:

This regex will check for scientific floating point numbers:

^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$

However, I believe that your best bet is to use the parser in a try.

John Gietzen 2009-04-09 22:30:52
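A quick way to see where this pattern and float() agree or disagree is to run both over the same samples. This is a sketch for Python 3; `matches_float_re` and `parses_as_float` are illustrative names.

```python
import re

# Pattern copied from the answer above.
FLOAT_RE = re.compile(
    r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$"
)

def matches_float_re(text):
    return FLOAT_RE.match(text) is not None

def parses_as_float(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

if __name__ == "__main__":
    # Columns: sample, regex result, float() result.
    for sample in ["20e2", "+1.0e-10", ".5", "1.", "12.2x", "inf", " 1.5 "]:
        print(repr(sample), matches_float_re(sample), parses_as_float(sample))
```

The two checks agree on the first five samples. They differ on "inf" and on whitespace-padded input, which float() accepts and the pattern rejects.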

Answer 3

+5 A:

If you cared about performance (and I'm not suggesting you should), the try-based approach is the clear winner over your partition-based approach and the regexp approach, as long as you don't expect a lot of invalid strings. On invalid strings it is the slowest of the three in the timings below, presumably due to the cost of exception handling.

Again, I'm not suggesting you care about performance, just giving you the data in case you're doing this 10 billion times a second, or something. Also, the partition-based code doesn't handle at least one valid string.

$ ./floatstr.py
F..
partition sad: 3.1102449894
partition happy: 2.09208488464
..
re sad: 7.76906108856
re happy: 7.09421992302
..
try sad: 12.1525540352
try happy: 1.44165301323
.
======================================================================
FAIL: test_partition (__main__.ConvertTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./floatstr.py", line 48, in test_partition
    self.failUnless(is_float_partition("20e2"))
AssertionError

----------------------------------------------------------------------
Ran 8 tests in 33.670s

FAILED (failures=1)

Here's the code (Python 2.6, regexp taken from John Gietzen's answer):

def is_float_try(text):
    # Let float() decide; a ValueError means the string is not a float.
    try:
        float(text)
        return True
    except ValueError:
        return False

import re
_float_regexp = re.compile(r"^[-+]?(?:\b[0-9]+(?:\.[0-9]*)?|\.[0-9]+\b)(?:[eE][-+]?[0-9]+\b)?$")
def is_float_re(text):
    # Returns a match object (truthy) or None (falsy).
    return re.match(_float_regexp, text)


def is_float_partition(element):
    partition = element.partition('.')
    # Accept "digits.digits", ".digits" and "digits." (a dot is required).
    if ((partition[0].isdigit() and partition[1] == '.' and partition[2].isdigit())
            or (partition[0] == '' and partition[1] == '.' and partition[2].isdigit())
            or (partition[0].isdigit() and partition[1] == '.' and partition[2] == '')):
        return True
    return False

if __name__ == '__main__':
    import unittest
    import timeit

    class ConvertTests(unittest.TestCase):
        def test_re(self):
            self.failUnless(is_float_re("20e2"))

        def test_try(self):
            self.failUnless(is_float_try("20e2"))

        def test_re_perf(self):
            print
            print 're sad:', timeit.Timer('floatstr.is_float_re("12.2x")', "import floatstr").timeit()
            print 're happy:', timeit.Timer('floatstr.is_float_re("12.2")', "import floatstr").timeit()

        def test_try_perf(self):
            print
            print 'try sad:', timeit.Timer('floatstr.is_float_try("12.2x")', "import floatstr").timeit()
            print 'try happy:', timeit.Timer('floatstr.is_float_try("12.2")', "import floatstr").timeit()

        def test_partition_perf(self):
            print
            print 'partition sad:', timeit.Timer('floatstr.is_float_partition("12.2x")', "import floatstr").timeit()
            print 'partition happy:', timeit.Timer('floatstr.is_float_partition("12.2")', "import floatstr").timeit()

        def test_partition(self):
            self.failUnless(is_float_partition("20e2"))

        def test_partition2(self):
            self.failUnless(is_float_partition(".2"))

        def test_partition3(self):
            self.failIf(is_float_partition("1234x.2"))

    unittest.main()

Jacob Gabrielson 2009-04-09 22:56:46
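The Python 2 print statements above are not valid in Python 3. For readers on Python 3, a minimal sketch of the same happy/sad timing for the try-based check might look like this. `time_check` is a hypothetical helper, and absolute numbers will differ from those reported above.

```python
import timeit

def is_float_try(text):
    try:
        float(text)
        return True
    except ValueError:
        return False

def time_check(func, sample, number=100000):
    """Return total seconds for `number` calls of func(sample)."""
    return timeit.timeit(lambda: func(sample), number=number)

if __name__ == "__main__":
    print("try sad:  ", time_check(is_float_try, "12.2x"))
    print("try happy:", time_check(is_float_try, "12.2"))
```

The lambda adds a small constant overhead to every call, so the figures are only meaningful relative to each other.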

+1: Perfect. As we all know, regex is not REALLY a parsing engine, but a searching engine.

John Gietzen 2009-04-10 13:46:25

Answer 4

+1 A:

Strictly, these regexp-style solutions should be checking the locale. Not all locales use dots as the separator.

2009-04-10 21:34:47
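Note that float() itself always expects a dot, whatever the locale. If locale-formatted input matters, one option is the standard library's locale.atof, sketched here under the assumption that the process locale has already been set as desired (`is_float_locale` is an illustrative name):

```python
import locale

def is_float_locale(text):
    """True if text parses as a float under the current LC_NUMERIC locale."""
    try:
        locale.atof(text)
        return True
    except ValueError:
        return False

# With the default "C" locale the decimal separator is a dot. After e.g.
# locale.setlocale(locale.LC_NUMERIC, "de_DE.UTF-8") (assuming that locale
# is installed on the system), "12,5" would be accepted as 12.5.
print(is_float_locale("12.5"))
```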
