views:

18067

answers:

12

What is the best possible way to check if a string can be represented as a number in Python?

The function I currently have right now is:

def is_number(s):
    try:
        float(s)
        return True
    except ValueError:
        return False

This seems clunky, but I haven't found a better method because calling float in the main function is even worse.

A: 

Using the int() method would be about the same as that I suppose. Otherwise I'd guess you'd have to loop through the characters of the string, and that is most likely slower

Sir Oddfellow
The Int method misses out on everything with a period or comma, which are still valid numbers! :)but yeah.
Daniel Goldberg
+40  A: 

Ah, here we are. Use the isdigit() function for string objects.

a = "03523"
a.isdigit()

True

b = "963spam"
b.isdigit()

False

String methods

There's also something on Unicode strings, which I'm not too familiar with Unicode - Is decimal/decimal

Sir Oddfellow
However, this won't work for Hexadecimal.
Nico
Nor does it work for numbers with decimal places like 1.2
Daniel Goldberg
That's a negative on negatives as well
intrepion
+31  A: 

"Which, not only is ugly and slow"

I'd dispute both.

A regex or other string parsing would be uglier and slower.

I'm not sure that anything much could be faster than the above. It calls the function and returns. Try/Catch doesn't introduce a much overhead because the most common exception is caught without an extensive search of stack frames.

The issue is that any numeric conversion function has two kinds of results

  • A number, if the number is valid
  • A status code (e.g., via errno) or exception to show that no valid number could be parsed.

C (as an example) hacks around this a number of ways. Python lays it out clearly and explicitly.

I think your code for doing this is perfect.

S.Lott
+5  A: 

Casting to float and catching ValueError is probably the fastest way, since float() is specifically meant for just that. Anything else that requires string parsing (regex, etc) will likely be slower due to the fact that it's not tuned for this operation. My $0.02.

codelogic
Your "2e-2" dollars are a float too (an additional argument for using float :)
ΤΖΩΤΖΙΟΥ
A: 

If you want to know if the entire string can be represented as a number you'll want to use a regexp (or maybe convert the float back to a string and compare it to the source string, but I'm guessing that's not very fast).

m_eiman
+2  A: 

Your code looks fine to me.

Perhaps you think the code is "clunky" because of using exceptions? Note that Python programmers tend to use exceptions liberally when it improves code readability, thanks to its low performance penalty.

Dubhead
+6  A: 

Which, not only is ugly and slow, seems clunky.

It may take some getting used to, but this is the pythonic way of doing it. As has been already pointed out, the alternatives are worse. But there is one other advantage of doing things this way: polymorphism.

The central idea behind duck typing is that "if it walks and talks like a duck, then it's a duck." What if you decide that you need to subclass string so that you can change how you determine if something can be converted into a float? Or what if you decide to test some other object entirely? You can do these things without having to change the above code.

Other languages solve these problems by using interfaces. I'll save the analysis of which solution is better for another thread. The point, though, is that python is decidedly on the duck typing side of the equation, and you're probably going to have to get used to syntax like this if you plan on doing much programming in Python (but that doesn't mean you have to like it of course).

One other thing you might want to take into consideration: Python is pretty fast in throwing and catching exceptions compared to a lot of other languages (30x faster than .Net for instance). Heck, the language itself even throws exceptions to communicate non-exceptional, normal program conditions (every time you use a for loop). Thus, I wouldn't worry too much about the performance aspects of this code until you notice a significant problem.

Jason Baker
A: 

Here's my simple way of doing it. Let's say that I'm looping through some strings and I want to add them to an array if they turn out to be numbers.

try:
    myvar.append( float(string_to_check) )
except:
    continue

Replace the myvar.apppend with whatever operation you want to do with the string if it turns out to be a number. The idea is to try to use a float() operation and use the returned error to determine whether or not the string is a number.

+2  A: 

Is some rare cases you might also need to check for complex numbers (e.g. 1+1j), which can not be represented by a float:

def is_number(s):
    try:
        float(s) # for int, long and float
    except ValueError:
        try:
            complex(s) # for complex
        except ValueError:
            return False

    return True
Matthew Wilcoxson
+1  A: 

There is one exception that you may want to take into account: the string 'NaN'

If you want is_number to return FALSE for 'NaN' this code will not work as Python converts it to its representation of a number that is not a number (talk about identity issues):

>>> float('NaN')
nan

Otherwise, I should actually thank you for the piece of code I now use extensively. :)

G.

Guilherme Rocha
A: 
def isNumber(token):
    for char in token:
        if not char in string.digits: return False
    return True
ThePlaz
BZZZZT. Fails on "1.23" and "1e-4"
John Machin
+1  A: 

I did some speed test. Lets say that if the string is likely to be a number the try/except strategy is the fastest possible.If the string is not likely to be a number and you are interested in Integer check, it worths to do some test (isdigit plus heading '-'). If you are interested to check float number, you have to use the try/except code whitout escape.

FxIII
Mind posting the results?
Daniel Goldberg