I have a Python function that takes a numeric argument that must be an integer in order for it to behave correctly. What is the preferred way of verifying this in Python?

My first reaction is to do something like this:

def isInteger(n):
    return int(n) == n

But I can't help thinking that this is 1) expensive 2) ugly and 3) subject to the tender mercies of machine epsilon.

Does Python provide any native means of type checking variables? Or is this considered to be a violation of the language's dynamically typed design?

EDIT: since a number of people have asked - the application in question works with IPv4 prefixes, sourcing data from flat text files. If any input is parsed into a float, that record should be viewed as malformed and ignored.

+12  A: 
isinstance(n, int)

If you need to know whether it's definitely an actual int and not a subclass of int (generally you shouldn't need to do this):

type(n) is int

this:

return int(n) == n

isn't such a good idea, as cross-type comparisons can be true - notably int(3.0)==3.0
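
To make the difference between the three checks concrete, here is a minimal sketch in Python 3 syntax. The helper name `describe` is made up for illustration. Note that `bool` is a subclass of `int`, so `isinstance(True, int)` is true as well.

```python
def describe(n):
    """Return the result of each candidate check for n (illustrative helper)."""
    return {
        "isinstance": isinstance(n, int),  # accepts int and its subclasses
        "exact_type": type(n) is int,      # accepts int only
        "roundtrip": int(n) == n,          # also true for floats like 3.0
    }

print(describe(3))
print(describe(3.0))
print(describe(True))
```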

bobince
I found the 'type' function about 5 minutes after posting the question. :) Out of interest, is there any significant performance hit in using 'isinstance' over 'type'?
Murali Suriar
If anything, I would expect isinstance to be faster. The difference shouldn't be much, though (and if you're really _concerned_ about performance, why are you using Python?)
David Zaslavsky
@David: I'm not concerned about performance for this program specifically; this is the prototype of a hobby project that may or may not be abandoned. That said, I'm interested in the difference between the two approaches, and the trade-offs involved in each. More speed is never a bad thing. :)
Murali Suriar
timeit.Timer() says type-is-int is about 4% faster than isinstance-int on my machine with Python 2.6. Pretty negligible either way; go for the one that best says what you mean.
bobince
-1: typechecking is not a good idea. You should say that in your answer.
nosklo
There are certainly occasions where type checking is appropriate. Without knowing what the OP is up to - only that it "**must**" be an integer - it is too early to condemn.
bobince
@bobince: typechecking is bad in all situations. I can't think of a situation where a typecheck would do any good.
nosklo
nosklo: variant args, variant constructors, str/unicode handling, checking exceptions, getitem... type checking is all over Python. Drop the OO dogma and live with it.
bobince
@bobince, type checking shouldn't be necessary. Instead, catch the TypeError to know that it isn't compatible.
Evan Fosmark
Evan: who says there's going to be an exception? It might run, but behave incorrectly. Without more context from the OP, we don't know whether this is or isn't an appropriate place to be type-checking.
bobince
@bobince: Type checking is *never* appropriate. It is stupid. Proof is that you couldn't come up with a situation where it should be used. None of your examples: "variant args, variant constructors, str/unicode handling, checking exceptions, getitem" uses typechecking in any way.
nosklo
All those use typechecking, which is why I mentioned them. You might have an argument if the OP were defining their own value types, but the fact is you can't put your own encapsulated actions inside 'int' and other built-in types. Still, thanks for telling me it is "stupid" - good argument!
bobince
@bobince: they don't use typechecking! variant args don't use typechecking in any way, just *args that returns a tuple. variant constructors also don't, you just make extra classmethods for that. str/unicode handling also... just treat everything as unicode. Exception check is builtin.
nosklo
@bobince: and getitem? I don't even get how that would use typechecking. And the fact that I can't put my actions inside built-in types is the *very reason* typechecking is bad!! Without typechecking I don't need to use the built-in types at all so I can have my actions *and* it still works.
nosklo
__getitem__ can receive a numeric position, arbitrary key, or slice object as its argument. This is an example of variant (not variable-length) arguments. How do you tell which you've been passed? With isinstance. dict can be instantiated from a sequence or a mapping, with different semantics... etc
bobince
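
The `__getitem__` case described in the comment above might look roughly like this. It is only a sketch in Python 3 syntax; the class `Window` is hypothetical and not taken from any library.

```python
class Window:
    """Hypothetical read-only wrapper around a list of items."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        # A slice and an integer position need different handling,
        # so the method has to find out which one it was given.
        if isinstance(key, slice):
            start, stop, step = key.indices(len(self._items))
            return Window(self._items[i] for i in range(start, stop, step))
        if isinstance(key, int):
            return self._items[key]
        raise TypeError("index must be an int or a slice, not %s"
                        % type(key).__name__)

w = Window("abcde")
print(w[1])
print(list(w[1:4]._items))
```

Note that `slice.indices` takes the sequence length as its argument.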
...and detecting the difference between str and unicode is vital in a program that doesn't want to end up with random UnicodeDecodeErrors. I'm not arguing that type-checking is great, and sure, it should be avoided when you have control of the interfaces, but completely avoiding it is not practical.
bobince
@bobince On getitem I use try: start, end, step = item.indices() except AttributeError: something(). diff between unicode and str is useless if you decode everything you receive from outside python and reencode all output so you end only with unicode internally (which is what py3.0 str is all about)
nosklo
@bobince If I write my own dict class I prefer using an alternate constructor for mappings but I could try: arg = arg.iteritems() except AttributeError: pass then for key, value in arg
nosklo
@bobince: Point is that your last sentence is False. I can always completely avoid typechecking and be happy. In fact I do that on all programs I wrote.
nosklo
A: 
if type(n) is int

This checks if n is a Python int, and only an int. It won't accept subclasses of int.

Type-checking, however, does not fit the "Python way". You would do better to use n as an int and, if it throws an exception, catch it and act upon it.

Nikhil Chelliah
-1 because it won't support inheritance.
bruno desthuilliers
-1 because typechecking is useless.
nosklo
Come on, guys. The answer was helpful re the question, although not very much. Instead of downvoting, fix it, because if many people agree that type-checking is not very Pythonic, the correct message might come through more easily.
ΤΖΩΤΖΙΟΥ
@ΤΖΩΤΖΙΟΥ - Partially agree
David
+1  A: 

Don't type check. The whole point of duck typing is that you shouldn't have to. For instance, what if someone did something like this:

class MyInt(int):
    # ... extra stuff ...
    pass
Evan Fosmark
If you use isinstance() you can cover that case. Some functions really need an integer and duck typing is just going to hide a possible bug.
Nick
+1: Not typechecking is way better. @Nick: can you give an example of such a function? Why can't I pass a float to a function that needs an integer? One should use int() on the value instead of typechecking.
nosklo
@nosklo: if you have a function that requires an int as an argument, say because it is an array length or some other integer quantity, then using int() to squash it can hide bugs. In many cases, programs should fail early when given invalid input, not silently try to patch up the input.
Nick
+9  A: 

Yeah, as Evan said, don't type check. Just try to use the value:

def myintfunction(value):
   """ Please pass an integer """
   return 2 + value

That doesn't have a typecheck. It is much better! Let's see what happens when I try it:

>>> myintfunction(5)
7

That works, because it is an integer. Hm. Let's try some text.

>>> myintfunction('text')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in myintfunction
TypeError: unsupported operand type(s) for +: 'int' and 'str'

It shows an error, TypeError, which is what it should do anyway. If the caller wants to catch that, it can.

What would you do if you did a typecheck? Show an error, right? So you don't have to typecheck, because the error already shows up automatically.

Plus since you didn't typecheck, you have your function working with other types:

Floats:

>>> print myintfunction(2.2)
4.2

Complex numbers:

>>> print myintfunction(5j)
(2+5j)

Decimals:

>>> import decimal
>>> myintfunction(decimal.Decimal('15'))
Decimal("17")

Even completely arbitrary objects that can add numbers!

>>> class MyAdderClass(object):
...     def __radd__(self, value):
...             print 'got some value: ', value
...             return 25
... 
>>> m = MyAdderClass()
>>> print myintfunction(m)
got some value:  2
25

So you clearly get nothing by typechecking. And lose a lot.


UPDATE:

Since you've edited the question, it is now clear that your application calls some upstream routine that makes sense only with ints.

That being the case, I still think you should pass the parameter as received to the upstream function. The upstream function will deal with it correctly, e.g. by raising an error if it needs to. I highly doubt that your function that deals with IPs will behave strangely if you pass it a float. If you can give us the name of the library, we can check that for you.

But... if the upstream function will behave incorrectly and kill some kids if you pass it a float (I still highly doubt it), then just call int() on it:

def myintfunction(value):
   """ Please pass an integer """
   return upstreamfunction(int(value))

You're still not typechecking, so you get most benefits of not typechecking.
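
It may help to see what int() actually does with different inputs. A small sketch in Python 3 syntax (`coerce` is a hypothetical helper, not part of any library): floats are truncated silently, while a string such as '8.0' is rejected with ValueError.

```python
def coerce(value):
    """Return int(value), or the name of the exception int() raised."""
    try:
        return int(value)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__

for sample in (8, 8.7, "8", "8.0", None):
    print(repr(sample), "->", coerce(sample))
```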


If even after all that, you really want to type check, despite it reducing your application's readability and performance for absolutely no benefit, use an assert to do it.

assert isinstance(...)
assert type() is xxxx

That way we can turn off asserts and remove this <sarcasm>feature</sarcasm> from the program by calling it as

python -OO program.py
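
If the check has to survive -O (for example because it validates external input), an explicit raise is the usual alternative to assert. A minimal sketch in Python 3 syntax, assuming a hypothetical helper named `require_int`; rejecting bool is a design choice made here, not something the question asks for.

```python
def require_int(value, name="value"):
    """Return value unchanged if it is an int, otherwise raise TypeError."""
    # bool is a subclass of int, so it is excluded explicitly (assumption).
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("%s must be an int, got %s"
                        % (name, type(value).__name__))
    return value

print(require_int(24, "prefix length"))
```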
nosklo
He said that the function won't work if the argument isn't an int. Having the function "work" with floats and complex numbers because it isn't typechecked is a bug, not a feature
Nick
@Nick: If it won't work if the argument isn't an int, what would it do instead? Raise an error? Well, Python does that already. Also, if the func works with other types like float then that isn't a bug, it's a feature, because the func is *working*. Making it not work would be an artificial restriction.
nosklo
@Nick: How can you *know* that the function won't work without an int? What if I provide an object that is not an int but behaves exactly as an int? Why shouldn't that work?
nosklo
@nosklo: The function may 'work' when passed another argument, but for the purposes of the problem, the result should be 'undefined'. I've never heard of a non-integral prefix length.
Murali Suriar
@Murali: Then just use int() on the value. That will turn it into an integer.
nosklo
@nosklo: Perhaps I'm not being clear? If the value received is not an int, then the input is invalid, and the program shouldn't be doing anything with it.
Murali Suriar
@Murali, yes, but why? This is an artificial restriction. Proof of a badly designed specification.
nosklo
@nosklo: The generally accepted format for IPv4 CIDR prefixes is A.B.C.D/E where A-D are integers in the range 0-255, and E is an int in the range 0-32. If the above conditions are not met, the input is malformed and should be ignored. Am I missing something obvious here? What makes this bad design?
Murali Suriar
@Murali: The part "should be ignored" is the bad design. Also, it seems like it would be the underlying library's job to treat those numbers correctly, not your library's. And just calling int() on the value means you can pass strings, floats... and still meet the criteria, so I don't see why not.
nosklo
@Murali: calling int() is faster, easier, readable code that would work in all cases, except for the artificial restriction "must be an integer or die". And you lose the ability of using other types. So not typechecking is a win-win
nosklo
@Murali: A better design would be: The generally accepted format for IPv4 CIDR prefixes is A.B.C.D/E where A-D are integers in the range 0-255, and E is an int in the range 0-32. If the above conditions are not met, and the values can't be easily converted to integers, an error should be raised.
nosklo
@nosklo: That's a different design; not necessarily better. If the values need to be converted to integers (easily or not), then the input does not conform to the expected format; what else could be wrong with the input? I choose to ignore it early, rather than cater for every possible error later.
Murali Suriar
@Murali: Okay, if you prefer. Added a way to do it in a way you can turn it off if you want, at the end of my response.
nosklo
nosklo: I don't understand how the "upstreamfunction(int(value))" part solves the problem where he says that non-integer data is INVALID. As you well know that would just convert it to an int which might not be what you are after at all. Also you could try to be less of an asshole when replying.
kigurai
@kigurai: it solves the problem because it ensures the "invalid" data won't be passed to the upstream function, by converting anything to int first. If it is not what the caller wants, he should pass me an int as documented anyway, so that's doing my best.
nosklo
@kigurai: and I only recommend "upstreamfunction(int(value))" if the upstreamfunction has side effects that will cause computer explosion when not passing an int. That is very unlikely. Typechecking is useless in almost all cases.
nosklo
@nosklo: if you have a function that requires an int as an argument, say because it is an array length or some other integer quantity, then using int() to squash it can hide bugs. In many cases, programs should fail early when given invalid input, not silently try to patch up the input.
Nick
nick, sure, then don't use int() in this case. Just pass the value unspoiled. The list type will raise the error automatically for you, no need to check.
nosklo
@nick, point is, there is no place for typechecking, unless you add "artificial restrictions".
nosklo
+1  A: 

Programming in Python and performing typechecking as you might in other languages does seem like choosing a screwdriver to bang a nail in with. It is more elegant to use Python's exception handling features.

From an interactive command line, you can run a statement like:

int('sometext')

That will generate an error - ipython tells me:

<type 'exceptions.ValueError'>: invalid literal for int() with base 10: 'sometext'

Now you can write some code like:

try:
   int(myvar) + 50
except ValueError:
   print "Not a number"

That can be customised to perform whatever operations are required AND to catch any errors that are expected. It looks a bit convoluted but fits the syntax and idioms of Python and results in very readable code (once you become used to speaking Python).

basswulf
A: 

I would be tempted to do something like:

def check_and_convert(x):
    x = int(x)
    assert 0 <= x <= 255, "must be between 0 and 255 (inclusive)"
    return x

class IPv4(object):
    """An IPv4 CIDR prefix is A.B.C.D/E where A-D are
       integers in the range 0-255, and E is an int
       in the range 0-32."""

    def __init__(self, a, b, c, d, e=0):
        # Validate each octet separately.
        self.a = check_and_convert(a)
        self.b = check_and_convert(b)
        self.c = check_and_convert(c)
        self.d = check_and_convert(d)
        # Convert the prefix length first, then range-check it.
        e = int(e)
        assert 0 <= e <= 32, "must be between 0 and 32 (inclusive)"
        self.e = e

That way when you are using it anything can be passed in yet you only store a valid integer.
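
If the prefixes arrive as text, as in the question, one option on Python 3.3 or later is to let the standard-library ipaddress module do the validation and skip any record it rejects. This is a sketch under that assumption; `parse_prefix` is a hypothetical helper name and the sample records are made up.

```python
import ipaddress

def parse_prefix(text):
    """Return an IPv4Network for a well-formed 'A.B.C.D/E' string, else None."""
    try:
        return ipaddress.IPv4Network(text.strip())
    except ValueError:
        # AddressValueError and NetmaskValueError both subclass ValueError.
        return None

records = ["10.0.0.0/8", "192.168.1.0/24.0", "10.0.0.0/33", "300.1.1.0/24"]
valid = [net for net in map(parse_prefix, records) if net is not None]
print(valid)
```

Be aware that `IPv4Network` also accepts a bare address without a prefix length (treated as /32) and, by default, rejects networks with host bits set, so it is stricter in some respects and looser in others than the class above.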

James Brooks