The purpose of my question is to strengthen my knowledge base with Python and get a better picture of it, which includes knowing its faults and surprises. To keep things specific, I'm only interested in the CPython interpreter.

I'm looking for something similar to what I learned from my PHP landmines question, where some of the answers were well known to me but a couple were borderline horrifying.

Update: Apparently one, maybe two, people are upset that I asked a question that's already partially answered outside of Stack Overflow. As a compromise, here's the URL: http://www.ferg.org/projects/python_gotchas.html

Note that one or two answers here already go beyond what is written on the site referenced above.

+5  A: 

The only gotcha/surprise I've dealt with is CPython's GIL. If for whatever reason you expect Python threads in CPython to run Python code in parallel... well, they don't, and this is pretty well documented by the Python crowd and even Guido himself.

Here is a long but thorough explanation of CPython threading, some of the things going on under the hood, and why true parallelism between threads isn't possible in CPython: http://jessenoller.com/2009/02/01/python-threads-and-the-global-interpreter-lock/
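
As a rough illustration (a sketch, not a benchmark), the following runs the same pure-Python loop first sequentially and then in four threads. The function names and loop sizes are made up for this example. On a standard CPython build with the GIL, the threaded run is typically no faster, because only one thread executes Python bytecode at a time.

```python
import threading
import time


def count_down(n):
    # Pure-Python, CPU-bound loop: the running thread holds the GIL.
    while n > 0:
        n -= 1


def run_sequential(n, workers):
    start = time.time()
    for _ in range(workers):
        count_down(n)
    return time.time() - start


def run_threaded(n, workers):
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(workers)]
    start = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - start


sequential = run_sequential(200000, 4)
threaded = run_threaded(200000, 4)
# Timings vary by machine; no particular numbers are implied here.
print("sequential: %.3fs  threaded: %.3fs" % (sequential, threaded))
```

Threads remain useful for I/O-bound work, where a waiting thread releases the GIL.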

David
check out the new multiprocessing module available in 2.6 for thread-like handling using separate processes if the GIL is bothering you. http://docs.python.org/library/multiprocessing.html
monkut
@monkut - Definitely looks cool, I swore I was reading about something similar to this module a year or so ago.
David
@David - must have been pyprocessing which has been made part of the standard libraries under the guise of multiprocessing
Ravi
+10  A: 

Dynamic binding makes typos in your variable names surprisingly hard to find. It's easy to spend half an hour fixing a trivial bug.

EDIT: an example...

for item in some_list:
    ... # lots of code
... # more code
for tiem in some_other_list:
    process(item) # oops!
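
A runnable version of the same mistake, as a minimal sketch: the list contents and the body of process are placeholders invented for this illustration. No error is raised, because item is still bound to the last value from the first loop.

```python
processed = []


def process(value):
    # Placeholder for real work: just record what was passed in.
    processed.append(value)


some_list = ["a", "b", "c"]
some_other_list = [1, 2]

for item in some_list:
    pass  # ... lots of code

for tiem in some_other_list:  # typo: meant "item"
    process(item)  # silently reuses the stale "item" from the first loop

print(processed)  # ['c', 'c']
```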
Algorias
+1 Yeah that's kind of screwed me up once or twice, any chance you could provide an example in your answer though?
David
It's kinda your fault if you have "lots of code" in one function ..
hasen j
I suppose so, but this was just for illustration's sake. Actual occurrences of this type of bug tend to be a bit more involved.
Algorias
+29  A: 

Expressions in default arguments are calculated when the function is defined, not when it’s called.

Example: consider defaulting an argument to the current time:

>>> import time
>>> def report(when=time.time()):
...     print when
...
>>> report()
1210294387.19
>>> time.sleep(5)
>>> report()
1210294387.19

The when argument doesn't change. It is evaluated when you define the function. It won't change until the application is re-started.

Strategy: you won't trip over this if you default arguments to None and then do something useful when you see it:

>>> def report(when=None):
...     if when is None:
...         when = time.time()
...     print when
...
>>> report()
1210294762.29
>>> time.sleep(5)
>>> report()
1210294772.23

Exercise: to make sure you've understood: why is this happening?

>>> def spam(eggs=[]):
...     eggs.append("spam")
...     return eggs
...
>>> spam()
['spam']
>>> spam()
['spam', 'spam']
>>> spam()
['spam', 'spam', 'spam']
>>> spam()
['spam', 'spam', 'spam', 'spam']
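
For comparison, here is a sketch of the None strategy applied to the list case, next to the shared-default version. The name shared_spam is made up for this example.

```python
def shared_spam(eggs=[]):
    # The default list is created once, when the def statement runs.
    eggs.append("spam")
    return eggs


def spam(eggs=None):
    if eggs is None:
        eggs = []  # a fresh list on every call that omits the argument
    eggs.append("spam")
    return eggs


first = shared_spam()
second = shared_spam()
print(first is second)   # True: both calls returned the same list object
print(len(second))       # 2
print(spam() == spam())  # True: each call starts from a new one-item list
```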
Garth T Kidd
+1 Excellent point! I actually have relied on this in a similar context, but I could easily see this catching the unwary off guard!
David
That's the most well known gotcha, but I've never been bitten by it before knowing it.
hasen j
The same is true for class level variables (an easy mistake to make when first learning python)
Richard Levasseur
The Python designers made a lot of good design decisions, but this was not one of them. +1
BlueRaja - Danny Pflughoeft
I give up. Why is it happening?
Geoffrey Van Wyk
The default argument is created only once: when the function is defined. It gets re-used every time the function is called. In this case, the default argument is a list. So, what happens each time the function is called?
Garth T Kidd
+20  A: 

You should be aware of how class variables are handled in Python. Consider the following class hierarchy:

class AAA(object):
    x = 1

class BBB(AAA):
    pass

class CCC(AAA):
    pass

Now, check the output of the following code:

>>> print AAA.x, BBB.x, CCC.x
1 1 1
>>> BBB.x = 2
>>> print AAA.x, BBB.x, CCC.x
1 2 1
>>> AAA.x = 3
>>> print AAA.x, BBB.x, CCC.x
3 2 3

Surprised? You won't be if you remember that class variables are stored internally in the dictionary of a class object. If a variable name is not found in the dictionary of the current class, the parent classes are searched for it. So, here is the same code again, but with explanations:

# AAA: {'x': 1}, BBB: {}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
1 1 1
>>> BBB.x = 2
# AAA: {'x': 1}, BBB: {'x': 2}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
1 2 1
>>> AAA.x = 3
# AAA: {'x': 3}, BBB: {'x': 2}, CCC: {}
>>> print AAA.x, BBB.x, CCC.x
3 2 3

Same goes for handling class variables in class instances (treat this example as a continuation of the one above):

>>> a = AAA()
# a: {}, AAA: {'x': 3}
>>> print a.x, AAA.x
3 3
>>> a.x = 4
# a: {'x': 4}, AAA: {'x': 3}
>>> print a.x, AAA.x
4 3
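
The dictionaries described in the comments above can be inspected directly with vars(). This is a small sketch written with the print() function form, so it also runs on Python 3; it redefines a shortened version of the hierarchy to stay self-contained.

```python
class AAA(object):
    x = 1


class BBB(AAA):
    pass


print("x" in vars(BBB))   # False: BBB.x is found on AAA
BBB.x = 2
print("x" in vars(BBB))   # True: assignment created BBB's own entry
AAA.x = 3
print(BBB.x)              # 2: BBB's own entry shadows AAA's

a = AAA()
print(vars(a))            # {}: the instance has no attributes of its own yet
a.x = 4
print(vars(a))            # {'x': 4}: the instance entry shadows AAA.x
```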
DzinX
+9  A: 

Loops and lambdas (or any closure, really): variables are looked up by name when the function is called, not captured by value when it is defined

funcs = []
for x in range(5):
  funcs.append(lambda: x)

[f() for f in funcs]
# output:
# [4, 4, 4, 4, 4]

A workaround is either creating a separate function or binding the current value as a default argument:

funcs = []
for x in range(5):
  funcs.append(lambda x=x: x)
[f() for f in funcs]
# output:
# [0, 1, 2, 3, 4]
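
The "separate function" option might look like the sketch below. The helper name make_getter is invented for this example.

```python
def make_getter(value):
    # Each call creates a new scope, so every lambda sees its own "value".
    return lambda: value


late = []
early = []
for x in range(5):
    late.append(lambda: x)          # looks up x only when called
    early.append(make_getter(x))    # value fixed at the time of this call

print([f() for f in late])    # [4, 4, 4, 4, 4]
print([f() for f in early])   # [0, 1, 2, 3, 4]
```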
Richard Levasseur
+8  A: 

There was a lot of discussion on hidden language features a while back (hidden-features-of-python), where some pitfalls were mentioned (and some of the good stuff too).

Also you might want to check out Python Warts.

But for me, integer division's a gotcha:

>>> 5/2
2

You probably wanted:

>>> 5*1.0/2
2.5

If you really want this (C-like) behaviour, you should write:

>>> 5//2
2

As that will work with floats too (and it will keep working when you eventually go to Python 3, where / always performs true division):

>>> 5*1.0//2
2.0

GvR explains how integer division came to work the way it does on The History of Python blog.
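
The expressions below are a small sketch that gives the same results on Python 2 and Python 3, since none of them relies on the plain / operator applied to two ints. One caveat worth adding: // floors, so for negative operands it differs from C's truncating integer division.

```python
print(5 // 2)          # 2
print(5 * 1.0 / 2)     # 2.5
print(5 * 1.0 // 2)    # 2.0
print(divmod(5, 2))    # (2, 1)

# Unlike C's integer division, // rounds toward negative infinity.
print(-5 // 2)         # -3
```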

Tom Dunham
Definitely a gotcha. It's gotten so that adding "from __future__ import division" to every new .py file I create is practically a reflex.
Chris Upchurch
Makes sense supposing that 5 and 2 are actually variables. Otherwise you could just write 5./2
Algorias
Why are you multiplying by 1.0? Wouldn't it be just as easy to make 5 be 5.0 or float(5) in case 5 is hidden in a variable.
Casey
"The correct work-around is subtle: casting an argument to float() is wrong if it could be a complex number; adding 0.0 to an argument doesn't preserve the sign of the argument if it was minus zero. The only solution without either downside is multiplying an argument (typically the first) by 1.0. This leaves the value and sign unchanged for float and complex, and turns int and long into a float with the corresponding value." (PEP 238 - http://www.python.org/dev/peps/pep-0238/)
Tom Dunham
Tom, thanks...guess i didn't know this :)
Casey
+4  A: 

James Dumay eloquently reminded me of another Python gotcha:

Not all of Python's “included batteries” are wonderful.

James’ specific example was the HTTP libraries: httplib, urllib, urllib2, urlparse, mimetools, and ftplib. Some of the functionality is duplicated, and some of the functionality you'd expect is completely absent, e.g. redirect handling. Frankly, it's horrible.

If I ever have to grab something via HTTP these days, I use the urlgrabber module forked from the Yum project.

Garth T Kidd
I remember a couple years back giving up trying to accomplish what I wanted with the suite of tools above and ended up using pyCurl.
David
The fact that there's a module named urllib and a module named urllib2 still gets under my skin.
Jason Baker
This is probably the real reason for Python 3 :) They got to the point of, 'wait, where's the... let's start over'.
orokusaki
+3  A: 

Floats are not printed at full precision by default (without repr):

x = 1.0 / 3
y = 0.333333333333
print x  #: 0.333333333333
print y  #: 0.333333333333
print x == y  #: False

repr prints too many digits:

print repr(x)  #: 0.33333333333333331
print repr(y)  #: 0.33333333333300003
print x == 0.3333333333333333  #: True
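
A sketch of the same comparison that runs unchanged on newer interpreters: "%.12g" reproduces the 12-digit display shown by the print statements above, and the tolerance value is an arbitrary choice for illustration. Note that on Python 2.7 and 3.1 or later, repr() picks the shortest string that round-trips, so the exact digits printed differ from the output above.

```python
x = 1.0 / 3
y = 0.333333333333

# 12 significant digits hide the difference between x and y.
print("%.12g" % x == "%.12g" % y)   # True
print(x == y)                       # False

# repr() is meant to round-trip: converting it back gives the same float.
print(float(repr(x)) == x)          # True

# For "close enough" comparisons, test against an explicit tolerance.
tolerance = 1e-9  # hypothetical threshold; pick one that suits your data
print(abs(x - y) <= tolerance)      # True
```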
pts
This is a compromise so that the float string is reasonably portable across Python's platforms, since Python uses hardware floats.
kaizer.se
+6  A: 

Not including an __init__.py in your packages. That one still gets me sometimes.
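
As a sketch of the layout that does work, the snippet below builds a throwaway package in a temporary directory and imports a module from it. The package and module names are invented for this example. On Python 3.3 and later, namespace packages mean a missing __init__.py no longer always fails at import time, so the symptom can differ there.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package and module names, used only for this sketch.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "gotcha_demo_pkg")
os.mkdir(package_dir)

# An empty __init__.py marks the directory as a regular package.
open(os.path.join(package_dir, "__init__.py"), "w").close()

with open(os.path.join(package_dir, "settings.py"), "w") as handle:
    handle.write("ANSWER = 42\n")

sys.path.insert(0, root)
settings = importlib.import_module("gotcha_demo_pkg.settings")
print(settings.ANSWER)  # 42
```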

Jason Baker
+3  A: 

Unintentionally mixing old-style and new-style classes can cause seemingly mysterious errors.

Say you have a simple class hierarchy consisting of superclass A and subclass B. When B is instantiated, A's constructor must be called first. The code below correctly does this:

class A(object):
    def __init__(self):
        self.a = 1

class B(A):
    def __init__(self):
        super(B, self).__init__()
        self.b = 1

b = B()

But if you forget to make A a new-style class and define it like this:

class A:
    def __init__(self):
        self.a = 1

you get this traceback:

Traceback (most recent call last):
  File "AB.py", line 11, in <module>
    b = B()
  File "AB.py", line 7, in __init__
    super(B, self).__init__()
TypeError: super() argument 1 must be type, not classobj
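
If A comes from code you cannot change, one possible workaround (a sketch, not the only option) is to name the base class directly instead of using super(). This works with old-style classes, at the cost of not cooperating with multiple inheritance the way super() does.

```python
class A:  # old-style on Python 2, because it does not inherit from object
    def __init__(self):
        self.a = 1


class B(A):
    def __init__(self):
        # Call the base initializer explicitly instead of via super().
        A.__init__(self)
        self.b = 1


b = B()
print(b.a + b.b)  # 2
```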

Two other questions relating to this issue are 489269 and 770134

Dawie Strauss
+2  A: 

You cannot use locals()['x'] = whatever to change local variable values as you might expect.

This works:

>>> x = 1
>>> x
1
>>> locals()['x'] = 2
>>> x
2

BUT:

>>> def test():
...     x = 1
...     print x
...     locals()['x'] = 2
...     print x  # *** prints 1, not 2 ***
...
>>> test()
1
1

This actually burnt me in an answer here on SO, since I had tested it outside a function and got the change I wanted. Afterwards, I found it mentioned and contrasted to the case of globals() in "Dive Into Python." See example 8.12 in:

http://diveintopython.org/html_processing/locals_and_globals.html

(Though it does not note that the change via locals() will work at the top level as I show above.)
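
The contrast can be sketched in a form that does not depend on being typed at the interactive prompt. The names try_locals, set_global and demo_value are made up for this example.

```python
def try_locals():
    x = 1
    locals()["x"] = 2      # changes a dictionary, not the real local variable
    return x


def set_global():
    globals()["demo_value"] = 2   # module globals really are this dictionary


print(try_locals())   # 1
set_global()
print(demo_value)     # 2
```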

Anon
locals() at module level is the same thing as globals() anywhere in the module, is it not? It notes that globals() will take the change.
kaizer.se