views:

3901

answers:

6

I am starting to use Python (specifically because of Django) and I would like to remove the burden for exhaustive testing by performing some static analysis. What tools/parameters/etc. exist to detect issues at compile time that would otherwise show up during runtime? (type errors are probably the most obvious case of this, but undefined variables are another big one that could be avoided with an in-depth analysis of the AST.)

Obviously testing is important, and I don't imply that tests can be obviated entirely; however, there are many runtime errors in python that are not possible in other languages that perform stricter run-time checking -- I'm hoping that there are tools to bring at least some of these capabilities to python as well.

+2  A: 

There's

And probably others, too.

Dan
+22  A: 

pylint is the best such tool I've found. Due to Python's nature it's difficult to statically analyze it, but it will catch undefined variables, basic type errors, unused code, etc. You'll want to tweak the configuration file, as by default it outputs many warnings I consider useless or harmful.

Here's part of my .pylintrc dealing with warning silencing:

# Brain-dead errors regarding standard language features
#   W0142 = *args and **kwargs support
#   W0403 = Relative imports

# Pointless whinging
#   R0201 = Method could be a function
#   W0212 = Accessing protected attribute of client class
#   W0613 = Unused argument
#   W0232 = Class has no __init__ method
#   R0903 = Too few public methods
#   C0301 = Line too long
#   R0913 = Too many arguments
#   C0103 = Invalid name
#   R0914 = Too many local variables

# PyLint's module importation is unreliable
#   F0401 = Unable to import module
#   W0402 = Uses of a deprecated module

# Already an error when wildcard imports are used
#   W0614 = Unused import from wildcard

# Sometimes disabled depending on how bad a module is
#   C0111 = Missing docstring

# Disable the message(s) with the given id(s).
disable-msg=W0142,W0403,R0201,W0212,W0613,W0232,R0903,W0614,C0111,C0301,R0913,C0103,F0401,W0402,R0914
John Millikin
The above is under section header [MESSAGES CONTROL]
Zitrax
+4  A: 

You should check out Pyflakes, Pylint, and PyChecker. I've personally used both Pyflakes and Pylint, and found them both to be very helpful for catching those little things you hate to mess up on. Pylint generally requires a bit more configuration than Pyflakes.

Also noteworthy: Eclipse's PyDev plugin comes in with a built in Pylint output parser.

cdleary
Pydev is a great environment made better with pylint turned on. All of the pylint warnings show up as icons in the margin so you don't have to go hunting through a text report. All you have to do is hover over the icon to see what you've done wrong :D
Seth
+2  A: 

I echo the other answers and would just add that pychecker is the quickest and easiest to use and pylint the most comprehensive and configurable.

I also use epydoc a fair bit and this is good for pointing out problems with your docstrings.

andy47
+18  A: 

Here are my first impressions of pyflakes, pychecker and pylint:

  • pychecker: It crashes frequently, most of the runs I tried resulted in Errors that originated in the pychecker code (eg: AttributeError or IndexError: list index out of range were the most common). For some reason I had to set the DJANGO_SETTINGS_MODULE environment variable before it would even run on any of the app code, and the documentation is very sparse.

  • pyflakes: 'pyflakes --help' throws a TypeError -- erm... Documentation is also very sparse, and pyflakes is very forgiving (as far as I can tell, it only reports compile errors, warnings, redefinitions, and some concerns about imports--such as unused and wildcards). pyflakes also seems to repeat itself:

    eventlist/views.py:4: 'Http404' imported but unused
    eventlist/views.py:4: 'Http404' imported but unused
    eventlist/views.py:5: 'from eventlist.models import *' used; unable to detect undefined names eventlist/views.py:59: 'authenticate' imported but unused
    eventlist/views.py:61: redefinition of unused 'login' from line 59
    eventlist/views.py:5: 'from eventlist.models import *' used; unable to detect undefined names
    eventlist/views.py:4: 'Http404' imported but unused

  • pylint: This seems to be the most capable of the tools suggested. It has the best documentation. LogiLab provides a tutorial, pylint has a help screen, and there is a (broken) link to a user manual, which would be extremely helpful. There are some issues with applying pylint to django, since pylint doesn't know about the django classes (such as models.Model). This means that a fair number of otherwise valuable errors are generated about missing class fields. eg:

    E:105:get_events_by_tag: Class 'Tag' has no 'objects' member

    Parsing these out automatically will be very difficult without some additional knowledge of the classes in use. I'm not sure adding that is feasible, but it does seem likely that pylint is capable of dealing with this in the "right" way. (I probably just need to point it to the django source, but there are no command line params that look likely, and, as mentioned earlier, the user manual is inaccessible.)

For the moment, I'm still looking into pylint -- pychecker and pyflakes need better documentation and they need to become more robust.

rcreswick
If you set PYTHONPATH before running pylint to point to the django libraries, does that fix the unknown classes? Maybe django imports "behind the scenes" some stuff that pylint just can't know about, in which case you would have to add your own explicit imports for those.
rq
+1  A: 

See CloneDR, a tool for detecting duplicated code in spite of formatting changes, comment insertion/deletions, and even some changes to the code itself (replacement of one statement by another).

CloneDR works for a wide variety of languages (C, C++, C#, COBOL, Java, PHP, ...). We recently added Python 2.6, and Python 3.0 isn't far off.

Ira Baxter