My background is C and C++. I like Python a lot, but there's one aspect of it (and other interpreted languages I guess) that is really hard to work with when you're used to compiled languages.

When I've written something in Python and come to the point where I can run it, there's still no guarantee that no language-specific errors remain. For me that means that I can't rely solely on my runtime defense (rigorous testing of input, asserts etc.) to avoid crashes, because in 6 months when some otherwise nice code finally gets run, it might crack due to some stupid typo.

Clearly a system should be tested enough to make sure all code has been run, but most of the time I use Python for in-house scripts and small tools, which of course never get the QA attention they need. Also, some code is so simple that (if your background is C/C++) you know it will work fine as long as it compiles (e.g. getter methods inside classes, usually a simple return of a member variable).

So my question is the obvious one: is there any way (with a special tool or something) to make sure all the code in my Python script will "compile" and run?

A: 

I think what you are looking for is code test line coverage. You want to add tests to your script that will make sure all of your lines of code, or as many as you have time to, get tested. Testing is a great deal of work, but if you want the kind of assurance you are asking for, there is no free lunch, sorry :( .

Adam Luter
He is not looking for code to pass tests. He already said, "in 6 months when the otherwise nice code finally gets run, it might simply crack due to some typo." Tests check whether the code does "the right thing" for some finite input set, not whether it uses valid syntax throughout (which is what the OP wants).
Matthew Flaschen
It won't pass many tests if it has typos. If your coverage touches every line of code (not every logic path), you'll be reasonably sure that it will work reliably.
S.Lott
-1. I'm sorry Adam, the question suggests such QA efforts as rather unrealistic, hence the answer is of little help.
sharkin
"every line of code" doesn't buy you nearly as much as people think. Trivial example posted in my answer.
Matthew Flaschen
While "every line of code" is not isomorphic to "perfect", it covers as many bases as a C++ compiler covers. C++ code can compile and be full of holes that don't surface until the program is abused in production. A simple set of unit tests will give you tremendous confidence at very, very low cost. Python is so easy to write that the incremental cost of a few unit tests is still (often) cheaper to develop than C++.
S.Lott
It does not cover as many bases. C++ programs have their own issues, but any C++ compiler will catch these kinds of errors (undeclared variable). Unit tests are very valuable, but they're not enough (for any language).
Matthew Flaschen
Compiler is very valuable, but it's not enough (for any language).
S.Lott
I respectfully disagree, Matthew. I meant line coverage tests, which do not focus on testing functionality, but rather try to trigger every line of code. In the example in your answer, your typo is on its own logical line, just not on its own physical line. Line test coverage *would* find this. I think the tools you pointed out will help find mistakes too, though. The point remains that there is still no free lunch. R.A., I don't suggest you don't use these tools, nor do I suggest you do line-coverage tests. Rather, I just state you are at an impasse given resources and requirements.
Adam Luter
There is no agreed upon definition of logical line. I fail to see why you think he shouldn't use these tools. I think he does have the necessary resources.
Matthew Flaschen
S. Lott, obviously the compiler isn't enough either.
Matthew Flaschen
I specifically said he should (albeit with a double negative). Anyway, please don't mince words: line-coverage tests would work. If you'd like, remove the word 'line'.
Adam Luter
+19  A: 

Look at PyChecker and PyLint.

Here's example output from pylint, resulting from the trivial program:

print a

As you can see, it detects the undefined variable, which py_compile won't (deliberately).

in foo.py:

************* Module foo
C:  1: Black listed name "foo"
C:  1: Missing docstring
E:  1: Undefined variable 'a'


...

|error      |1      |1        |=          |

Trivial example of why tests aren't good enough, even if they cover "every line":

bar = "Foo"
foo = "Bar"
def baz(X):
    return bar if X else fo0

print baz(input("True or False: "))
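To see the failure without typing at a prompt, here is a minimal sketch of the same example ported to Python 3 (the original above is Python 2). The lowercase parameter name, the try/except and the `error` variable are additions for illustration only.

```python
bar = "Foo"
foo = "Bar"


def baz(x):
    # Deliberate typo: fo0 (with a zero) instead of foo.
    return bar if x else fo0


# A test that only passes True executes every physical line of baz
# and still succeeds, because the else branch is never evaluated.
assert baz(True) == "Foo"

# The misspelled name is only looked up when the argument is falsy.
error = None
try:
    baz(False)
except NameError as exc:
    error = str(exc)
    print("NameError:", error)
```

So a line-coverage report for the first call alone would show the `return` line as covered, even though half of it has never run.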

EDIT: PyChecker handles the ternary for me:

Processing ternary...
True or False: True
Foo

Warnings...

ternary.py:6: No global (fo0) found
ternary.py:8: Using input() is a security problem, consider using raw_input()
Matthew Flaschen
Good recommendation of pychecker and pylint. pyflakes is also good because it's very fast, and the svn trunk version will catch unused local variables. As for testing "every line," I think you should at least test every "path." That would have caught your blow-up example.
Ryan Ginstrom
Works great Matthew, thanks!
sharkin
It is impossible to identify (and then test) every possible logic path, because that is equivalent to the halting problem (http://en.wikipedia.org/wiki/Code_coverage).
Matthew Flaschen
You can certainly ensure that almost every piece of code is hit at least once. You can't test every logic path for even a slightly complex program, which is why I put "path" in quotes. Making sure every piece of code was hit in your example would have caught the error.
Ryan Ginstrom
"You can certainly ensure that almost every piece of code is hit at least once." That really doesn't seem to mean anything in particular. Almost? Piece of code?
Matthew Flaschen
+1 because these tools definitely help. But line-coverage would find the typo above if you test logical-lines, not physical-lines. Also, I am confused as to why PyChecker is complaining about 'a' but not 'fo0'?
Adam Luter
Adam, it detects fo0 for me. And I don't think there's a solid definition for logical lines.
Matthew Flaschen
A: 

Your code actually gets compiled when you run it, and the Python runtime will complain if there is a syntax error in the code. Unlike statically compiled languages such as C/C++ or Java, however, it does not check whether variable names and types are correct. For that you need to actually run the code (e.g. with automated tests).
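The difference between the two kinds of error can be sketched with the built-in `compile()`, which byte-compiles source text without running it. This is only an illustration written for Python 3; the helper name `compiles` and the `"<example>"` filename are made up here.

```python
def compiles(source):
    """Return True if source byte-compiles; the code is never executed."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Unbalanced parenthesis: rejected at compile time.
print(compiles("print(a"))

# 'a' is never defined, but the syntax is valid, so this compiles.
# The NameError would only appear when the code is actually run.
print(compiles("print(a)"))
```

The standard library's py_compile module does the same kind of check on a whole file (for example `python -m py_compile foo.py`), with the same limitation.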

Alex Morega
-1. It seems there actually are tools around to discover errors like that, hence not needing to actually run the code to discover them.
sharkin
+1  A: 

If you are using Eclipse with Pydev as an IDE, it can flag many typos for you with red squigglies immediately, and has Pylint integration too. For example:

foo = 5
print food

will be flagged as "Undefined variable: food". Of course this is not always accurate (perhaps food was defined earlier using setattr or other exotic techniques), but it works well most of the time.

In general, you can only statically analyze your code to the extent that your code is actually static; the more dynamic your code is, the more you really do need automated testing.
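A small sketch of the kind of dynamic code meant here, written for Python 3. It uses `globals()` rather than setattr, purely as an assumption for illustration; the effect is the same, since the name is created at runtime instead of by a visible assignment.

```python
# No statement of the form "food = ..." exists anywhere in this file,
# so a static checker has no definition to point at.
globals()["food"] = 5

# At runtime the lookup succeeds, because the name was inserted into
# the module namespace by the line above.
print(food)
```

A static checker would most likely report `food` as undefined here even though the script runs fine, which is the false positive described above.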

Kiv
+1  A: 

Others have mentioned tools like PyLint, which are pretty good, but the long and the short of it is that it's simply not possible to catch 100% of these errors statically. In fact, you might not even want to. Part of the benefit of Python's dynamic nature is that you can do crazy things like insert names into the local scope through a dictionary access.

What it comes down to is that if you want a way to catch type errors at compile time, you shouldn't use Python. A language choice always involves a set of trade-offs. If you choose Python over C, just be aware that you're trading a strong type system for faster development, better string manipulation, etc.

Imagist