views:

88

answers:

4

I recently came across a simple but nasty bug. I had a list and I wanted to find the smallest member in it. I used Python's built-in min(). Everything worked great until in some strange scenario the list was empty (due to strange user input I could not have anticipated). My application crashed with a ValueError (BTW - not documented in the official docs).

I have very extensive unit tests and I regularly check coverage to avoid surprises like this. I also use Pylint (everything is integrated in PyDev) and I never ignore warnings, yet I failed to catch this bug before my users did.

Is there anything I can change in my methodology to avoid these kind of runtime errors? (which would have been caught at compile time in Java / C#?).

I'm looking for something more than wrapping my code with a big try-except. What else can I do? How many other build in Python functions are hiding nasty surprises like this???

+5  A: 

Even in Java/C#, a class of exceptions the RuntimeError are unchecked and will not be detected by the compiler (that's why they're called RuntimeError not CompileError).

In python, certain exceptions such as KeyboardInterrupt are particularly hairy since it can be raised practically at any arbitrary point in the program.

I'm looking for something more than wrapping my code with a big try-except.

Anything but that please. It is much better to let exceptions get to user and halt the program rather than letting error pass around silently (Zen of Python).

Unlike Java, Python does not require all Exceptions to be caught because requiring all Exceptions to be caught makes it too easy for programmers to ignore the Exception (by writing blank exception handler).

Just relax, let the error halt; let the user report it to you, so you can fix it. The other alternative is you stepping into a debugger for forty-two hours because customer's data is getting corrupted everywhere due to a blank mandatory exception handler.

So, what you should change in your methodology is thinking that exception is bad; they're not pretty, but they're better than the alternatives.

Lie Ryan
I agree 100% and this is the methodology I use. BUT - I would have loved to have a coverage tool that lets me run my unit tests and then tells me that I have never experienced certain paths through my code (such as this exception).
Tal Weiss
A: 

I don't know of a direct answer to your question; I, too, would love it if pylint warned about such possibilities. My general practice, given that empty lists cause problems in all sorts of situations, is to check lists for truth before using them; for example:

val = min(vals) if vals else 0

In many cases this is 'free', since you often need to check for None anyway. It can also pay off performance-wise to special-case empty lists, to avoid, i.e. starting a new thread, process or database transaction to process zero items.

DNS
Not downvoting, but what if the list is `[1,0,3,2,5]` ? You now have no means (in your construct) to differentiate between a valid minimum and the value caused by an empty list.
ChristopheD
@ChristopheD: That's the idea; often, especially here with *min*, an empty or even None list isn't really a problem. There's often some value, like zero, that you would intuitively use in that case. I just try to be aware of that when writing the code, much as I would be in dealing with None. Unfortunately 'more vigilance' only goes so far, so I admit it's not a true answer to the question.
DNS
there has been a discussion about adding sentinels to min/max http://bugs.python.org/issue7153 but they got rejected due to various reasons.
Lie Ryan
+7  A: 

The problem here is that malformed external input crashed your program. The solution is to exhaustively unit test possible input scenarios at the boundaries of your code. You say your unit tests are 'extensive', but you clearly hadn't tested for this possibility. Code coverage is a useful tool, but it's important to remember that covering code is not the same as thoroughly testing it. Thorough testing is a combination of covering usage scenarios as well as lines of code.

The methodology I use is to trust internal callers, but never to trust external callers or input. So I explicitly don't unit test for the empty list case in any code beyond the first function that receives the external input. But that input function should be exhaustively covered.

In this case I think the library's exception is reasonable behaviour - it makes no sense to ask for the min of an empty list. The library can't legitimately set a value such as 0 for you, since you may be dealing with negative numbers, for example.

I think the empty list should never have reached the code that asks for the min - it should have been identified at input, and either raised an exception there, or set it to 0 if that works for you, or whatever else it is that does work for you.

ire_and_curses
The funny thing about dealing with natural language is that you can't go anywhere near exhaustively covering the potential inputs. That's a reality I must deal with. I have no problem with the library behavior (other than the fact that it is not documented!).
Tal Weiss
A: 

You could have used randomized testing:

#!/usr/bin/env python
import random
from peckcheck import TestCase, an_int, main

def a_seq(generator):
    return lambda size: [generator(size) 
                         for _ in xrange(random.randrange(size))] 

class TestMin(TestCase):
    def testInputNoThrow(self, x=a_seq(an_int)):
        min(x)

if __name__=="__main__":
    main()

To install peckcheck, type:

$ pip install http://github.com/downloads/zed/peckcheck/peckcheck-0.1.v2.6.tar.gz

Or just grub peckcheck.py

J.F. Sebastian