views:

232

answers:

4

It seems that everytime I call a function that returns a PyObject*, I have to add four lines of error checking. Example:

py_fullname = PyObject_CallMethod(os, "path.join", "ss", folder, filename);
if (!py_fullname) {
    Py_DECREF(pygame);
    Py_DECREF(os);
    return NULL;
}
image = PyObject_CallMethodObjArgs(pygame, "image.load", py_fullname, NULL);
Py_DECREF(py_fullname);
if (!image) {
    Py_DECREF(pygame);
    Py_DECREF(os);
    return NULL;
}
image = PyObject_CallMethodObjArgs(image, "convert", NULL);
if (!image) {
    Py_DECREF(pygame);
    Py_DECREF(os);
    return NULL;
}

Am I missing something? Is there a better way to do this? This has the added problem that I may forget all the stuff I'm supposed to Py_DECREF().

+2  A: 

This is the C API. If you code in C only, you may have to live with it, but if your application is coded in C++, you might want a take a look at a C++/Python wrapper.

JP
+8  A: 

That's why goto is alive (though not entirely well;-) in C coding (as opposed to C++ and other languages that support exceptions): it's the only decent way to NOT have such repeated blocks of termination all over the mainline of your code -- a conditional forward jump to errorexit at each return-value check, with a label errorexit: that does the decrefs (and file closes, and whatever else you need to do at termination) and the return NULL.

Alex Martelli
"cosmetics" like `#define EXCIFZERO(x) if (!x) goto errorexit` can help
John Machin
A: 

This is one of the reasons why C++ introduces exception handling and RAII. Assuming you can use C++, you can create a function that invokes the C function, tests the result, and throws an exception if an error occurred. That way, you can call the wrapper without any checks... if an error occurs, it will throw the exception. However, there is no need to reinvent the wheel, check out the Boost.Python library.

Michael Aaron Safyan
A: 

Here are two ways I might write that code, influenced by my experience writing in two heavily macro-ised pseudo-assembler languages, one of which wasn't C. I've moved the deref of fullname, not because it's wrong in your code, but because I want to demonstrate how you handle a longer-lived resource in both schemes. So imagine that "fullname" is going to be needed again later in the routine:

Arrow code

result = NULL;
py_fullname = PyObject_CallMethod(os, "path.join", "ss", folder, filename);
if (py_fullname) {
    image = PyObject_CallMethodObjArgs(pygame, "image.load", py_fullname, NULL);
    if (image) {
        image = PyObject_CallMethodObjArgs(image, "convert", NULL);
        result = // something to do with image, presumably.
    }
    Py_DECREF(py_fullname);
}
Py_DECREF(pygame);
Py_DECREF(os);
return result;

The way this game is played, is that whenever you call a function which returns a resource, you check the return value immediately (or perhaps after freeing some resource which is no longer required, as in your example code), and the block corresponding to a successful call must either release the resource, or assign it to a return value, or actually return it, before the block is exited. This will usually be either in the second line of the block, after it has been used in the first line, or else in the last line of the block.

It's called "arrow code" because if you make 5 or 6 such calls in a function, you end up with 5 or 6 levels of indentation, and your function looks like a "turn right" sign. When this happens you either refactor, or you go against your every Pythonic instinct, use tabs for indentation, and reduce the tab stops ;-)

Goto

result = NULL;
py_fullname = PyObject_CallMethod(os, "path.join", "ss", folder, filename);
if (!py_fullname) goto cleanup_pygame

image = PyObject_CallMethodObjArgs(pygame, "image.load", py_fullname, NULL);
if (!image) goto cleanup_fullname

image = PyObject_CallMethodObjArgs(image, "convert", NULL);
result = // something to do with image, presumably.

cleanup_fullname:
    Py_DECREF(py_fullname);
cleanup_pygame:
    Py_DECREF(pygame);
    Py_DECREF(os);
    return result;

This goto code is structurally identical to the arrow code, just less indented and easier to mess up and jump to the wrong label. In some circumstances you will clean up different resources on success from what you clean up on failure (for instance if you're constructing and returning something, then on failure you need to clean up whatever you've done so far, but on success you only clean up what you're not returning). Those are the circumstances where the goto code is a clear win over the arrow code, because you can have separate cleanup paths for the two cases, but they still look the same, appear together at the end of the routine, and perhaps even share code. So you might end up with something like this:

result = NULL;
helper = allocate_something;
if (!helper) goto return_result;

result = allocate_something_else;
if (!result) goto error_return; // OK, result is already NULL, but it makes the point

result->contents = allocate_another_thing;
if (!result->contents) goto error_cleanup_result;

result->othercontents = allocate_last_thing;
if (!result->othercontents) goto error_cleanup_contents;

free_helper:
    free(helper);
return_result:
    return result;

error_cleanup_contents:
    free(result->contents);
error_cleanup_result:
    free(result);
error_return;
    result = NULL;
    goto free_helper;

Yes, it's horrible, and Python or C++ programmers will be physically sick at the sight of it. If I never have to write code like this again, I won't be all that disappointed. But as long as you have a systematic scheme for how resources are cleaned up, you should always know exactly what error label to jump to when something goes wrong, and that error label should "know" to clean up all resources that have been allocated so far. Doing it in reverse order allows fall-through to share the code. And once you're used to it, it's reasonably easy to do two things: first follow the path from any given error label, to the exit, and confirm that everything which should be freed is freed. Second, see the difference between two error cases, and confirm that this is the correct difference between the error-handling needed, because the difference is precisely to free the thing that was allocated in between the jumps to those labels.

That said, a semi-decent optimising compiler will common up the code for the error cases in your example. It's just easier to make a mistake when you have code copy-and-pasted about the place like that, especially when you modify it later.

Steve Jessop