views:

484

answers:

3

The function below takes a python file handle, reads in packed binary data from the file, creates a Python dictionary and returns it. If I loop it endlessly, it'll continually consume RAM. What's wrong with my RefCounting?

static PyObject* __binParse_getDBHeader(PyObject *self, PyObject *args){

PyObject *o; //generic object
PyObject* pyDB = NULL; //this has to be a py file object

if (!PyArg_ParseTuple(args, "O", &pyDB)){
 return NULL;
} else {
 Py_INCREF(pyDB);
 if (!PyFile_Check(pyDB)){
        Py_DECREF(pyDB);
  PyErr_SetString(PyExc_IOError, "argument 1 must be open file handle");
  return NULL;
 }
}

FILE *fhDB = PyFile_AsFile(pyDB);

long offset = 0;
DB_HEADER *pdbHeader = malloc(sizeof(DB_HEADER));
fseek(fhDB,offset,SEEK_SET); //at the beginning
fread(pdbHeader, 1, sizeof(DB_HEADER), fhDB );
if (ferror(fhDB)){
 fclose(fhDB);
 Py_DECREF(pyDB);
 PyErr_SetString(PyExc_IOError, "failed reading database header");
 return NULL;
}
Py_DECREF(pyDB);

PyObject *pyDBHeader = PyDict_New();
Py_INCREF(pyDBHeader);

o=PyInt_FromLong(pdbHeader->version_number);
PyDict_SetItemString(pyDBHeader, "version", o);
Py_DECREF(o);

PyObject *pyTimeList = PyList_New(0);
Py_INCREF(pyTimeList);

int i;
for (i=0; i<NUM_DRAWERS; i++){
 //epochs
 o=PyInt_FromLong(pdbHeader->last_good_test[i]);
 PyList_Append(pyTimeList, o);
 Py_DECREF(o);
}
PyDict_SetItemString(pyDBHeader, "lastTest", pyTimeList);
Py_DECREF(pyTimeList);

o=PyInt_FromLong(pdbHeader->temp);
PyDict_SetItemString(pyDBHeader, "temp", o);
Py_DECREF(o);

free(pdbHeader);
return (pyDBHeader);
}

Thanks for taking a look,

LarsenMTL

+9  A: 

PyDict_New() returns a new reference, check the docs for PyDict. So if you increase the refcount immediately after creating it, you have two references to it. One is transferred to the caller when you return it as a result value, but the other one never goes aways.

You also don't need to incref pyTimeList. It's yours when you create it. However, you need to decref it, but you only decref it once, so it's leaked as well.

You also don't need to call Py_INCREF on pyDB. It's a borrowed reference and it won't go away as long as your function does not return, because it's still referenced in a lower stack frame.

Only if you want to keep the reference in another structure somewhere, you need to increse the refcount.

Cf. the API docs

Torsten Marek
Torsten, thanks, I just learned more in your 4 paragraphs, then I did staring at the docs all morning. I'll check all my borrowed vs return new references.
Mark
+4  A: 

OT: Using successive calls to PyList_Append is a performance issue. Since you know how many results you'll get in advance, you can use:

PyObject *pyTimeList = PyList_New(NUM_DRAWERS);
int i;
for (i=0; i<NUM_DRAWERS; i++){
    o = PyInt_FromLong(pdbHeader->last_good_test[i]);
    PyList_SET_ITEM(pyTimeList, i, o);
}

Observe that you may not decrease the refcount of o after calling PyList_SET_ITEM, because it "steals" a reference. Check the docs.

Torsten Marek
+1  A: 

I don't know about Python-C. However, My experience with COM reference counting says that a newly created reference-counted object has a reference count of 1. So your Py_INCREF(pyDB) after PyArg_ParseTuple(args, "O", &pyDB) and PyObject *pyDBHeader = PyDict_New(); are the culprit. Their reference counts are already 2.

yogman