I am trying to serialize a list of dictionaries to a CSV text file using Python's csv module. My list has about 13,000 elements; each is a dictionary with ~100 keys whose values are simple strings and numbers. My function dictlist2file simply wraps csv.DictWriter to serialize this, but I am getting out-of-memory errors.
My function is:
import csv
import time

def dictlist2file(dictrows, filename, fieldnames, delimiter='\t',
                  lineterminator='\n', extrasaction='ignore'):
    out_f = open(filename, 'w')
    # Write out header
    if fieldnames is not None:
        header = delimiter.join(fieldnames) + lineterminator
    else:
        fieldnames = sorted(dictrows[0].keys())
        header = delimiter.join(fieldnames) + lineterminator
    out_f.write(header)
    print "dictlist2file: serializing %d entries to %s" \
        % (len(dictrows), filename)
    t1 = time.time()
    # Write out dictionary rows
    data = csv.DictWriter(out_f, fieldnames,
                          delimiter=delimiter,
                          lineterminator=lineterminator,
                          extrasaction=extrasaction)
    data.writerows(dictrows)
    out_f.close()
    t2 = time.time()
    print "dictlist2file: took %.2f seconds" % (t2 - t1)
When I run this on my list of dictionaries, I get the following output:
dictlist2file: serializing 13537 entries to myoutput_file.txt
Python(6310) malloc: *** mmap(size=45862912) failed (error code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
Traceback (most recent call last):
...
File "/Library/Frameworks/Python.framework/Versions/6.2/lib/python2.6/csv.py", line 149, in writerows
rows.append(self._dict_to_list(rowdict))
File "/Library/Frameworks/Python.framework/Versions/6.2/lib/python2.6/csv.py", line 141, in _dict_to_list
return [rowdict.get(key, self.restval) for key in self.fieldnames]
MemoryError
Any idea what could be causing this? The list has only 13,000 elements, and the dictionaries themselves are small (~100 keys of short strings and numbers), so I don't see why this should exhaust memory or be so inefficient. It runs for several minutes before hitting the MemoryError.
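For reference, the traceback shows writerows appending every converted row to an internal list before writing anything, so the obvious workaround I'm considering is calling writerow once per dictionary instead. A sketch of that variant (dictlist2file_streaming is my own hypothetical name, same arguments as above, header writing simplified for brevity):

```python
import csv

def dictlist2file_streaming(dictrows, filename, fieldnames, delimiter='\t',
                            lineterminator='\n', extrasaction='ignore'):
    # Same output as dictlist2file, but writes one row at a time so only
    # a single converted row is held in memory, instead of all of them.
    out_f = open(filename, 'w')
    out_f.write(delimiter.join(fieldnames) + lineterminator)
    writer = csv.DictWriter(out_f, fieldnames,
                            delimiter=delimiter,
                            lineterminator=lineterminator,
                            extrasaction=extrasaction)
    for row in dictrows:
        writer.writerow(row)
    out_f.close()
```

Even if this avoids the crash, I'd still like to understand why buffering ~13,000 small rows blows up in the first place.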
Thanks for your help.