I have a program that outputs some lists that I want to store to work with later. For example, suppose it outputs a list of student names and another list of their midterm scores. I can store this output in the following two ways:

Standard File Output way:

newFile = open('trialWrite1.py','w')
newFile.write(str(firstNames))
newFile.write(str(midterm1Scores))
newFile.close()

The pickle way:

newFile = open('trialWrite2.txt','w')
cPickle.dump(firstNames, newFile)
cPickle.dump(midterm1Scores, newFile)
newFile.close()

Which technique is better or preferred? Is there an advantage of using one over the other?

Thanks

+1  A: 

pickle is more generic -- it allows you to dump many different kinds of objects to a file for later use. The downside is that the interim storage is not very human-readable, and not in a standard format.
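As a rough sketch of the pickle round trip, assuming Python 3 (where `cPickle` has been merged into `pickle`): the sample lists and the file name below are made up for illustration. Objects come back in the order they were dumped, and the file is opened in binary mode.

```python
import os
import pickle
import tempfile

# Sample data standing in for the program's output (made up for illustration).
first_names = ["Ann", "Bob", "Cleo"]
midterm1_scores = [88, 92.5, 79]

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "trial_write.pkl")

# Pickle data is binary, so open the file with 'wb' / 'rb'.
with open(path, "wb") as f:
    pickle.dump(first_names, f)
    pickle.dump(midterm1_scores, f)

# Each load() returns the next object, in the order it was dumped.
with open(path, "rb") as f:
    loaded_names = pickle.load(f)
    loaded_scores = pickle.load(f)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.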

Writing strings to a file, on the other hand, is a much better interface to other activities or code. But it comes at the cost of having to parse the text back into your Python object again.
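For plain lists of strings and numbers, the parsing step can be fairly small. The sketch below assumes Python 3 and made-up sample data; it writes one list per line (writing both `str()` results back to back, as in the question, leaves no separator between them) and reads them back with `ast.literal_eval`, which only accepts Python literals.

```python
import ast
import os
import tempfile

# Sample data standing in for the program's output (made up for illustration).
first_names = ["Ann", "Bob", "Cleo"]
midterm1_scores = [88, 92.5, 79]

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "trial_write.txt")

# One repr per line, so the two lists can be told apart when reading.
with open(path, "w") as f:
    f.write(repr(first_names) + "\n")
    f.write(repr(midterm1_scores) + "\n")

# literal_eval parses literals (lists, strings, numbers) without running code.
with open(path) as f:
    parsed_names = ast.literal_eval(f.readline().strip())
    parsed_scores = ast.literal_eval(f.readline().strip())
```

This only works while the data stays within basic literal types; custom objects would need a different format.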

Both are fine for this simple (list?) data; I would use write(str(firstNames)) simply because there's no need to use pickle. In general, how to persist your data to the filesystem depends on the data!


For instance, pickle will happily pickle built-in functions and classes (it stores a reference to them by name), which you can't do by simply writing the string representations.

>>> data = range
>>> data
<class 'range'>
>>> pickle.dump( data, foo )
# stuff
>>> pickle.load( open( ..., "rb" ) )
<class 'range'>
katrielalex
Jeremy Brown
Thanks for the reply. I will either write the string representation or use CSV as David has suggested.
Curious2learn
@Jeremy: I didn't realise `pickle` didn't like user-defined functions -- thanks!
katrielalex
Jeremy Brown
+2  A: 

I think the csv module might be a good fit here, since CSV is a standard format that can be both read and written by Python (and many other languages), and it's also human-readable. Usage could be as simple as

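import csv
# 'wb' is the mode the Python 2 csv module expects
with open('trialWrite1.py','wb') as fileobj:
    newFile = csv.writer(fileobj)
    newFile.writerow(firstNames)      # first row: all names
    newFile.writerow(midterm1Scores)  # second row: all scores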

However, it'd probably make more sense to write one student per row, including their name and score. That can be done like this:

import csv
from itertools import izip
with open('trialWrite1.py','wb') as fileobj:
    newFile = csv.writer(fileobj)
    # pair each name with its score and write one student per row
    for row in izip(firstNames, midterm1Scores):
        newFile.writerow(row)
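The code above targets Python 2. A rough Python 3 equivalent, including reading the rows back, might look like the sketch below; the sample data and file name are made up for illustration. In Python 3 the built-in `zip` replaces `izip`, and the file is opened in text mode with `newline=''`.

```python
import csv
import os
import tempfile

# Sample data standing in for the program's output (made up for illustration).
first_names = ["Ann", "Bob", "Cleo"]
midterm1_scores = [88, 92.5, 79]

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "midterm1.csv")

# Python 3: text mode with newline='' so the csv module controls line endings.
with open(path, "w", newline="") as fileobj:
    writer = csv.writer(fileobj)
    for row in zip(first_names, midterm1_scores):
        writer.writerow(row)  # one student per row: name, score

# csv returns every field as a string, so convert the scores back to numbers.
with open(path, newline="") as fileobj:
    rows = list(csv.reader(fileobj))
names_back = [name for name, _ in rows]
scores_back = [float(score) for _, score in rows]
```

Note that the scores come back as floats here; CSV itself does not record whether a value was an int or a float.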
David Zaslavsky
I think `writerow` expects a tuple not a stringified collection.
katrielalex
According to the documentation, it'll take any sequence, including lists or tuples.
David Zaslavsky
If I use the line `newFile.close()`, I get an error saying that " '_csv.writer' object has no attribute 'close' "
Curious2learn
@David Zaslavsky: For some reason I thought you'd passed it `str( firstNames )`. Maybe you edited before I saw it :/? @Curious2learn: close the file, not the writer.
katrielalex
@katrielalex: ah yes, you probably saw it before I'd edited. I had copied from the question and I forgot to fix that part at first.
David Zaslavsky
@Curious2learn: sorry about that, I always forget that CSV writers don't have `close` methods. I've edited the code and I think it should work now. The new way uses the `with` statement available in Python 2.6+ that automatically closes the file at the end of the block.
David Zaslavsky
A: 

For a completely different approach, consider that Python ships with SQLite. You could store your data in a SQL database without adding any third-party dependencies.
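A minimal sketch with the standard-library `sqlite3` module, assuming Python 3; the table name, column names, and sample data are made up for illustration, and an in-memory database is used so the example leaves nothing behind.

```python
import sqlite3

# Sample data standing in for the program's output (made up for illustration).
first_names = ["Ann", "Bob", "Cleo"]
midterm1_scores = [88, 92.5, 79]

# ':memory:' keeps the database in RAM; pass a file path to persist it on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE midterm1 (first_name TEXT, score REAL)")

# One row per student; '?' placeholders let sqlite3 handle quoting.
conn.executemany(
    "INSERT INTO midterm1 (first_name, score) VALUES (?, ?)",
    zip(first_names, midterm1_scores),
)
conn.commit()

# Read the rows back in insertion order and split them into two lists again.
rows = conn.execute(
    "SELECT first_name, score FROM midterm1 ORDER BY rowid"
).fetchall()
names_back = [row[0] for row in rows]
scores_back = [row[1] for row in rows]
conn.close()
```

The lists read back this way can be used like the originals, and queries such as filtering or sorting by score become one-line SQL statements.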

Just Some Guy
I need to use the data to plot graphs with matplotlib. I could import it from SQLite, but exporting and then re-importing it seems like too much work.
Curious2learn