views:

1543

answers:

6

Hello,

I want to do some basic filtering on a file. Read it, do processing, write it back.

I'm not looking for "golfing", but want the simplest and most elegant method to achieve this. I came up with:

from __future__ import with_statement

filename = "..." # or sys.argv...

with open(filename) as f:
    new_txt = # ...some translation of f.read() 

open(filename, 'w').write(new_txt)

The with statement makes things shorter since I don't have to explicitly open and close the file.

Any other ideas ?

+2  A: 

This seems to work:

with open(filename, "r+") as f:
    new_txt = process(f.read())
    f.truncate(0)
    f.write(new_txt)
Federico Ramponi
+2  A: 

I would go for elegance a different way: implement your file-reading and filtering operations as generators, You'll write more lines of code, but it will be more flexible, maintainable, and performant code.

See David M. Beazley's Generator Tricks for Systems Programmers, which is a really important thing for anyone who's writing this kind of code to read.

Robert Rossney
Excellent link -- thank you! I am a little concerned with the increased difficulty in debugging pipelines, but the power is undeniable.
Kevin Little
Test-driven development is your friend.
Robert Rossney
+1  A: 

If you're looking for the python equivalent of "perl -pi", here's a pretty good one:

import fileinput
for line in fileinput.input():
   # process line

See http://www.python.org/doc/2.5.2/lib/module-fileinput.html for more.

Done this way, you would use your python script in a pipe to create the new file:

$ myscript.py infile.txt > outfile.txt
It doesn't really help me though, since I want to write back to the same file. And redirection won't work this way for the same file
Eli Bendersky
+1  A: 

To do it in a way which won't eat your data if you crash in the middle:

from twisted.python.filepath import FilePath
p = FilePath(filename)
p.setContent(process(p.getContent()))
Glyph
+13  A: 

Actually an easier way using fileinput is to use the inplace parameter:

import fileinput
for line in fileinput.input (filenameToProcess, inplace=1):
    process (line)

If you use the inplace parameter it will redirect stdout to your file, so that if you do a print it will write back to your file.

This example adds line numbers to your file:

import fileinput

for line in fileinput.input ("b.txt",inplace=1):
    print "%d: %s" % (fileinput.lineno(),line),
Hortitude
Very nice, thanks for pointing this option out. You can also use the filelineno() function from fileinput to automatically have the line number, without counting it yourself.
Eli Bendersky
Oh, and you forgot the comma after the print - the code adds extra newlines :-)
Eli Bendersky
Thanks for catching that -- I have changed the example.
Hortitude
A: 

My ugly (but short as stated in the question) solution with generator expressions;

# Some setup first
file('test.txt', 'w').write('\n'.join('%05d' % i for i in range(100)))


# This is the filter function
def f(i):
    return i % 3


# This is the main part 
file('test2.txt', 'w').write('\n'.join(str(f(int(l))) for l in file('test.txt', 'r').readlines()))


# And a wrapper for sanity
def filter_file(infile, outfile, filter_function)
    outfile.write('\n'.join(filter_function(l) for l in infile.readlines()))
muhuk