I'd like to learn to use Python as a replacement for command-line scripting. I spent some time with Python in the past, but it's been a while. This task seems to be within its scope.

I have several files in a folder, and I want to do a search-and-replace within all of them. I'd like to do it with a Python script.

For example, search and replace all instances of "foo" with "foobar".

A: 

Normally I'd whip out the old perl -pi -e 's/foo/foobar/g' for this, but if you want Python:

import os
import re
_replace_re = re.compile("foo")
# os.walk visits "directory/" and every subdirectory below it.
for dirpath, dirnames, filenames in os.walk("directory/"):
    for file in filenames:
        file = os.path.join(dirpath, file)
        tempfile = file + ".temp"
        # Write the modified copy to a temp file, one line at a time.
        with open(tempfile, "w") as target:
            with open(file) as source:
                for line in source:
                    line = _replace_re.sub("foobar", line)
                    target.write(line)
        # Swap the finished copy into place of the original.
        os.rename(tempfile, file)

And if you're on Windows, you'll need to add an os.remove(file) before the os.rename(tempfile, file).
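
As a hedged variation on the above: on Python 3.3 or later, `os.replace` overwrites an existing destination on Windows as well, so the extra `os.remove` is not needed. The sketch below also lets `tempfile.mkstemp` pick an unused temp name in the same directory. The function name `replace_in_file` is made up for illustration, and the sketch assumes the files are text in the default encoding.

```python
import os
import re
import tempfile


def replace_in_file(path, pattern, replacement):
    """Rewrite `path` with every regex match of `pattern` replaced."""
    regex = re.compile(pattern)
    directory = os.path.dirname(os.path.abspath(path))
    # mkstemp creates a new, uniquely named file in the same directory,
    # so an existing "<name>.temp" file is never clobbered.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".temp")
    try:
        with os.fdopen(fd, "w") as target, open(path) as source:
            for line in source:
                target.write(regex.sub(replacement, line))
        # os.replace overwrites the destination on Windows and POSIX alike.
        os.replace(temp_path, path)
    except BaseException:
        # Clean up the partial temp file and leave the original untouched.
        os.remove(temp_path)
        raise
```

Note that `mkstemp` creates the file with restrictive permissions, so the rewritten file does not inherit the original's mode; copying it over (for example with `shutil.copymode`) would be a separate step.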

David Wolever
Also, it might be good to put in a little check to verify that the `tempfile` doesn't already exist…
David Wolever
This seems to make sense. Is the act of creating the temp file just so if permissions don't suffice, we can still perform the action? In that case, the remove and rename won't work either, correct?
fruit
The tempfile makes sure that we don't overwrite the real file too early, and that we don't use up lots of memory on a large file. The naive way to do it would be something like `data = open(file).read(); data = _replace_re.sub("foobar", data); open(file, "w").write(data)`, but that would use lots of memory and, if the computer crashed halfway through the `write`, you'd lose the unwritten data.
David Wolever
+1  A: 

I worked through it and this seems to work, but any errors that can be pointed out would be awesome.

import fileinput, sys, os

def replaceAll(file, findexp, replaceexp):
    # With inplace=1, fileinput redirects stdout into the file being read.
    for line in fileinput.input(file, inplace=1):
        if findexp in line:
            line = line.replace(findexp, replaceexp)
        sys.stdout.write(line)

if __name__ == '__main__':
    folder = "C:/testing/"
    for file in os.listdir(folder):
        newfile = os.path.join(folder, file)
        # os.listdir also returns subfolders; only rewrite regular files.
        if os.path.isfile(newfile):
            replaceAll(newfile, "black", "white")

An expansion on this would be to recurse into folders within folders.
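
One possible way to do that, as a rough sketch: let `os.walk` produce every file path under a top-level folder and feed each one to the same `fileinput` approach. The names `iter_files` and `replace_all_in_tree` are invented for this example, and it assumes every file in the tree is a text file.

```python
import fileinput
import os
import sys


def iter_files(top):
    """Yield the path of every file under `top`, including subfolders."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            yield os.path.join(dirpath, name)


def replace_all_in_tree(top, find_text, replace_text):
    """Replace `find_text` with `replace_text` in every file under `top`."""
    for path in iter_files(top):
        # With inplace=True, whatever is written to stdout lands in the file.
        with fileinput.input(path, inplace=True) as lines:
            for line in lines:
                sys.stdout.write(line.replace(find_text, replace_text))
```

Calling `replace_all_in_tree("C:/testing/", "black", "white")` would then cover the subfolders too.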

fruit
What you might want to do is change that to `replaceAll(file, "black", "white")` - as it stands if you ever have `somedir/blackdir/blackfile.txt` then you'll get `somedir/whitedir/whitefile.txt`. Unless of course you want that, in which case leave it just how you have it.
Wayne Werner
Why would this function rename files? It searches each file line by line.
fruit
+2  A: 

Welcome to StackOverflow. Since you want to learn yourself (+1) I'll just give you a few pointers.

Check out os.walk() to get at all the files.

Then iterate over each line in the files (for line in currentfile: comes in handy here).

Now you need to know if you want a "stupid" replace (find/replace each foo even if it's in the middle of a word; say foobar, do you want foobarbar as a result?) or a smart replace.

For the former, look at str.replace(), for the latter, look at re.sub() and figure out what r'\bfoo\b' means.
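
To make the difference concrete, here is a small sketch using the question's foo/foobar example; the sample string is made up.

```python
import re

text = "foo foobar"

# Plain substring replace: every "foo" changes, even the one inside "foobar".
naive = text.replace("foo", "foobar")

# \b matches a word boundary, so only the standalone word "foo" changes.
whole_word = re.sub(r"\bfoo\b", "foobar", text)

print(naive)       # foobar foobarbar
print(whole_word)  # foobar foobar
```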

Tim Pietzcker
Very cool, thanks! Learning about new functions (os.walk()) is always good. Does it traverse subdirectories, as well? I'm assuming your link will tell me.
fruit
Yes it does, and yes it does :)
Tim Pietzcker
A: 

This is an alternative, since you already have various Python solutions. The most useful utilities (in my opinion) on Unix/Windows are the GNU find command and replacement tools such as sed/awk. To search for files recursively and do the replacement, a simple command like this does the trick (the syntax is from memory and untested). It finds all text files and changes the word "old" to "new" in their contents, while having sed back up the original files:

$ find /path -type f -iname "*.txt" -exec sed -i.bak 's/old/new/g' "{}" +;
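
For comparison, a rough Python counterpart of that command might look like the sketch below. It is an approximation, not an exact equivalent: it does a literal string replace rather than a sed regex, and the function name `replace_in_txt_files` is hypothetical. It relies on the `backup` parameter of `fileinput.input` to keep a `.bak` copy of each original.

```python
import fileinput
import sys
from pathlib import Path


def replace_in_txt_files(top, old, new):
    """Replace `old` with `new` in every *.txt file under `top`, keeping .bak copies."""
    # Compare the lowercased suffix to mimic find's case-insensitive -iname.
    paths = [str(p) for p in Path(top).rglob("*")
             if p.is_file() and p.suffix.lower() == ".txt"]
    if not paths:
        # fileinput falls back to standard input when given no files.
        return []
    # backup=".bak" leaves the original next to each file, like sed -i.bak.
    with fileinput.input(paths, inplace=True, backup=".bak") as lines:
        for line in lines:
            sys.stdout.write(line.replace(old, new))
    return paths
```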
ghostdog74