ansaurus

Question

How to write a tag deleter script in Python

Answer 1

+3 A:

The general solution would be to:

Use the os.walk() function to traverse the directory tree.
Iterate over the filenames and use fn_name.endswith('.cpp') with if/elif to determine which kind of file you're working with.
Use the re module to create a regular expression that determines whether a line contains your tag.
Open the target file and a temporary file (use the tempfile module). Iterate over the source file line by line and write the filtered lines to the temporary file.
If any lines were replaced, use os.unlink() plus os.rename() to replace the original file.
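
Put together, the steps above might look roughly like the following. This is only a sketch under assumptions: the tag format ($Something$), the extension list, and the function names strip_tags and strip_tags_in_tree are all made up for illustration, so adjust them to your real tags and files.

```python
import os
import re
import tempfile

# Assumption: a "tag" looks like $Something$; change the pattern to match yours.
TAG_RE = re.compile(r'\$[^$]+\$')
# str.endswith() accepts a tuple of suffixes.
EXTENSIONS = ('.cpp', '.h')


def strip_tags(path, tag_re=TAG_RE):
    """Rewrite path without tags; return True if the file was changed."""
    changed = False
    # Create the temporary file next to the original so that os.rename()
    # never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or '.')
    with os.fdopen(fd, 'w') as tmp, open(path, 'r') as src:
        for line in src:
            new_line = tag_re.sub('', line)
            changed = changed or new_line != line
            tmp.write(new_line)
    if changed:
        # Replace the original only when something was actually removed.
        os.unlink(path)
        os.rename(tmp_path, path)
    else:
        os.unlink(tmp_path)
    return changed


def strip_tags_in_tree(top):
    """Walk top and strip tags from matching files; return the changed paths."""
    changed_files = []
    for root, dirs, files in os.walk(top):
        for name in files:
            if name.endswith(EXTENSIONS):
                full_name = os.path.join(root, name)
                if strip_tags(full_name):
                    changed_files.append(full_name)
    return changed_files
```

Calling strip_tags_in_tree('.') would process the current directory. Note that tempfile.mkstemp() creates a file readable only by its owner, so a rewritten file ends up with those permissions unless you restore the original mode yourself.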

It's a trivial exercise for a Python adept, but for someone new to the language it'll probably take a few hours to get working. You probably couldn't ask for a better task to get introduced to the language, though. Good luck!

----- Update -----

The files attribute returned by os.walk is a list so you'll need to iterate over it as well. Also, the files attribute will only contain the base name of the file. You'll need to use the root value in conjunction with os.path.join() to convert this to a full path name. Try doing just this:

import os

for root, d, files in os.walk('.'):
    for base_filename in files:
        full_name = os.path.join(root, base_filename)
        if full_name.endswith('.h'):
            print(full_name + ' is a header!')
        elif full_name.endswith('.cpp'):
            print(full_name + ' is a C++ source file!')

Each print() call here passes a single string, so the code runs unchanged on Python 2 and Python 3; the general idea is the same either way.
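
If you want to test for several extensions in one call, str.endswith() also accepts a tuple of suffixes. A single comma-separated string such as '.cpp,.h' will not do that; it only matches names that literally end in that text. A minimal sketch, where SOURCE_SUFFIXES and is_source_file are hypothetical names chosen for the example:

```python
# Hypothetical names for illustration. endswith() needs a tuple here;
# passing a list raises TypeError.
SOURCE_SUFFIXES = ('.cpp', '.h')


def is_source_file(filename):
    """Return True when filename ends with any suffix in SOURCE_SUFFIXES."""
    return filename.endswith(SOURCE_SUFFIXES)


print(is_source_file('main.cpp'))   # True
print(is_source_file('widget.h'))   # True
print(is_source_file('notes.txt'))  # False
```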

Rakis 2010-10-04 16:07:38

Nice! Do you know any good websites with simple examples? Thanks :)

andofor 2010-10-05 06:37:13

Is it possible to do something like fn_name.endswith('.cpp,.h')? Where do I define fn_name?

andofor 2010-10-05 08:02:29

Updated to provide a bit more detail. Unfortunately, a quick Google search didn't turn up any straightforward examples. You'll have to play with this a bit to get it figured out. Learning the Python basics is well worth your time, though; it'll pay off down the line.

Rakis 2010-10-05 12:44:25

Answer 2

+1 A:

Try something like this:

import os
import re

# Python's re module only supports fixed-width look-behind, so the comment
# prefix is captured in group 1 and written back by the substitution.
CPP_TAG_RE = re.compile(r'(// *)\$[^$]+\$')

tag_REs = {
    '.h': CPP_TAG_RE,
    '.cpp': CPP_TAG_RE,
    '.xml': re.compile(r'(<!-- *)\$[^$]+\$(?= *-->)'),
    '.txt': re.compile(r'(# *)\$[^$]+\$'),
}

def process_file(filename, regex):
    # Set up.
    tempfilename = filename + '.tmp'

    # Filter the file: drop the tag, keep the comment prefix (group 1).
    # The with statement closes both files when the loop finishes.
    with open(filename, 'r') as infile, open(tempfilename, 'w') as outfile:
        for line in infile:
            outfile.write(regex.sub(r'\1', line))

    # Enable only one of the two following lines.
    os.rename(filename, filename + '.orig')
    #os.remove(filename)

    os.rename(tempfilename, filename)

def process_tree(starting_point=os.curdir):
    for root, d, files in os.walk(starting_point):
        for filename in files:
            # Get rid of `.lower()` in the following if case matters.
            ext = os.path.splitext(filename)[1].lower()
            if ext in tag_REs:
                process_file(os.path.join(root, filename), tag_REs[ext])

A nice thing about os.path.splitext is that it does the right thing for filenames that start with a dot: the leading dot is not treated as the start of an extension.
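
For illustration, here is how os.path.splitext() treats a few filenames. The names are made up for the example, and get_ext is a hypothetical helper that mirrors the lower-casing used for the dictionary lookup:

```python
import os


def get_ext(filename):
    """Return the lower-cased extension, or '' if there is none."""
    return os.path.splitext(filename)[1].lower()


print(os.path.splitext('main.cpp'))        # ('main', '.cpp')
print(os.path.splitext('.hidden'))         # ('.hidden', '')
print(os.path.splitext('.hidden.txt'))     # ('.hidden', '.txt')
print(os.path.splitext('archive.tar.gz'))  # ('archive.tar', '.gz')
print(get_ext('WIDGET.H'))                 # .h
```

Only the last dot-separated component counts as the extension, so a dotfile with no other dot in its name has an empty extension and is skipped by an extension lookup.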

Mike DeSimone 2010-10-05 13:07:52
