views: 1228
answers: 5

I want to write a program that does the following: in a folder I have n files. Read the first file, perform some operation, and store the result in a separate file; then read the 2nd file, perform the operation again, and save the result in a new 2nd file; and so on for all n files. The program should read the files one by one and store the results of each file separately, in Python. Please give examples of how I can do this. Thanks.

+4  A: 
import sys

# argv holds the command-line arguments; argv[0] is the program name, so skip it
for n in sys.argv[1:]:
    print(n)  # print the filename we are currently processing
    infile = open(n, "r")
    outfile = open(n + ".out", "w")   # results for this file go to <name>.out
    # do some processing
    infile.close()
    outfile.close()

Then call it like:

./foo.py bar.txt baz.txt
Matthew Scharley
print(n) to make it Python 3 compatible. It still works with at least 2.4 as well.
Bernard
I'm running 2.3 from memory, but thanks for the heads up.
Matthew Scharley
It may be overkill for a single argument, but I'd recommend the optparse module for command-line parsing. It takes care of ugly tasks like handling quotes, etc.
monkut
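
For reference, a minimal sketch of what the optparse suggestion might look like here (the -s/--suffix option is purely illustrative):

from optparse import OptionParser

parser = OptionParser(usage="usage: %prog [options] file1 [file2 ...]")
parser.add_option("-s", "--suffix", dest="suffix", default=".out",
                  help="suffix appended to each output filename")
options, args = parser.parse_args()

if not args:
    parser.error("no input files given")    # prints the usage message and exits

for filename in args:                       # the positional arguments are the input files
    outfile = open(filename + options.suffix, "w")
    # do some processing
    outfile.close()
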
+2  A: 

You may find the fileinput module useful. It is designed for exactly this problem.

fivebells
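
A rough sketch of how fileinput could be used for this, assuming the same name-plus-".out" convention as the other answers:

import fileinput
import sys

output = None
for line in fileinput.input(sys.argv[1:]):
    if fileinput.isfirstline():            # we have just moved on to a new input file
        if output is not None:
            output.close()
        output = open(fileinput.filename() + ".out", "w")
    output.write(line)                     # replace with real processing
if output is not None:
    output.close()
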
A: 

I think what you're missing is how to retrieve all the files in that directory. To do so, use the glob module. Here is an example that will duplicate all files with the extension .txt into files with the extension .out:

import glob

list_of_files = glob.glob('./*.txt')            # create the list of files
for file_name in list_of_files:
    FI = open(file_name, 'r')
    FO = open(file_name.replace('txt', 'out'), 'w')
    for line in FI:
        FO.write(line)

    FI.close()
    FO.close()
Mapad
One small issue with this example. What happens if I have a file called 'mytxtfile.txt'?
Matthew Scharley
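
One way around that, for what it's worth, is to build the output name with os.path.splitext instead of str.replace:

import os.path

root, ext = os.path.splitext(file_name)    # 'mytxtfile.txt' -> ('mytxtfile', '.txt')
FO = open(root + '.out', 'w')              # only the extension is changed
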
A: 

A combined answer that accepts either a directory or a specific list of filenames as arguments:

import sys
import os.path
import glob

def processFile(filename):
    fileHandle = open(filename, "r")
    for line in fileHandle:
        # do some processing
        pass
    fileHandle.close()

def outputResults(filename):
    output_filemask = "out"
    fileHandle = open("%s.%s" % (filename, output_filemask), "w")
    # do some processing
    fileHandle.write('processed\n')
    fileHandle.close()

def processFiles(args):
    input_filemask = "log"
    directory = args[1]
    if os.path.isdir(directory):
        print("processing a directory")
        list_of_files = glob.glob('%s/*.%s' % (directory, input_filemask))
    else:
        print("processing a list of files")
        list_of_files = args[1:]

    for file_name in list_of_files:
        print(file_name)
        processFile(file_name)
        outputResults(file_name)

if __name__ == '__main__':
    if len(sys.argv) > 1:
        processFiles(sys.argv)
    else:
        print('usage message')
michaeljoseph
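
Assuming the script above is saved as, say, process_files.py (the name is chosen here only for illustration), it can be invoked either way:

python process_files.py /path/to/log/directory
python process_files.py one.log two.log three.log
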
A: 

I've just recently learned of os.walk(), and it may help you here. It allows you to walk down a directory tree structure.

import os

OUTPUT_DIR = 'C:\\RESULTS'        # assumes this directory already exists
for path, dirs, files in os.walk('.'):
    for name in files:
        read_f = open(os.path.join(path, name), 'r')
        write_f = open(os.path.join(OUTPUT_DIR, name), 'w')

        # Do stuff

        read_f.close()
        write_f.close()
monkut