I'd like to run a program on a directory of files. I know how to do this with one file:
cat myFile.xml | myProgram.py
How can I run myProgram.py over a folder, say myFolder?
Thanks!
Assuming your program can accept a filename as its first command-line argument, one way is to use find to list all the files in the folder, and then use xargs to run your program on each of them:
find myFolder | xargs -n 1 myProgram.py
The -n 1 means "run the program once per file". If your program is happy to receive multiple filenames on its command line, you can omit the -n 1, and xargs will run your program fewer times, with several files on each command line.
(find does a recursive search, so you'll get all the files in and under myFolder. You can use find myFolder -maxdepth 1 to prevent that.)
(Thanks to @Personman for pointing out that this will run the program for the folder itself as well as the files. You can use find myFolder -type f to tell find to only return regular files.)
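To illustrate the find and xargs approach, here is a minimal sketch that wraps it in a shell function. The name run_on_files is made up for this example, and it assumes your find supports -print0 and your xargs supports -0 (the GNU and BSD versions do), which keeps filenames containing spaces intact. It uses -type f so the folder itself is skipped.

```shell
# run_on_files DIR CMD [ARGS...]   (hypothetical helper name)
# Run CMD once per regular file in and under DIR,
# passing the file's path as the last argument.
# Assumes find has -print0 and xargs has -0 (GNU and BSD do).
run_on_files() {
    dir=$1
    shift
    # -type f    : regular files only, so DIR itself is not passed along
    # -print0/-0 : separate names with NUL bytes so spaces are safe
    # -n 1       : one file per invocation of CMD
    find "$dir" -type f -print0 | xargs -0 -n 1 "$@"
}
```

For example, run_on_files myFolder myProgram.py would run myProgram.py once for each regular file in and under myFolder.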
Or cat *.xml | myProgram.py, which concatenates the contents of every .xml file and pipes the result to your program's stdin. This combines all the files into one stream.
myProgram.py *.xml passes every filename to your program as a separate argument. The shell expands the pattern like this: myProgram.py file1.xml file2.xml file3.xml ... filen.xml
Each file remains separate and the script can tell one from another.
In the basic case, Python, Perl, and sh scripts usually handle that the same way as myProgram.py file1.xml; myProgram.py file2.xml; myProgram.py filen.xml, where the ; separates one command from the next.
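Since the question pipes a single file into the program on stdin, a loop can do the same thing once per file instead of combining everything into one stream. This is only a sketch: the function name each_on_stdin is hypothetical, and it looks only at .xml files directly inside the folder (no recursion).

```shell
# each_on_stdin DIR CMD [ARGS...]   (hypothetical helper name)
# Run CMD once per .xml file directly inside DIR,
# with that file connected to CMD's stdin.
each_on_stdin() {
    dir=$1
    shift
    for f in "$dir"/*.xml; do
        # Skip anything that is not a regular file, including the
        # unexpanded pattern itself when no .xml files exist.
        [ -f "$f" ] || continue
        "$@" < "$f"
    done
}
```

For example, each_on_stdin myFolder myProgram.py would have the same effect as running cat file | myProgram.py separately for each .xml file in myFolder.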
Play with it and welcome to Unix!