I have split a large text file into several sets of smaller files for some performance testing I'm doing. There are a number of directories like this:

/home/brianly/output-02 (contains 2 files myfile.chunk.00 and myfile.chunk.01)
/home/brianly/output-04 (contains 4 files...)
/home/brianly/output-06 (contains 6 files...)

Note that each directory contains an increasing number of files. I need to run an executable against each of the text files in these output directories. Against a single file, the command looks something like this:

./myexecutable -i /home/brianly/output-02/myfile.chunk.00 -o /home/brianly/output-02/myfile.chunk.00.processed

Here the -i parameter is the input file and the -o parameter is the output location.

In C# I'd loop over the directories, get the list of files in each folder, then loop over those files to run the command lines. How do I traverse a directory structure like this using bash and execute the command with the correct parameters based on the location and the files in that location?

A: 

That's what the find command is for.

http://linux.die.net/man/1/find
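
For example, listing the chunk files under the directory layout from the question might look like this (a sketch; the -name pattern is an assumption based on the filenames shown above, and it deliberately does not match the .processed outputs):

find /home/brianly -type f -name 'myfile.chunk.??'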

S.Lott
A: 

Something like:

for x in `find /home/brianonly -type f`
do
    ./yourexecutable -i "$x" -o "$x.processed"
done
nikudesu
there are back-ticks before find and after f
nikudesu
for x in $(find /home/brianonly -type f); do ... done
$() is _so_ much more readable than back-ticks. My $.02.
Kevin Little
I think yours would try to process the processed files.
Dennis Williamson
Dennis, you're right. I'm standing by my disclaimer of "something like" :)
nikudesu
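
To avoid picking up the already-processed outputs, the find expression can exclude them (a sketch along the same lines, assuming the paths from the question):

for x in $(find /home/brianly -type f ! -name '*.processed')
do
    ./myexecutable -i "$x" -o "$x.processed"
done

This still relies on the filenames containing no whitespace, which holds for the chunk names shown in the question.
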
A: 

Use find and exec. Have a look at the following:

http://tldp.org/LDP/abs/html/moreadv.html

Methos
+1  A: 

As others have suggested, use find(1):

# Find all files named 'myfile.chunk.*' but NOT named 'myfile.chunk.*.processed'
# under the directory tree rooted at base-directory, and execute a command on
# them:
find base-directory -name 'myfile.chunk.*' '!' -name 'myfile.chunk.*.processed' -exec ./myexecutable -i '{}' -o '{}'.processed ';'
Adam Rosenfield
+2  A: 

For this kind of thing I always use find together with xargs:

$ find output-* -name "*.chunk.??" | xargs -I{} ./myexecutable -i {} -o {}.processed

Since your executable processes only one file at a time, using -exec (or -execdir) directly with find, as already suggested, is just as efficient. I'm used to xargs, though, because it is generally much more efficient when feeding a command that operates on many arguments at once. That makes it a very useful tool to keep in one's utility belt, so I thought it ought to be mentioned.
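
To illustrate that batching behaviour, a command that accepts many file arguments needs only a handful of invocations for the whole set (a sketch; wc is just a stand-in command here):

find output-* -name "*.chunk.??" | xargs wc -l

Without -I{}, xargs appends as many file names as fit onto each wc command line, so thousands of files may result in only a few processes.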

Lars Haugseth
+1  A: 

From the information provided, it sounds like this would be a completely straightforward translation of your C# idea.

for i in /home/brianly/output-*; do
    for j in "$i/"*.[0-9][0-9]; do
        ./myexecutable -i "$j" -o "$j.processed"
    done
done
ephemient