
I have a bash shell script that loops through all child directories (but not files) of a certain directory. The problem is that some of the directory names contain spaces.

Here are the contents of my test directory:

$ ls -F test
Baltimore/  Cherry Hill/  Edison/  New York City/  Philadelphia/  cities.txt

And the code that loops through the directories:

for f in `find test/* -type d`; do
  echo $f
done

Here's the output:

test/Baltimore
test/Cherry
Hill
test/Edison 
test/New
York
City
test/Philadelphia

Cherry Hill and New York City are treated as 2 or 3 separate entries.

I tried quoting the filenames, like so:

for f in `find test/* -type d | sed -e 's/^/\"/' | sed -e 's/$/\"/'`; do
  echo $f
done

but to no avail.

There's got to be a simple way to do this. Any ideas?


The answers below are great. But to make this more complicated - I don't always want to use the directories listed in my test directory. Sometimes I want to pass in the directory names as command-line parameters instead.

I took Charles' suggestion of setting the IFS and came up with the following:

dirlist="${@}"
(
  [[ -z "$dirlist" ]] && dirlist=`find test -mindepth 1 -type d` && IFS=$'\n'
  for d in $dirlist; do
    echo $d
  done
)

and this works just fine unless there are spaces in the command line arguments (even if those arguments are quoted). For example, calling the script like this: test.sh "Cherry Hill" "New York City" produces the following output:

Cherry
Hill
New
York
City

Again, I know there must be a way to do this - I just don't know what it is...

+13  A: 

First, don't do it that way. The best approach is to use find -exec properly:

find test -type d -exec echo '{}' +

The next best approach is to use an IFS variable which doesn't contain the space character:

(
 IFS=$'\n'
 for N in $(find test -mindepth 1 -type d); do
   echo "$N"
 done
)

Finally, for the command-line parameter case, you should be using arrays.

for d in "$@"; do
  echo "$d"
done

will maintain separation. Note that the quoting (and the use of $@ rather than $*) is important. Arrays can be populated in other ways as well, such as glob expressions:

entries=( test/* )
for d in "${entries[@]}"; do
  echo "$d"
done
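
For completeness, a bash-specific sketch that combines the two ideas above - an array populated from NUL-delimited find output (GNU find's -print0) - so that even newlines in directory names survive:

entries=()
while IFS= read -r -d '' d; do
  entries+=("$d")       # append each NUL-terminated name to the array
done < <(find test -mindepth 1 -type d -print0)
for d in "${entries[@]}"; do
  echo "$d"
done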
Charles Duffy
didn't know about that '+' flavor for -exec. sweet
Johannes Schaub - litb
tho looks like it can also, like xargs, only put the arguments at the end of the given command :/ that's bugged me sometimes
Johannes Schaub - litb
I think -exec [name] {} + is a GNU and 4.4-BSD extension. (At least, it doesn't appear on Solaris 8, and I don't think it was in AIX 4.3 either.) I guess the rest of us may be stuck with piping to xargs...
crosstalk
I've never seen the $'\n' syntax before. How does that work? (I would have thought that either IFS='\n' or IFS="\n" would work, but neither does.)
MCS
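
For what it's worth, $'\n' is bash's ANSI-C quoting: backslash escapes inside $'...' get expanded, so $'\n' is a single literal newline character, whereas '\n' and "\n" are just the two characters backslash and n (which is why they don't work as an IFS value here). A quick illustration:

IFS=$'\n'   # IFS is now one newline character
IFS='\n'    # IFS is now the two characters \ and n - not what you want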
+3  A: 
find . -type d | while read file; do echo "$file"; done

However, it doesn't work if the filename contains newlines. The above is the only solution I know of when you actually want to have the directory name in a variable. If you just want to execute some command, use xargs.

find . -type d -print0 | xargs -0 echo 'The directory is: '
Johannes Schaub - litb
No need for xargs, see find -exec ... {} +
Charles Duffy
@Charles: for large numbers of files, xargs is much more efficient: it only spawns one process. The -exec option forks a new process for each file, which can be an order of magnitude slower.
Adam Rosenfield
I like xargs more. The two seem to do essentially the same thing, but xargs has more options, like running in parallel
Johannes Schaub - litb
Adam, no, the '+' form will aggregate as many filenames as possible and then execute. But it won't have such neat features as running in parallel :)
Johannes Schaub - litb
hands down, who will have \n in their dirnames anyway :p stone them ^^
Johannes Schaub - litb
@litb - people trying to take advantage of security flaws in your code, for one. Assuming folks will do the sane or reasonable thing is dangerous.
Charles Duffy
+2  A: 

This is exceedingly tricky in standard Unix, and most solutions run foul of newlines or some other character. However, if you are using the GNU tool set, then you can exploit the find option -print0 and use xargs with the corresponding option -0 (minus-zero). There are two characters that cannot appear in a simple filename; those are slash and NUL '\0'. Obviously, slash appears in pathnames, so the GNU solution of using a NUL '\0' to mark the end of the name is ingenious and fool-proof.

Jonathan Leffler
+2  A: 

To add to what Jonathan said: use the -print0 option for find in conjunction with xargs as follows:

find test/* -type d -print0 | xargs -0 command

That will execute the command command with the proper arguments; directories with spaces in them will be handled correctly (i.e., each will be passed in as a single argument).
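
For instance, with ls -ld standing in for command (ls -ld is only an illustrative choice; any command that accepts filename arguments would do):

find test/* -type d -print0 | xargs -0 ls -ld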

Adam Rosenfield
+2  A: 

Don't store lists as strings; store them as arrays to avoid all this delimiter confusion. Here's an example script that'll either operate on all subdirectories of test, or the list supplied on its command line:

#!/bin/bash
# arrays are a bash feature, so use bash rather than plain /bin/sh
if [ $# -eq 0 ]; then
        # if no args supplied, build a list of subdirs of test/
        dirlist=() # start with empty list
        for f in test/*; do # for each item in test/ ...
                if [ -d "$f" ]; then # if it's a subdir...
                        dirlist=("${dirlist[@]}" "$f") # add it to the list
                fi
        done
else
        # if args were supplied, copy the list of args into dirlist
        dirlist=("$@")
fi
# now loop through dirlist, operating on each one
for dir in "${dirlist[@]}"; do
        printf "Directory: %s\n" "$dir"
done

Now let's try this out on a test directory with a curve or two thrown in:

$ ls -F test
Baltimore/
Cherry Hill/
Edison/
New York City/
Philadelphia/
this is a dirname with quotes, lfs, escapes: "\''?'?\e\n\d/
this is a file, not a directory
$ ./test.sh 
Directory: test/Baltimore
Directory: test/Cherry Hill
Directory: test/Edison
Directory: test/New York City
Directory: test/Philadelphia
Directory: test/this is a dirname with quotes, lfs, escapes: "\''
'
\e\n\d
$ ./test.sh "Cherry Hill" "New York City"
Directory: Cherry Hill
Directory: New York City
Gordon Davisson
A: 

Just found out there are some similarities between my question and yours. Apparently, if you want to pass arguments into a command, e.g.

test.sh "Cherry Hill" "New York City"

then to print them out in order, use

for SOME_ARG in "$@"
do
    echo "$SOME_ARG";
done;

Notice that $@ is surrounded by double quotes.

Jeffrey04
A: 

I had to deal with whitespace in pathnames, too. What I finally did was use recursion and for item in "$path"/*:

function recursedir {
        local path=${1%/}       # strip a trailing slash, keep the variable local
        local item
        for item in "$path"/*   # quote $path so names with spaces stay intact
        do
                if [ -d "$item" ]
                then
                        recursedir "$item"
                else
                        command "$item" # placeholder: replace 'command' with whatever should run on each file
                fi
        done
}
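
Called on the test directory from the question - the quotes matter if the starting path itself contains spaces (and assuming the placeholder command has been replaced with something real):

recursedir test
recursedir "test/New York City"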
+1  A: 

Here is a simple solution which handles whitespace.

The test directory

$ ls -F test
Baltimore/  Cherry Hill/  Edison/  New York City/  Philadelphia/  cities.txt

The code to go into the directories

find test -type d | while read f ; do
  echo "$f"
done

And the output:

test/Baltimore
test/Cherry Hill
test/Edison
test/New York City
test/Philadelphia
cbliard
A: 

Just had a simple variant of this problem... converting files of type .flv to .mp3 (yawn).

find . -name '*.flv' | while read -r file; do ffmpeg -i "$file" -acodec copy "${file}.mp3"; done

This recursively finds all the Macintosh user's Flash files and turns them into audio (stream copy, no transcode). It's like the while loop above: the point is that reading the names with read, instead of just 'for file in ...', keeps names with spaces intact.

mark washeim