
I have documents stored in a file system which includes "daily" directories, e.g. 20050610. In a bash script I want to list the files in a month's worth of these directories, so I'm running a find command: `find <path>/200506* -type f >> jun2005.lst`. I would like to check that this set of directories is not a null set before executing the find command. However, if I use `if [ -d 200506* ]` I get a "too many arguments" error. How can I get around this?

A: 
S=$(echo 200506*)

if [ "$S" != "200506*" ]; then
    echo haz filez!
else
    echo no filez
fi

Not a very elegant one, but it works without any external tools/commands (if you don't think of `[` as an external one).

The trick is that if any files match, the `S` variable will contain their names delimited by spaces; otherwise it will contain the string `200506*` itself.
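For illustration, the two cases might look like this (a sketch; the directory names are made up):

$ S=$(echo 200506*)    # in a directory with no matches
$ echo "$S"
200506*
$ mkdir 20050601 20050602
$ S=$(echo 200506*)
$ echo "$S"
20050601 20050602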

zed_0xff
A: 

You could use `ls` like this:

  if [ -n "$(ls -d | grep 200506)" ]; then
        # There are directories with this pattern
  fi
Bobby
Or maybe there are regular files with that pattern.
nc3b
@nc3b: Good call, `-d` should help, I think...
Bobby
`-d` doesn't restrict `ls` to listing only directories; it just prevents it from listing their contents, so `ls -d` lists nothing at all. You probably mean `ls -d */`.
Dennis Williamson
+1  A: 

Because there is a limit on command-line length in most shells, anything like `$(ls -d | grep 200506)` or `/path/200506*` runs the risk of overflowing that limit. I'm not sure whether substitutions and glob expansions count towards it in bash, but I assume so; you would have to test it and check the bash docs and source to be sure.
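For reference, on GNU/Linux you can check the relevant limit directly (assuming the GNU userland):

$ getconf ARG_MAX                   # kernel limit on an exec'd command line
$ xargs --show-limits </dev/null    # GNU xargs reports the limits it honors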

The answer is in simplifying your question.

find <path>/200506* -type f -exec somescript '{}' \;

where `somescript` is a shell script that does the test. Something like this, perhaps:

#!/bin/sh
# find -exec ... \; passes a single path per invocation; confirm it is
# a regular file before appending it to the list
[ -f "$1" ] && echo "$1" >> june2005.lst

Passing the name june2005.lst into the script (advice: use an environment variable), and dealing with the possibility that 200506* may expand to too huge a file path, are left as an exercise for the OP ;)
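A minimal sketch of that environment-variable advice (the name `OUTFILE` is an assumption, not from the original):

#!/bin/sh
# somescript, reading the list name from the environment
# instead of hard-coding it
[ -f "$1" ] && echo "$1" >> "${OUTFILE:?set OUTFILE to the list file}"

It would then be invoked as:

OUTFILE=june2005.lst find <path>/200506* -type f -exec ./somescript '{}' \;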

Integrating the whole thing into a pipeline, or switching to a more general scripting language, would yield performance gains by minimizing the number of shells spawned. Now that would be fun. Here is a hint for that: pipe the find output through another program (awk, perl, etc.) that does the test as a one-line filter, and keep the `>> june2005.lst` on the find command line.
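Such a filter might look like this (a sketch, untested; it applies the same regular-file test as somescript, but in a single perl process rather than one shell per file):

find <path>/200506* -type f | perl -lne 'print if -f' >> june2005.lst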

TerryP
+1  A: 

Your "too many arguments" error does not come from there being a huge number of files and exceeding the command line argument limit. It comes from having more than one or two directories that match the glob. Your glob "200506*" expands to something like "20050601 20050602 20050603..." and the -d test only expects one argument.

$ mkdir test
$ cd test
$ mkdir a1
$ [ -d a* ]    # no error
$ mkdir a2
$ [ -d a* ]
-bash: [: a1: binary operator expected
$ mkdir a3
$ [ -d a* ]
-bash: [: too many arguments

The answer by zed_0xff is on the right track, but I'd use a different approach:

shopt -s nullglob
path='/path/to/dirs'
glob='200506*/'
outfile='jun2005.lst'
dirs=("$path"/$glob)  # dirs is an array available to be iterated over if needed
if (( ${#dirs[@]} > 0 ))
then
    echo "directories found"
    # append may not be necessary here 
    find "$path"/$glob -type f >> "$outfile"
fi

The position of the quotes in `"$path"/$glob` versus `"$path/$glob"` is essential to making this work: the glob must stay outside the quotes so that the shell performs pathname expansion on it.
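For example (a sketch; the directory names are made up):

$ ls -d "$path"/$glob      # glob unquoted: expands to the matching directories
/path/to/dirs/20050601/  /path/to/dirs/20050602/
$ ls -d "$path/$glob"      # glob quoted: passed to ls literally
ls: cannot access '/path/to/dirs/200506*/': No such file or directory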

Edit:

Corrections made to exclude files that match the glob (so only directories are included) and to handle the very unusual case of a directory named literally like the glob ("200506*").

Dennis Williamson
Noted that `-d` expects one argument. As the directories being tested are "daily", there can be up to 31 in a month. I have several different "kinds" of documents and run the find commands sequentially: `find <path>/a/200506* >> outfile`, `find <path>/b/200506* >> outfile`, `find <path>/c/200506* >> outfile`. This means that if one in the middle fails, I'm assuming the following ones are not executed and the whole script fails. Is this correct? Dennis' solution will work in the context of my script. FYI, no files match the glob in name and no directory exactly matches the glob.
Jim Jones
@Jim Jones: I'm not sure I follow you. What you can do is something like: `globs=('200506*/' '200507*/' '200508*/'); outfiles=("jun2005" "jul2005" "aug2005"); subdirs=(a b c); for glob in "${globs[@]}"; do outfilebase=${outfiles[i++]}; for subdir in "${subdirs[@]}"; do path="/top/of/path/$subdir"; outfile="$outfilebase.$subdir.lst"; dirs=... [as above] ... if ... find ... else echo "No directories found for $glob under $subdir"; fi; done; done` [untested]
Dennis Williamson
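A readable expansion of the loop sketched in that comment might look like this (still untested, as in the original; the top-level path is a placeholder):

shopt -s nullglob
globs=('200506*/' '200507*/' '200508*/')
outfiles=('jun2005' 'jul2005' 'aug2005')
subdirs=(a b c)
i=0
for glob in "${globs[@]}"; do
    outfilebase=${outfiles[i++]}       # arithmetic in the subscript advances i
    for subdir in "${subdirs[@]}"; do
        path="/top/of/path/$subdir"
        outfile="$outfilebase.$subdir.lst"
        dirs=("$path"/$glob)           # empty array when nothing matches, thanks to nullglob
        if (( ${#dirs[@]} > 0 )); then
            find "$path"/$glob -type f >> "$outfile"
        else
            echo "No directories found for $glob under $subdir"
        fi
    done
done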
A: 
prefix="/tmp/path"
glob="200611*"
n_dirs=$(find "$prefix" -maxdepth 1 -type d -wholename "$prefix/$glob" | wc -l)
if [[ $n_dirs -gt 0 ]]; then
    find "$prefix" -maxdepth 2 -type f -wholename "$prefix/$glob"
fi
Jürgen Hötzel