What is the best way to choose a random file from a directory in a shell script?

Here is my solution in Bash but I would be very interested for a more portable (non-GNU) version for use on Unix proper.

dir='some/directory'
file=`/bin/ls -1 "$dir" | sort --random-sort | head -1`
path=`readlink --canonicalize "$dir/$file"` # Converts to full path
echo "The randomly-selected file is: $path"

Anybody have any other ideas?

Edit: lhunath makes a good point about parsing ls. I guess it comes down to whether you want to be portable or not. If you have the GNU findutils and coreutils then you can do:

find "$dir" -maxdepth 1 -mindepth 1 -type f -print0 \
  | sort --zero-terminated --random-sort \
  | sed 's/\d000.*//g'

Whew, that was fun! It also matches my question better, since I said "random file". Honestly though, these days it's hard to imagine a Unix system deployed out there having GNU installed but not Perl 5.

+2  A: 

Something like:

files=("$dir"/*)
let x="$RANDOM % ${#files[@]}"
echo "The randomly-selected file is ${files[$x]}"

$RANDOM in Bash is a special variable that returns a random integer from 0 to 32767. Take it modulo the number of array elements to get a valid index, then index into the array.

MGoDave
Poster wants a solution with no Bash-isms.
ashawley
I guess next time I'll read the whole question.
MGoDave
@MGoDave don't feel too bad. I am always interested in a good Bash solution and a good GNU-free solution, for different situations and as a mental exercise.
jhs
+1  A: 

This boils down to: How can I create a random number in a Unix script in a portable way?

Because if you have a random number k between 1 and N, you can use head -$k | tail -1 to pick out the k-th of N lines. Unfortunately, I know of no portable way to get that number with the shell alone. If you have Python or Perl, you can easily use their random-number support, but AFAIK there is no standard rand(1) command.
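A minimal sketch of that idea, assuming a POSIX awk is available to supply the random number. The helper names random_int and nth_line are made up for illustration. Note that awk's srand() seeds from the time of day, so two calls within the same second will typically return the same value.

```shell
#!/bin/sh
# random_int N: print a random integer between 1 and N (hypothetical helper).
random_int() {
    awk -v n="$1" 'BEGIN { srand(); print int(rand() * n) + 1 }'
}

# nth_line K: print the K-th line of standard input (hypothetical helper).
# If the input has fewer than K lines, the last line is printed instead.
nth_line() {
    head -n "$1" | tail -n 1
}
```

Usage would look like: printf '%s\n' one two three | nth_line "$(random_int 3)"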

Aaron Digulla
That's a good point. Is `ls -1` standard on Unix, or is that just GNU? Anyway yes the biggest problem is getting a random number. I would argue that Perl is pretty universal since it's been shipping standard since IIRC Solaris 2.6 and HP-UX 11i
jhs
-1 as an argument to ls is standard in SUS2 (http://opengroup.org/onlinepubs/007908799/xcu/ls.html). I don't know when it was added, but I believe it was available back in the POSIX days as well.
Chas. Owens
@Chas thanks for the link. Still, Aaron has a point that filenames with newlines could cause problems. So that could be relevant depending on whether and how you let "civilians" create files directly on the filesystem.
jhs
+3  A: 
files=(/my/dir/*)
printf "%s\n" "${files[RANDOM % ${#files[@]}]}"

And don't parse ls. Read http://mywiki.wooledge.org/ParsingLs

Edit: Good luck finding a non-bash solution that's reliable. Most will break for certain types of filenames, such as filenames with spaces or newlines or dashes (it's pretty much impossible in pure sh). To do it right without bash, you'd need to fully migrate to awk/perl/python/... without piping that output for further processing or such.
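For comparison, here is a sketch of one way to avoid parsing ls in plain sh: let the shell expand the glob into the positional parameters and pick one by index. It assumes a POSIX awk for the random number, and pick_random_file is a hypothetical name. It is not a complete answer to the problems above: hidden files are skipped, and the printed path is still newline-terminated, so it cannot safely be piped onward if names may contain newlines.

```shell
#!/bin/sh
# pick_random_file DIR: print the path of one randomly chosen entry in DIR.
# Hypothetical helper; dotfiles are not matched by the glob.
pick_random_file() {
    set -- "$1"/*              # expand the glob into the positional parameters
    # An unmatched glob stays literal and names nothing, so give up.
    # (This also gives up if the first entry is a broken symlink.)
    [ -e "$1" ] || return 1
    n=$(awk -v n="$#" 'BEGIN { srand(); print int(rand() * n) + 1 }')
    shift $((n - 1))           # discard the first n-1 entries
    printf '%s\n' "$1"         # the n-th entry is now $1
}
```

Because every expanded name is prefixed with the directory, entries beginning with a dash are not mistaken for options unless the directory argument itself starts with one.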

lhunath
RANDOM and arrays are Bash features, and the OP is "interested [in] a more portable (non-GNU) version for use on Unix proper".
ashawley
Thanks @lhunath. The point about ls is well taken. I updated the question.
jhs
+2  A: 

I think Awk is a good tool for getting a random number. According to the Advanced Bash Guide, Awk's random number generator is a good replacement for $RANDOM.

Here's a version of your script that avoids Bash-isms and GNU tools.

#! /bin/sh

dir='some/directory'
n_files=`/bin/ls -1 "$dir" | wc -l | cut -f1`
rand_num=`awk "BEGIN{srand();print int($n_files * rand()) + 1;}"`
file=`/bin/ls -1 "$dir" | sed -ne "${rand_num}p"`
path=`cd "$dir" && echo "$PWD/$file"` # Converts to full path.
echo "The randomly-selected file is: $path"

It inherits the problem other answers have mentioned: it breaks if filenames contain newlines.
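The script above lists the directory twice, once to count and once to select. A single-pass variant is sketched here, assuming a POSIX awk: it keeps line number NR with probability 1/NR (reservoir sampling with a reservoir of one), so every line ends up equally likely without counting first. The name pick_random_line is made up for illustration, and the newline caveat still applies.

```shell
#!/bin/sh
# pick_random_line: print one uniformly chosen line of standard input.
# Hypothetical helper; prints nothing if the input is empty.
pick_random_line() {
    awk 'BEGIN { srand() }
         rand() * NR < 1 { pick = $0 }   # replace the kept line with probability 1/NR
         END { if (NR > 0) print pick }'
}
```

Usage would look like: ls -1 "$dir" | pick_random_line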

ashawley
That's a great idea. You have to scan the directory twice and there is a race condition if the number of files changes in between scans, but in practice that's probably not a big deal.
jhs
Yeah, I'm convinced that traditional Bourne shell programming is fundamentally flawed for many situations regardless of one's best efforts. Enter Bash and GNU coreutils to save the day.
ashawley