views: 818
answers: 7

Is there an idiomatic way to simulate Perl's diamond operator in bash? With the diamond operator,

script.sh | ...

reads stdin for its input and

script.sh file1 file2 | ...

reads file1 and file2 for its input.

One other constraint is that I want to use the stdin in script.sh for something else other than input to my own script. The below code does what I want for the file1 file2 ... case above, but not for data provided on stdin.

command - $@ <<EOF
some_code_for_first_argument_of_command_here
EOF

I'd prefer a Bash solution but any Unix shell is OK.

Edit: for clarification, here is the content of script.sh:

#!/bin/bash
command - $@ <<EOF
some_code_for_first_argument_of_command_here
EOF

I want this to work the way the diamond operator would work in Perl, but it only handles filenames-as-arguments right now.

Edit 2: I can't do anything that goes

cat XXX | command

because the stdin for command is not the user's data. The stdin for command is my data in the here-doc. I would like the user data to come in on the stdin of my script, but it can't be the stdin of the call to command inside my script.

+4  A: 

Kind of cheezy, but how about

cat file1 file2 | script.sh
Paul Tomblin
How is this cheezy? This is exactly what cat is for.
sigjuice
Thanks for the answer. The trouble is I don't want users of script.sh to need to do that. :)
Steven Huwig
Additionally I think it wouldn't work in my case, because I need stdin for this constant code in the here-doc.
Steven Huwig
+1  A: 

The Perl diamond operator essentially loops across all the command line arguments, treating each as a filename. It opens each file and reads it line-by-line. Here's some bash code that will do approximately the same thing.

for f in "$@"
do
   # Do something with $f, such as...
   cat $f | command1 | command2
   -or-
   command1 < $f
   -or-
   # Read $f line-by-line
   cat $f | while read line_from_f
   do
      # Do stuff with $line_from_f
   done
done
Barry Brown
I believe the OP already has code that does this part. Perl's diamond operator will also read in standard input line-by-line if no command-line arguments are given, which (I think) is what the OP is asking about.
Chris Lutz
"$*" is wrong, it is a single string even if there are multiple arguments. You mean "$@" instead.
ephemient
You're right. I can never remember the difference between $* and $@.
Barry Brown
note that piping to a read won't work right in pdksh, for whatever that's worth in a question about bash... :)
dannysauer
A: 

Also a little cheezy, but how about this:

if [[ $# -eq 0 ]]
then
  # read from stdin
else
  # read from $* (args)
fi

If you need to read and process line-by-line (which is likely) and don't want to copy/paste the same code twice (which is likely), define a function in your script and just pass the lines one-by-one to this function, and process them in said function.
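
For example, a minimal sketch of that function-based approach (process_line and its echo body are hypothetical placeholders):

process_line () {
  # do something with the line passed in as $1
  echo "processing: $1"
}

if [[ $# -eq 0 ]]
then
  # no args: read the script's own stdin
  while IFS= read -r line; do process_line "$line"; done
else
  # args: read each named file in turn
  cat "$@" | while IFS= read -r line; do process_line "$line"; done
fi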

Chris Lutz
+6  A: 

Sure, this is totally doable:

#!/bin/bash
cat $@ | some_command_goes_here

Users can then call your script with no arguments (or with '-') to read from stdin, or with multiple filenames, all of which will be read.

If you want to process the contents of those files (say, line-by-line), you could do something like this:

for line in $(cat $@); do
    echo "I read: $line"
done

Edit: Changed $* to $@ to handle spaces in filenames, thanks to a helpful comment.

Don Werve
Excellent use of cat to concatenate multiple files or stdin. I always forget about our wonderful cat-as-a-basic-text-editor trick
Chris Lutz
@Chris Lutz: Yeah, cat is all powerful, and it has a funny cousin (tac) that reads lines in reverse order, too. :)
Don Werve
This doesn't correctly handle files with spaces in their names.
Chas. Owens
@Chas: I'm not sure if there is a good way to handle that in Bash; all of the built-in glob operators seem to break in that case.
Don Werve
Use "$@", which will quote all the arguments as separate strings.
ephemient
@ephemient: I have been searching for that for a long, long time. Thanks!
Don Werve
the stdin for command *is not the user's data*. The stdin for command is my data in the here-doc. I would like the user data to come in on the stdin of my script, but it can't be the stdin of the call to command inside my script.
Steven Huwig
@Steven Huwig - that is not how Perl's diamond operator works. That's more like Perl reading from <DATA>.
Chris Lutz
for line in $(cat $@) has a couple of limitations. First, the combined length of the files must be smaller than the maximum command-line length. Second, you should quote $@ (as in "$@") so files containing any elements of $IFS (such as spaces) are respected.
dannysauer
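
Following up on the limitations noted in the comments above, a hedged sketch of a line-by-line version that copes with spaces in filenames and arbitrarily large input (the echo line is just the example text from the answer):

#!/bin/bash
# "$@" keeps filenames containing spaces intact; with no arguments,
# cat simply reads the script's stdin.
cat "$@" | while IFS= read -r line
do
    echo "I read: $line"
done
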
A: 

Why not use `cat $*` in the script? For example:

x=`cat $*`
echo $x
TrayMan
+1  A: 

You want to take the first argument and do something with it, and then either read from any files specified or stdin if no files?

Personally, I'd suggest using getopt so that arguments can be given with the "-a value" syntax to help disambiguate, but that's just me. Here's how I'd do it in bash without getopts:

firstarg=${1:?usage: $0 arg [file1 .. fileN]}
shift
typeset -a files
if [[ ${#@} -gt 0 ]]
then
  files=( "$@" )
else
  files=( "/dev/stdin" )
fi
for file in "${files[@]}"
do
  whatever_you_want < "$file"
done

The :? operator will die if there are no args specified, since you seem to want at least one arg either way. After grabbing that, shift the args over by one, and then either use the remaining args as your file list, or the bash special filehandle "/dev/stdin" if there were no other args.

I think that the "if no files are specified, use /dev/stdin - otherwise use the files on the command line" piece is probably what you're looking for, but the rest of the code is at least useful for context.

dannysauer
This suffers from the same problem as the other suggestions -- "whatever_you_want" already has stuff coming into its STDIN via a here-doc. The more I think about this the more I think I need named pipes or some other way to come up with an extra file descriptor.
Steven Huwig
So, what do you actually want to do with the file / stdin, if not sending it to a program? Replace the contents of the "for file in" loop with whatever you want to do - read the first line with var=$(head -1 $file), combine them with cat $file >> /tmp/file; etc.
dannysauer
+1  A: 

I am (like everyone else, it seems) a bit confused about exactly what the goal is here, so I'll give three possible answers that may cover what you actually want. First, the relatively simple goal of getting the script to read from either a list of files (supplied on the command line) or from its regular stdin:

if [ $# -gt 0 ]; then
  exec < <(cat "$@")
fi
# From this point on, the script's stdin is redirected from the files
# (if any) supplied on the command line

Note: the double-quoted use of $@ is the best way to avoid problems with funny characters (e.g. spaces) in filenames -- $* and unquoted $@ both mess this up. The <() trick I'm using here is a bash-only feature; it fires off cat in the background to feed data from files supplied on the command line, and then we use exec to replace the script's stdin with the output from cat.

...but that doesn't seem to be what you actually want. What you seem to really want is to pass the supplied filenames or the script's stdin as arguments to a command inside the script. This requires sort of the opposite process: converting the script's stdin into a file (actually a named pipe) whose name can be passed to the command. Like this:

if [[ $# -gt 0 ]]; then
  command "$@" <<EOF
here-doc goes here
EOF
else
  command <(cat) <<EOF
here-doc goes here
EOF
fi

This uses <() to launder the script's stdin through cat to a named pipe, which is then passed to command as an argument. Meanwhile, command's stdin is taken from the here-doc.

Now, I think that's what you want to do, but it's not quite what you've asked for, which is to both redirect the script's stdin from the supplied files and pass stdin to the command inside the script. This can be done by combining the above techniques:

if [ $# -gt 0 ]; then
  exec < <(cat "$@")
fi

command <(cat) <<EOF
here-doc goes here
EOF

...although I can't think why you'd actually want to do this.

Gordon Davisson
Thanks for responding. I think #2 is what I want. The reason being that command will always be given the same data for the first filename argument, but the other filename arguments will be user data. I don't want to distribute two files (the script and the data for the first filename argument).
Steven Huwig
And I also want them to be able to use this as a standard shell pipeline component if they want, thus leading to the stdin quandary.
Steven Huwig
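
Given that clarification, a hedged sketch of how option #2 might be folded back into the script from the question (command, the '-' argument, and the here-doc body are the placeholders from the original post):

#!/bin/bash
if [[ $# -gt 0 ]]; then
  # Filenames supplied: pass them straight through; command still
  # reads the constant code from the here-doc via '-'.
  command - "$@" <<EOF
some_code_for_first_argument_of_command_here
EOF
else
  # No filenames: <(cat) turns the script's own stdin into a named
  # pipe, so piped-in user data arrives as a filename argument while
  # command's stdin remains the here-doc.
  command - <(cat) <<EOF
some_code_for_first_argument_of_command_here
EOF
fi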