ansaurus

Question

In a shell (bash), how can I execute more than one command in a pipeline?

Answer 1

A:

I think the bigger question is: what are you expecting the output to be?

If you're trying to do two things, then do two things:

awk '$13 ~ /type/ {print $15}' filename.txt > tempfile
wc -l < tempfile
sort -u < tempfile
rm tempfile
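A temporary file with a fixed name in the current directory can collide with a parallel run, or fail if the directory is not writable. The following is a minimal sketch of the same steps using a private scratch directory. It assumes `mktemp` is available (it is not specified by POSIX), and the sample input lines are made up for illustration.

```shell
# Private scratch directory, removed automatically when the script exits.
workdir=$(mktemp -d) || exit 1
trap 'rm -rf "$workdir"' EXIT

# Hypothetical sample input: field 13 holds the type, field 15 the value.
input=$workdir/filename.txt
printf '%s\n' \
  'a b c d e f g h i j k l type1 n pear' \
  'a b c d e f g h i j k l other n plum' \
  'a b c d e f g h i j k l type2 n apple' \
  'a b c d e f g h i j k l type3 n pear' > "$input"

# Run awk once and keep its output in the scratch directory.
tempfile=$workdir/matches
awk '$13 ~ /type/ {print $15}' "$input" > "$tempfile"

count=$(wc -l < "$tempfile")     # number of matching lines
unique=$(sort -u < "$tempfile")  # distinct values, sorted

echo "$count"
echo "$unique"
```

The `trap` replaces the explicit `rm tempfile`, so the scratch files are also cleaned up if a later step fails.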

Stargazer712 2010-08-03 15:32:10

I expect my output to be two or more lines: one line with a line count, and one or more lines with the sorted results of the awk command. I was wondering whether it was even possible to do this on one line, because otherwise I would have to repeat the awk command or assign its result to a variable.

Lex 2010-08-03 15:34:41

See my edit to see how to do it without repeating awk. In general, even if you find a way to do what you are trying to do, don't do it. You should never intentionally write confusing code.

Stargazer712 2010-08-03 15:36:15

I didn't find it confusing, I just thought it was more efficient doing it this way.

Lex 2010-08-03 15:40:41

But you did have to ask on StackOverflow.com how to do it. Clarity is always superior to obscurity.

Stargazer712 2010-08-03 15:42:23

Oh, sorry, I didn't realize that, you're right, next time I will for sure. :)

Lex 2010-08-03 15:47:26

'cat tempfile > sort -u' is equivalent to 'cat tempfile -u > sort'. You probably meant to use '|', in which case you have UUOC. Just do 'sort -u < tempfile'.

William Pursell 2010-08-03 16:34:10

@William Pursell: Both good points. Edited answer as suggested.

Stargazer712 2010-08-03 16:49:07

@Stargazer712: there's no reason to use a temporary file unless the output is too big to fit in memory; that's what `$(...)` is for. Also, creating a temporary file is a nontrivial problem: what if the current directory is not writable? What if two instances of the script are running in parallel? etc.

Gilles 2010-08-03 22:41:25

@Gilles: In that case, I would write a custom script or program to perform the actions he is attempting to perform. Millions of man hours are wasted every year trying to decipher confusing code. Clarity is **always** superior to obscurity.

Stargazer712 2010-08-04 05:40:01

Answer 2

A:

You want to use named pipes created with mkfifo in combination with tee. An example is at http://www.softpanorama.org/Tools/tee.shtml
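The following is a minimal sketch of the named-pipe idea, not the example from the linked page. It assumes `mktemp` is available for the scratch directory; the sample input and the file names `lines.fifo` and `count` are hypothetical.

```shell
workdir=$(mktemp -d) || exit 1
trap 'rm -rf "$workdir"' EXIT

# Hypothetical sample input: field 13 holds the type, field 15 the value.
input=$workdir/filename.txt
printf '%s\n' \
  'a b c d e f g h i j k l type1 n pear' \
  'a b c d e f g h i j k l other n plum' \
  'a b c d e f g h i j k l type2 n apple' \
  'a b c d e f g h i j k l type3 n pear' > "$input"

fifo=$workdir/lines.fifo
mkfifo "$fifo"

# Reader: count the lines that arrive through the named pipe.
# It runs in the background and blocks until a writer opens the FIFO.
wc -l < "$fifo" > "$workdir/count" &

# Writer: tee copies awk's output into the FIFO and passes it on to sort.
unique=$(awk '$13 ~ /type/ {print $15}' "$input" | tee "$fifo" | sort -u)

wait  # let the background wc finish before reading its result
count=$(cat "$workdir/count")

echo "$count"
echo "$unique"
```

Here awk runs only once, and the two consumers read the same stream in parallel. The count is collected in a small file so that it cannot interleave with the sorted output.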

Greg Reynolds 2010-08-03 15:33:24

Answer 3

+4 A:

If you want to send output to two different commands in a single line, you'll need to use process substitution.

Try this:

awk '$13 ~ /type/ {print $15}' filename.txt | tee >(wc -l >&2) | sort -u

This outputs the line count on stderr and the sorted output on stdout. If you need the line count on stdout, you can leave off the >&2, but then it will be passed to the sort call and (most likely) sorted to the top of the output.

EDIT: corrected description of what happens based on further testing.

Walter Mundt 2010-08-03 15:35:37

It doesn't work :/

Lex 2010-08-03 15:43:29

@Lex: if bash is invoked as `sh`, it will go into compatibility mode, and process substitution won't work. Explicitly call the shell as `bash` and it should be fine.

goldPseudo 2010-08-03 15:48:00

This works but it's not so neat: awk '$13 ~ /type/ {print $15}' filename.txt | tee test.txt | sort -u ; cat test.txt | wc -l

Lex 2010-08-03 15:48:17

Anyway, voting this as accepted because it answers my question more precisely.

Lex 2010-08-03 15:51:14

It works for me, but be aware that this is a bash-specific feature. It will certainly not work on sh, or on bash-invoked-as-sh, and I don't know which other shells have it (e.g. ksh, zsh, etc.). It also doesn't work with busybox's shell, which is used on many installers and embedded devices masquerading as bash.

Walter Mundt 2010-08-03 15:56:08

@Walter: it works in ksh93, bash and zsh, but not in more basic shells such as pdksh, ash or busybox.

Gilles 2010-08-03 22:44:06

Answer 4

+3 A:

In that case, do your counting in awk. Why the need for extra pipes? Don't make it more complicated.

awk '$13 ~ /type/ {print $15;c++}END{print c} ' filename.txt | sort -u
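Note that in the one-liner above the count printed by END also passes through sort -u, so it ends up sorted among the values. The following is a hedged variant that keeps the two apart by sending the count to stderr. It assumes an awk that accepts the file name "/dev/stderr" (gawk and mawk do); the sample input and the `count` file are made up for illustration.

```shell
workdir=$(mktemp -d) || exit 1
trap 'rm -rf "$workdir"' EXIT

# Hypothetical sample input: field 13 holds the type, field 15 the value.
input=$workdir/filename.txt
printf '%s\n' \
  'a b c d e f g h i j k l type1 n pear' \
  'a b c d e f g h i j k l other n plum' \
  'a b c d e f g h i j k l type2 n apple' \
  'a b c d e f g h i j k l type3 n pear' > "$input"

# Values go to stdout (and on to sort); the count goes to stderr,
# which is captured in a file here so it can be inspected afterwards.
# c+0 prints 0 instead of an empty line when nothing matches.
unique=$(awk '$13 ~ /type/ {print $15; c++}
              END {print c+0 > "/dev/stderr"}' "$input" \
           2> "$workdir/count" | sort -u)
count=$(cat "$workdir/count")

echo "$count"
echo "$unique"
```

In interactive use the `2>` redirection can be dropped, in which case the count simply appears on the terminal via stderr.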

ghostdog74 2010-08-03 15:50:56

That's another way to solve it, thanks!

Lex 2010-08-03 15:52:33

Answer 5

+1 A:

If the output is not too large to fit in memory and you don't need the wc and sort commands to run in parallel for performance reasons, here's a relatively simple solution:

output=$(awk '$13 ~ /type/ {print $15}' filename.txt; echo a)
printf "%s" "${output%a}" | sort -u
printf "%s" "${output%a}" | wc -l

The complication with the extra a is in case the awk command prints some empty lines at the end of its output, which the $() construct would strip along with every other trailing newline. You can easily choose which of sort or wc should appear first.
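To see what the sentinel character buys, here is a small sketch with made-up data that ends in two empty lines. The variable names are hypothetical.

```shell
# Command substitution strips all trailing newlines, so the two empty
# lines and the newline after "apple" are lost.
plain=$(printf 'pear\napple\n\n\n')
plain_count=$(printf '%s' "$plain" | wc -l)

# With the sentinel, the data's newlines are no longer at the very end.
# Only the newline that echo adds after "a" is stripped, and the "a"
# itself is removed again with ${guarded%a}.
guarded=$(printf 'pear\napple\n\n\n'; echo a)
guarded_count=$(printf '%s' "${guarded%a}" | wc -l)

echo "$plain_count"    # newlines that survived without the sentinel
echo "$guarded_count"  # newlines preserved with the sentinel
```

Without the sentinel, wc -l sees a single newline; with it, all four newlines of the original output are counted.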

Here's a way that works with any POSIX shell (ash, bash, ksh, zsh, ...) but only on systems that have /dev/fd (which includes reasonably recent Linux, *BSD and Solaris). Like Walter's similar construction using the simpler method available in bash, ksh93 and zsh, the output of wc and the output of sort may be intermixed.

{
  awk '$13 ~ /type/ {print $15}' filename.txt |
  tee /dev/fd/3 |   # the copy goes to fd 3, i.e. the pipe into sort
  wc -l             # the count goes to stderr because of 1>&2 below
} 3>&1 1>&2 | sort -u

If you both need to deal with intermediate output that doesn't comfortably fit in memory and don't want to have the output of the two commands intermixed, I don't think there's an easy way in a POSIX shell, though it should be doable with ksh or zsh coprocesses.

Gilles 2010-08-03 22:39:17
