I have a for loop that I pass through ssh to a server that is formatted to look kind of like this...

i=0
for cmd in "${cmd_list[@]}"; do
    ${cmd} | sed "s/^/OUTPUT_${i}_: /" &
    (( i++ ))
done
wait

The idea is that the for loop runs a list of commands I give it and pipes each one to sed, which prepends every line of output with a command number (0, 1, 2, 3, etc.). Each pipeline is then backgrounded to allow parallel execution. This lets me keep track of which command each line of output came from, since the data can come back simultaneously and all mixed up. This works really well. Depending on when each command prints its output and when it completes, the output might look something like this...

OUTPUT_0_: some_data_string_from_a_command
OUTPUT_2_: some_data_string_from_a_command
OUTPUT_0_: some_data_string_from_a_command
OUTPUT_3_: some_data_string_from_a_command
OUTPUT_1_: some_data_string_from_a_command
OUTPUT_1_: some_data_string_from_a_command

However, what I really want to do is this...

do ${cmd} 2>&1 | sed "s/^/OUTPUT_${i}_${PIPESTATUS[0]}: /" &

So I can get back this...

OUTPUT_0_0: some_data_string_from_a_command
OUTPUT_2_1: some_error_message_from_a_command
OUTPUT_0_0: some_data_string_from_a_command
OUTPUT_3_1: some_error_message_from_a_command
OUTPUT_1_0: some_data_string_from_a_command
OUTPUT_1_0: some_data_string_from_a_command

This works fine for the first command if it errors out: I will usually get the non-zero exit code from ${PIPESTATUS[0]}. However, when I purposely plant commands further along in the list that I know will fail (e.g. cat /tmp/some_non_existent_file), PIPESTATUS does not give me the proper exit code of the command in the pipe chain. I sometimes get 0 instead of whatever the real exit code is.

Any idea why this is?

+1  A: 

Commands in a pipeline are executed in parallel, which means they may not have exited by the time you evaluate PIPESTATUS. More importantly, PIPESTATUS is expanded by the shell before the sed command is actually run, so at that point it still holds the status of the previously executed pipeline, not the current one.
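A quick demonstration of that expansion order (the file name below is hypothetical): the value sed receives comes from whatever pipeline ran before, not from the cat in the same pipeline.

```shell
#!/bin/bash
# Hypothetical demo: ${PIPESTATUS[0]} in the sed argument is expanded
# while the pipeline is being set up, before cat or sed has run, so it
# still holds the exit status of the PREVIOUS pipeline.
(exit 7) | true    # previous pipeline: PIPESTATUS is now (7 0)
tagged=$(cat /tmp/no_such_file_xyz 2>&1 | sed "s/^/STATUS_${PIPESTATUS[0]}: /")
echo "$tagged"     # the line is tagged STATUS_7 -- from the earlier
                   # pipeline, not from cat's own failure
```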

I don't really understand what it is you are trying to do, because you appear to want to violate causality. Have you thought about it carefully? You want the exit status of a command alongside its output, but if it is still outputting data it clearly hasn't exited yet. You are relying on a small timing window always being open. It may be for some commands with short run times, but even that is questionable, since coordinating parallel execution without synchronisation operations is prone to "random" failure.

If you want to capture the output along with the exit status of a command, you'll need to save the output somewhere (in a file, for instance) and then, once the process has exited, emit the output together with its exit status.
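One way that could look in bash (a minimal sketch; the run_tagged helper and the temp-file layout are assumptions, not the asker's actual script): each command writes to its own temp file, and only after it exits is the output tagged with the command number and the real exit status.

```shell
#!/bin/bash
# Sketch of the file-based approach. run_tagged and the per-command
# temp files are hypothetical names introduced for illustration.
run_tagged() {
    local tmpdir cmd rc i=0
    tmpdir=$(mktemp -d)
    for cmd in "$@"; do
        (
            # Capture stdout and stderr to a per-command file...
            ${cmd} >"${tmpdir}/out_${i}" 2>&1
            rc=$?
            # ...then, now that the exit status is known, tag each line
            # with the command number AND the real exit code.
            sed "s/^/OUTPUT_${i}_${rc}: /" "${tmpdir}/out_${i}"
        ) &
        (( i++ ))
    done
    wait
    rm -rf "$tmpdir"
}

run_tagged "echo some_data" "cat /tmp/some_non_existent_file"
```

Lines from different commands may still interleave, but each line carries its command's true exit status, at the cost of the disk I/O the asker was hoping to avoid.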

camh
That is what I was missing... commands in a pipeline being executed in parallel. Okay, that's sufficient. I was trying to avoid disk I/O, but as you mention, it looks like the best way to go is to use a file...
Jonathan