Probably a simple question but...

How does piping work? If I run a program via CLI and redirect output to a file will I be able to pipe that file into another program as it is being written?

Basically, when one line is written to the file I would like it to be piped immediately to my second application (I am trying to dynamically draw a graph off an existing program). I'm just unsure whether a pipe completes the first command before moving on to the next command.

Any feedback would be greatly appreciated!

+11  A: 

If you want to redirect the output of one program into the input of another, just use a simple pipeline:

program1 arg arg | program2 arg arg

If you want to save the output of program1 into a file and pipe it into program2, you can use tee(1):

program1 arg arg | tee output-file | program2 arg arg
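As a rough sketch of the tee pipeline above, here printf and wc stand in for program1 and program2 (they are placeholders chosen for illustration, not the asker's programs), and the output file is a temporary file:

```shell
# Stand-ins: printf plays program1, "wc -l" plays program2.
outfile=$(mktemp)

# tee copies everything it reads to "$outfile" and also passes it down the pipe.
count=$(printf 'one\ntwo\nthree\n' | tee "$outfile" | wc -l | tr -d ' ')

echo "program2 saw $count lines"
echo "the file holds $(wc -l < "$outfile" | tr -d ' ') lines"
```

Both the saved file and the downstream program end up with the same data.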

All programs in a pipeline run simultaneously. Most programs use blocking I/O: if they try to read their input and nothing is there, they block. That is, they stop, and the operating system de-schedules them until more input becomes available (to avoid eating up the CPU). Similarly, if a program earlier in the pipeline writes data faster than a later program can read it, the pipe's buffer eventually fills up and the writer blocks: the OS de-schedules it until the reader drains the pipe's buffer, and then it can continue writing.
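One way to see that both sides run at the same time is a small experiment. The producer and consumer functions below are hypothetical stand-ins for program1 and program2: the producer emits a line and pauses, and the consumer stamps each line as it arrives.

```shell
# producer: hypothetical stand-in for program1; emits a line, pauses, repeats.
producer() {
    for i in 1 2 3; do
        echo "sample $i"
        sleep 1
    done
}

# consumer: hypothetical stand-in for program2; blocks in read until a line
# is available, then prints it with the current time.
consumer() {
    while IFS= read -r line; do
        echo "$(date +%T) received: $line"
    done
}

producer | consumer
```

If the first command had to finish before the second started, all three timestamps would be identical; run as a pipeline, they should come out roughly a second apart.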


EDIT

If you want to use the output of program1 as command-line parameters for program2, you can use backquotes or the $() syntax:

# Runs "program1 arg", and uses the output as the command-line arguments for
# program2
program2 `program1 arg`

# Same as above
program2 $(program1 arg)

The $() syntax should be preferred, since it is clearer and can be nested.
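A small sketch of what nesting looks like, using dirname on a made-up path (the path is only an example and does not need to exist):

```shell
# The inner $(...) runs first; its output becomes the argument
# of the outer dirname call.
parent=$(dirname "$(dirname /usr/local/bin/tool)")
echo "$parent"    # prints /usr/local
```

Writing the same thing with backquotes would require escaping the inner pair with backslashes, which is where the readability advantage comes from.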

Adam Rosenfield
For some reason the initial program (program1 in this case) does not cooperate with piping. Is there special code needed inside of it? I tried to do a simple pipe as you had laid out and 1) CLI interface for program1 did not start, and 2) program2 asked for parameters (though I am trying to pass the output of 1 as the parameters for 2)
Great Post though =P very informative
@Jon, it sounds like program1 isn't sending its output to stdout or program2 isn't reading from stdin. What programs are you trying to run?
Matthew Crumley
There is a flag that some programs can read to find out if they are in a pipe or not... 'ls' does this to decide how to display file information (when in a pipe, it has each file on a line of its own etc.)
Ape-inago
@Matt, I used the example laid out in the man page for pipe and echo. I assumed this would take an input and then print it twice or at the very least once. Not so. My CLI looks like $./test "testing" | echo. Test is basically just a program that puts a string in a pipe and pulls it out character by character.
@Jon: echo doesn't read its stdin, it only looks at its command line parameters. What exactly are you trying to do?
Adam Rosenfield
@Adam: The test pipe I wrote above was just me trying to learn Linux a little better. The original program has too many Linux-based libraries to try to do anything with it on Windows. The original program itself is a test program which takes a series of inputs to determine the appropriate output (which changes randomly and dynamically). The output travels through a network (or in my case to another terminal) where it is printed. My task is to write a program which takes that input and makes it "pretty", such as a graph. That's the basic idea at least.
I apologize for the poor formatting =/
+3  A: 

If your programs are communicating using stdin and stdout, then make sure that you either call fflush(stdout) after you write or find some way to disable standard I/O buffering. The best references that I can think of that really describe how to implement pipelines in C/C++ are Advanced Programming in the UNIX Environment and UNIX Network Programming: Volume 2. You could probably start with this article as well.

D.Shawley
+1 for mentioning Stevens' stuff...awesome books.
Harper Shelby
+3  A: 

Piping does not complete the first command before running the second. Unix (and Linux) pipelines run all commands concurrently. A command will be suspended if

  • It is starved for input.

  • It has produced significantly more output than its successor is ready to consume.

For most programs output is buffered: the C standard I/O library in the writing process accumulates a substantial amount of output (typically a few thousand characters) before handing it to the kernel, which passes it on to the next stage of the pipeline. This buffering avoids too much switching back and forth between processes and the kernel.

If you want output on a pipeline to be sent right away, you can flush the buffer explicitly, which in C means calling fflush(), or turn buffering off altogether with setvbuf(), so that output is sent on to the next process immediately. Unbuffered input is also possible but is generally unnecessary, because a process that is starved for input typically does not wait for a full buffer but processes whatever input it can get.

For typical applications unbuffered output is not recommended; you generally get the best performance with the defaults. In your case, however, where you want to do dynamic graphing as soon as the first process has the info available, you definitely want to be using unbuffered output. If you're using C, calling fflush(stdout) whenever you want output sent will be sufficient.
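If you cannot change the source of the first program, one possible workaround from the shell is stdbuf from GNU coreutils, which asks the C library to use a different buffering mode. This is only a sketch: stdbuf is not available everywhere, it has no effect on programs that manage their own buffering, and line_buffered below is a hypothetical helper name. Here grep stands in for the program whose output you want delivered line by line.

```shell
# line_buffered: hypothetical helper, not a standard command.
# Runs its arguments under "stdbuf -oL" (flush stdout after every line)
# when stdbuf is installed; otherwise runs them unchanged.
line_buffered() {
    if command -v stdbuf >/dev/null 2>&1; then
        stdbuf -oL "$@"
    else
        "$@"    # fallback: default buffering applies
    fi
}

# Stand-ins: printf is the data source, grep is the buffered stage,
# and cat plays the graphing program at the end of the pipeline.
printf 'x=1\nnoise\nx=2\n' | line_buffered grep 'x=' | cat
```

With input this short the result is the same either way; the difference only shows when the first program keeps running and produces lines slowly.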

Norman Ramsey
A: 

If your two programs insist on reading and writing to files and do not use stdin/stdout, you may find you can use a named pipe instead of a file.

Create a named pipe with the mknod(1) command:

$ mknod /tmp/named-pipe p

Then configure your programs to read and write to /tmp/named-pipe (use whatever path/name you feel is appropriate).

In this case, both programs will run in parallel, blocking as necessary when the pipe becomes full/empty as described in the other answers.
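A minimal sketch of the whole round trip, assuming a temporary directory rather than /tmp/named-pipe, and using mkfifo(1), the POSIX command that creates the same kind of named pipe. The printf and cat commands are stand-ins for the two programs that insist on file paths.

```shell
# Create the named pipe in a private temporary directory.
dir=$(mktemp -d)
fifo="$dir/named-pipe"
mkfifo "$fifo"

# Writer (stand-in for the first program). It runs in the background
# because opening a FIFO blocks until the other end is opened too.
printf 'reading 1\nreading 2\n' > "$fifo" &

# Reader (stand-in for the second program); sees the data as it is written.
received=$(cat "$fifo")
wait
echo "$received"

rm -r "$dir"
```

Nothing is stored on disk: the named pipe only gives the kernel pipe a path that both programs can open.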

camh