In an answer to a post on superuser.com, we see that

join <(sort abc) <(sort bcd)

will sort files abc and bcd before sending them to join. This leads to a programming question, better suited for stackoverflow.

How does this work? What exactly is this <() construct? What's it called?

If (sort abc) is a legal call that runs sort on abc and returns output, why do we need the <?

That is, the following two lines are equivalent

(sort abc) | join - <(sort bcd)
join <(sort abc) <(sort bcd)

but

join (sort abc) (sort bcd)

is a syntax error. Please clue me in!

+14  A: 

This is called process substitution.

<(list) is a single syntactic construct; the '<' character is not a separate symbol in this case. It executes list and provides its output to the command as a sort of file (not as a standard redirection).

It is equivalent to running (except it uses pipes instead of temporary files when possible):

sort abc > /tmp/1
sort bcd > /tmp/2
join /tmp/1 /tmp/2

Note that the outputs of both sorts are provided to join as file names, not through standard redirections.
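A quick way to see the file name that the command receives is to hand the construct to echo, which just prints its arguments. This is a minimal sketch assuming bash; the variable names arg and contents are only for illustration, and the exact path printed depends on the system.

```shell
#!/usr/bin/env bash
# echo prints its arguments, which reveals what <(...) expands to.
# The path (often something under /dev/fd) varies by system.
arg=$(echo <(true))
echo "the command would receive: $arg"

# The path can be opened and read like a file.
contents=$(cat <(printf 'b\na\n' | sort))
echo "$contents"
```

The first echo shows a path rather than any output of the inner command, which is why join can treat each <(sort ...) as an ordinary file argument.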

(list) is a different construct with a different purpose. It simply creates a subshell that executes list; the subshell shares the parent shell's standard descriptors, so its output is a stream rather than a file name.
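To make the difference concrete, here is a rough sketch comparing the two forms from the question, assuming bash. The file names abc and bcd come from the question; their contents and the temporary directory are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sample data in a scratch directory.
dir=$(mktemp -d)
printf '2 two\n1 one\n' > "$dir/abc"
printf '1 uno\n2 dos\n' > "$dir/bcd"

# A subshell's output is a stream, so it must reach join through a pipe
# ("-" tells join to read that input from stdin).
piped=$( (sort "$dir/abc") | join - <(sort "$dir/bcd") )

# Process substitution hands join two file names instead.
substituted=$(join <(sort "$dir/abc") <(sort "$dir/bcd"))

[ "$piped" = "$substituted" ] && echo "same output"
echo "$substituted"

rm -r "$dir"
```

Since join expects file names as arguments, a bare (sort abc) in argument position has nothing to offer it, whereas <(sort abc) expands to a name join can open.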

See the Process Substitution section of the bash manual.

Juliano
+4  A: 

<(command) is process substitution (see the corresponding section in man bash). Basically, command is run with its output connected to a pipe (a /dev/fd entry where the system supports it, otherwise a named pipe, or FIFO), and the whole construct is then replaced by the name of that pipe, resulting in join /dev/fd/x /dev/fd/y.
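The named-pipe mechanism can be reproduced by hand with mkfifo, which may help show what the shell is doing for you. This is only a sketch: the sample data, the scratch directory, and the pipe names p1 and p2 are made up, and the shell picks its own names (or uses /dev/fd entries) when it does this automatically.

```shell
# Hypothetical sample data; abc and bcd are the file names from the question.
dir=$(mktemp -d)
printf '2 two\n1 one\n' > "$dir/abc"
printf '1 uno\n2 dos\n' > "$dir/bcd"

# Create two named pipes and start a writer for each in the background.
mkfifo "$dir/p1" "$dir/p2"
sort "$dir/abc" > "$dir/p1" &
sort "$dir/bcd" > "$dir/p2" &

# join opens the pipes by name, just as it would open regular files.
result=$(join "$dir/p1" "$dir/p2")
echo "$result"

wait          # reap the background sort processes
rm -r "$dir"
```

Unlike the temporary-file approach, nothing is written to disk here: each sort blocks until join opens the other end of its pipe.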

Bombe