What is the difference between the following two commands?
sort -u FILE
sort FILE | uniq
Thanks!
sort -u will be slightly faster, because it does not need to pipe the output between two commands.
Also see my question on the topic: calling uniq and sort in different orders in shell
Using 'sort -u' does less I/O than 'sort | uniq', but the net result is the same. In particular, if the file is big enough that sort has to create intermediate files, there's a decent chance that 'sort -u' will use slightly fewer or slightly smaller intermediate files, because it can eliminate duplicates while it is sorting each set. If the data is highly duplicative, this could be beneficial; if there are in fact few duplicates, it won't make much difference (definitely a second-order performance effect, compared to the first-order effect of the pipe).
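A quick way to convince yourself that the net result is the same is to capture both outputs and compare them. This is only a sketch: the temporary directory, the file name fruits.txt and its contents are made up for illustration.

```shell
# Hypothetical sample data; the file name and contents are invented for this demo.
tmp=$(mktemp -d)
printf '%s\n' pear apple pear fig apple > "$tmp/fruits.txt"

# Capture the output of each variant.
a=$(sort -u "$tmp/fruits.txt")
b=$(sort "$tmp/fruits.txt" | uniq)

# Compare the two results.
if [ "$a" = "$b" ]; then
    echo "same output"
else
    echo "outputs differ"
fi

rm -rf "$tmp"
```

For this input both variables should hold the three lines apple, fig and pear.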
Note that there are times when the piping is appropriate. For example:
sort FILE | uniq -c | sort -n
This sorts the file into order of the number of occurrences of each line in the file, with the most repeated lines appearing last. (It wouldn't surprise me to find that this combination, which is idiomatic for Unix or POSIX, can be squished into one complex 'sort' command with GNU sort.)
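To see what that pipeline produces, here is a small run on invented data (the file letters.txt is an assumption, not something from the question). The exact padding in front of each count depends on the uniq implementation, so treat the layout as approximate.

```shell
# Hypothetical sample data: 'b' appears 3 times, 'a' twice, 'c' once.
tmp=$(mktemp -d)
printf '%s\n' b a b c b a > "$tmp/letters.txt"

# Group identical lines, count each group, then order by the count.
counts=$(sort "$tmp/letters.txt" | uniq -c | sort -n)
printf '%s\n' "$counts"

rm -rf "$tmp"
```

The line for c (count 1) should come first and the line for b (count 3) last, which matches "most repeated lines appearing last".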
There are times when not using the pipe is important. For example:
sort -u -o FILE FILE
This sorts the file 'in situ'; that is, the output file is specified by -o FILE, and this operation is guaranteed safe (the file is read before being overwritten for output).
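As a rough illustration of why -o matters here, the sketch below sorts one made-up file in place with -o and, for contrast, tries the same thing on a second file with a plain shell redirection. The file names are hypothetical. The redirection case relies on the shell truncating the output file before sort starts reading it.

```shell
# Hypothetical sample files, created only for this demo.
tmp=$(mktemp -d)
printf '%s\n' 3 1 2 3 1 > "$tmp/data.txt"
printf '%s\n' 3 1 2 3 1 > "$tmp/clobbered.txt"

# Safe: sort reads all of its input before it opens the -o file for writing.
sort -u -o "$tmp/data.txt" "$tmp/data.txt"
result=$(cat "$tmp/data.txt")

# Unsafe: the shell empties the file before sort gets to read it.
sort -u "$tmp/clobbered.txt" > "$tmp/clobbered.txt"
if [ -s "$tmp/clobbered.txt" ]; then
    clobbered=no
else
    clobbered=yes
fi

printf 'in-place result:\n%s\n' "$result"
echo "redirected file emptied: $clobbered"

rm -rf "$tmp"
```

The first file should end up containing 1, 2 and 3 on separate lines, while the redirected one ends up empty.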
There is one slight difference: the return code.
Unless the pipefail shell option is set (for example with set -o pipefail in bash), the return code of a pipeline is the return code of its last command. And uniq, reading an empty pipe, returns zero (success) even when sort has failed. Try examining the exit code, and you'll see something like this (pipefail is not set here):
pavel@lonely ~ $ sort -u file_that_doesnt_exist ; echo $?
sort: open failed: file_that_doesnt_exist: No such file or directory
2
pavel@lonely ~ $ sort file_that_doesnt_exist | uniq ; echo $?
sort: open failed: file_that_doesnt_exist: No such file or directory
0
Other than this, the commands are equivalent.
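If you want to check this in a script rather than at the prompt, something along these lines should work. It is a sketch that assumes bash is installed (pipefail is toggled inside explicit bash -c calls so the result does not depend on the calling shell) and that the path in $missing, which is made up, really does not exist.

```shell
# Hypothetical path, assumed not to exist on this machine.
missing=/nonexistent/file_that_doesnt_exist

# Single command: the failure of sort is reported directly.
status_single=0
sort -u "$missing" 2>/dev/null || status_single=$?

# Pipeline without pipefail: only the status of uniq is reported.
status_plain=0
bash -c 'set +o pipefail; sort "$1" 2>/dev/null | uniq' _ "$missing" || status_plain=$?

# Pipeline with pipefail: the failure of sort is reported again.
status_pipefail=0
bash -c 'set -o pipefail; sort "$1" 2>/dev/null | uniq' _ "$missing" || status_pipefail=$?

echo "sort -u: $status_single"
echo "sort | uniq: $status_plain"
echo "sort | uniq with pipefail: $status_pipefail"
```

The first and third statuses should be non-zero (the exact value depends on the sort implementation) and the second should be 0.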
I have worked on some servers where sort doesn't support the '-u' option. There we have to use
sort xyz | uniq