My data is in /tmp/1:

9367543
9105616
9108177
8948074
8860323
9170406
9105616

I run the following and get no output:

cat /tmp/1 | uniq -d

This is strange, since the manual says of uniq -d:

-d      Only output lines that are repeated in the input.

How can you use uniq -d?

+1  A: 

Try this to double-check; it prints each line that is duplicated:

  cat /tmp/1 | awk 'seen[$0]++ == 1'

Here is your problem: uniq only compares adjacent lines, so the input has to be sorted first:

  cat /tmp/1 | sort | uniq -d

Sort it before running uniq!

Sean A.O. Harney
no need to use cat.
ghostdog74
Lines 2 and 7 of Masi's sample file are the same. But they're not on consecutive lines, which appears to be the heart of the misunderstanding.
dave
ghostdog, I am using cat because the OP did too. Yes, I am aware I could use shell redirection instead, or pass the file as a command-line argument to awk or sort. dave, thanks. I didn't see that one! Edited.
Sean A.O. Harney
+3  A: 

You have to sort your data before you use uniq. It only removes/detects duplicates on adjacent lines.

dave
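As a quick check of the adjacency point above, here is a minimal sketch that recreates the sample data from the question in a temporary file and compares the unsorted and sorted runs. The temporary file and the variable names `unsorted_dups` and `sorted_dups` are assumptions made for illustration; they do not appear in the original question.

```shell
# Recreate the question's sample data in a temporary file (illustrative path).
tmpfile=$(mktemp)
printf '%s\n' 9367543 9105616 9108177 8948074 8860323 9170406 9105616 > "$tmpfile"

# Unsorted: the two 9105616 lines are not adjacent, so uniq -d prints nothing.
unsorted_dups=$(uniq -d "$tmpfile")

# Sorted: identical lines become adjacent, so uniq -d reports the duplicate.
sorted_dups=$(sort "$tmpfile" | uniq -d)

echo "unsorted: '$unsorted_dups'"
echo "sorted:   '$sorted_dups'"

rm -f "$tmpfile"
```

Passing the file name directly to uniq and sort also avoids the extra cat mentioned in the comments.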
Or use an awk script to do the job properly?
Douglas Leeder
Thank you for pointing that out! --- It even says in the manual `The uniq utility reads the specified input_file comparing adjacent lines - -.`
Masi
With my GNU coreutils uniq the manual says: Discard all but one of successive identical lines from INPUT (or standard input), writing to OUTPUT (or standard output).
Sean A.O. Harney
A: 
awk '{_[$0]++}END{for(i in _)if(_[i]>1) print i}' /tmp/1

or just

awk '_[$0]++ == 1' file
ghostdog74
awk '_[$0]++' only works if each duplicated line has at most one duplicate. If three rows were the same, it would print that line twice.
Sean A.O. Harney
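The difference between the two awk conditions discussed above can be seen on a small input where one line occurs three times. This is only a sketch: the input values and the variable names `input`, `every_repeat`, and `once_each` are made up for illustration.

```shell
# Hypothetical input: "a" appears three times, "b" once.
input=$(printf '%s\n' a b a a)

# Bare counter: true on every occurrence after the first,
# so a line seen three times is printed twice.
every_repeat=$(printf '%s\n' "$input" | awk 'seen[$0]++')

# Counter compared with 1: true only on the second occurrence,
# so each duplicated line is printed exactly once.
once_each=$(printf '%s\n' "$input" | awk 'seen[$0]++ == 1')

echo "every repeat:"
echo "$every_repeat"
echo "once each:"
echo "$once_each"
```

Unlike sort | uniq -d, both awk forms keep the input order and need no sort, at the cost of holding every distinct line in memory.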