+2  Q: 

Sorting in bash

I have been trying to get the unique values in each column of a tab-delimited file in bash, so I used the following command:

cut -f <column_number> <filename> | sort | uniq -c

It works fine: I get each unique value in the column along with its count, like this:

105 Linux
55  MacOS
500 Windows

Instead of sorting by the column values (OS names in this example), I want to sort by count, preferably with the count in the second column. The output would then look like this:

Windows 500
Linux   105
MacOS   55

How do I do this?

+4  A: 

Use:

cut -f <col_num> <filename> |
    sort |
    uniq -c |
    sort -r -k1 -n |
    awk '{print $2" "$1}'

The `sort -r -k1 -n` stage sorts in reverse (descending) order, treating the first field as a number. The awk stage simply swaps the two columns. You can test the added pipeline commands like this (with nicer formatting):

pax> echo '105 Linux
55  MacOS
500 Windows' | sort -r -k1 -n | awk '{printf "%-10s %5d\n",$2,$1}'
Windows      500
Linux        105
MacOS         55
paxdiablo
I usually use `sort -k1,1` to sort only by the specified field; otherwise, the sort key runs from field 1 to the end of the line.
Hasturkun
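Putting the answer and the `-k1,1` comment together, the whole pipeline might be sketched as below. This is only an illustration: the function name `count_column` and the sample data are made up, and it assumes the column values contain no spaces (awk would otherwise print only the first word).

```shell
# Hypothetical helper: count the values in column $1 of tab-delimited
# input on stdin and print "value count", most frequent first.
count_column() {
    cut -f "$1" |
        sort |
        uniq -c |
        sort -k1,1nr |                 # key is field 1 only, numeric, descending
        awk '{ print $2 " " $1 }'      # swap the count and value columns
}

# Made-up sample: host name, a tab, then the OS name.
printf 'a\tWindows\nb\tLinux\nc\tWindows\nd\tMacOS\ne\tWindows\nf\tLinux\n' |
    count_column 2
# Prints:
# Windows 3
# Linux 2
# MacOS 1
```

Limiting the key with `-k1,1` means only the count decides the order; the `n` and `r` modifiers are attached to the key instead of being given as separate options.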
+1  A: 

Mine:

cut -f <column_number> <filename> | sort | uniq -c | awk '{ print $2" "$1}' | sort

This will alter the column order (awk) and then just sort the output.

Hope this helps.

SourceRebels
That sorts by name rather than count.
Dennis Williamson
Sure, from sfactor's question: "What I want to do is instead of sorting by the column value names"
SourceRebels
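For comparison, if the columns are swapped first, as in the second answer, the final sort would need to key on the second field numerically in order to sort by count. A minimal sketch, assuming values without spaces; `swap_then_sort` is a hypothetical name and the input mimics the `uniq -c` output from the question:

```shell
# Hypothetical helper: takes "count value" lines on stdin, swaps the
# columns, then orders by the count, which is now field 2.
swap_then_sort() {
    awk '{ print $2 " " $1 }' |
        sort -k2,2nr                   # field 2 only, numeric, descending
}

printf '105 Linux\n55 MacOS\n500 Windows\n' | swap_then_sort
# Prints:
# Windows 500
# Linux 105
# MacOS 55
```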