ansaurus

Question

Answer 1

A:

Check the sort man page.

To sort the file below on the third field (area code):
Jim Alchin 212121 Seattle
Bill Gates 404404 Seattle
Steve Jobs 246810 Nevada
Scott Neally 212277 Los Angeles
$ sort -k 3,3 people.txt > sorted.txt

Sort in descending (reverse) numeric order:
$ sort -nr
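A minimal sketch of both commands run against the sample data above. The scratch directory and the variable names (workdir, ascending, descending) are only illustrative, and LC_ALL=C is set so the ordering does not depend on the local collation rules:

```shell
#!/bin/sh
# Write the sample data from the answer into a scratch file.
workdir=$(mktemp -d)
cat > "$workdir/people.txt" <<'EOF'
Jim Alchin 212121 Seattle
Bill Gates 404404 Seattle
Steve Jobs 246810 Nevada
Scott Neally 212277 Los Angeles
EOF

# -k 3,3 restricts the sort key to the third field only.
ascending=$(LC_ALL=C sort -k 3,3 "$workdir/people.txt")

# Appending "nr" to the key makes that field sort numerically, largest first.
descending=$(LC_ALL=C sort -k 3,3nr "$workdir/people.txt")

printf 'ascending:\n%s\n\ndescending:\n%s\n' "$ascending" "$descending"
```

Writing the key as 3,3 rather than just 3 matters: a bare -k 3 would use everything from the third field to the end of the line as the key.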

Brian 2009-06-28 12:41:06

Answer 2

A:

Using Perl:

perl -nle'$G{$2}=$1 if/(\w+) (\d+\.?\d*)G/;$M{$2}=$1 if/(\w+) (\d+\.?\d*)M/;$K{$2}=$1 if/(\w+) (\d+\.?\d*)K/;END{print"$G{$_} ${_}G"for sort{$b<=>$a}keys%G;print"$M{$_} ${_}M"for sort{$b<=>$a}keys%M;print"$K{$_} ${_}K"for sort{$b<=>$a}keys%K;}' filename

Here, filename is the file containing the data from the question. The one-liner handles the units G, M and K.

Another shorter implementation using eval:

perl -nle'/(\w+) (\d+\.?\d*)(\w)/;eval"\$\$3{$2} = $1";END{for$u(qw(G M K)){eval"print\"\$\$u{$_} $_$u\""for sort{$b<=>$a}keys%{$u}}}' filename

Alan Haggai Alavi 2009-06-28 12:43:30

I can't install Perl :(

GuleLim 2009-06-28 13:21:51

What type of system are you using?

Alan Haggai Alavi 2009-06-28 13:52:48

More seriously, the script does not handle M as a suffix.

Jonathan Leffler 2009-06-28 14:32:37

@Jonathan Leffler: The original post was edited after I posted the one-liner. Anyway, thanks for letting me know.

Alan Haggai Alavi 2009-06-28 15:02:54

@Alan: fair enough - changing questions are a problem.

Jonathan Leffler 2009-06-28 19:45:21

Answer 3

A:

sort -n -r -k 2,2 file.txt

The -k 2,2 means use the second field in the file as the sort field. By default sort uses whitespace to separate fields. This will not sort correctly if the suffixes on the fields (G in your example for gigabytes) differ, because -n compares only the leading number and ignores the unit.
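To see the suffix problem concretely, here is a small sketch with made-up data (the names a, b, c and the file sizes.txt are hypothetical, not from the question). Because -n reads only the leading digits, 500M ranks above 2G:

```shell
#!/bin/sh
# Hypothetical input with mixed unit suffixes.
workdir=$(mktemp -d)
cat > "$workdir/sizes.txt" <<'EOF'
a 2G
b 500M
c 10K
EOF

# Numeric, descending sort on field 2. The unit letters are ignored,
# so the keys compared are effectively 2, 500 and 10.
naive=$(LC_ALL=C sort -n -r -k 2,2 "$workdir/sizes.txt")
printf '%s\n' "$naive"
```

The 2G entry ends up last even though it is the largest, which is why the sizes need to be converted to a common unit before sorting.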

Matt Bridges 2009-06-28 12:46:03

Well, the suffixes on the fields are different :( I haven't written the full list.

GuleLim 2009-06-28 12:49:21

Answer 4

+1 A:

You need to normalize the size before sorting. The easiest way to do this would be to use a programming language like Perl or Python, but you have already stated that is not an option (although I find it odd that Perl isn't already on the machine). You can use shell code to normalize that data, but it is a pain in the tuckus:

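#!/bin/bash

ECHO=/bin/echo
TR=/usr/bin/tr
BC=/usr/bin/bc

# Read "name size" pairs and prepend each size converted to bytes.
while read -r dir size; do
    # Strip the unit letter, leaving only the number.
    bytes=$($ECHO "$size" | $TR -d "[A-Z]")
    case $size in
     *B) bytes=$bytes                                        ;;
     *K) bytes=$($ECHO "$bytes * 1024" | $BC)                ;;
     *M) bytes=$($ECHO "$bytes * 1024 * 1024" | $BC)         ;;
     *G) bytes=$($ECHO "$bytes * 1024 * 1024 * 1024" | $BC)  ;;
     # Report unknown units on stderr so they do not pollute the sortable output.
     *) $ECHO "unknown size type: $size" >&2                 ;;
    esac
    $ECHO "$bytes $dir $size"
done < "$1"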

This shell script accepts a filename as an argument and prints the normalized size, the directory name, and the original size. This makes it easy to sort. To get the original fields back, you can just cut off the first field:

./mk_sortable.sh file_to_sort | sort -nr | cut -f2- -d" "

For those paying attention, yes, I did just write a Schwartzian Transform in shell.
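For comparison, the same decorate, sort, undecorate idea can be sketched as one self-contained script. This version swaps the read/bc loop for awk, which is my substitution rather than anything from the answer; the sample data and the to_bytes helper name are hypothetical:

```shell
#!/bin/sh
# Hypothetical input: "<name> <size><unit>", one entry per line.
input=$(mktemp)
cat > "$input" <<'EOF'
logs 512K
home 1.5G
tmp 20M
cache 900B
EOF

# Decorate: prefix each line with its size in bytes.
to_bytes() {
    awk '{
        unit  = substr($2, length($2))         # last character: B, K, M or G
        value = substr($2, 1, length($2) - 1)  # everything before the unit
        mult  = 1
        if (unit == "K") mult = 1024
        else if (unit == "M") mult = 1024 * 1024
        else if (unit == "G") mult = 1024 * 1024 * 1024
        printf "%.0f %s\n", value * mult, $0
    }' "$1"
}

# Sort on the byte count (largest first), then cut the byte count off again.
ranked=$(to_bytes "$input" | LC_ALL=C sort -n -r -k 1,1 | cut -d ' ' -f 2-)
printf '%s\n' "$ranked"
```

Units other than K, M and G are treated as plain bytes here, so real input with other suffixes would need extra cases.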

Chas. Owens 2009-06-28 14:32:32

Answer 5

A:

Fundamentally, you have to dehumanize the numbers, sort on the dehumanized numbers, and then remove the dehumanized numbers from the output. While you probably can do it in one line (especially if you write a script to do it for you), I think it will need several lines to be comprehensible.

As noted by Drakosha, "How Can I Sort 'du -h' output by size" covers the issues quite nicely.

Jonathan Leffler 2009-06-28 14:38:23


Using sort to rank a column by its size