Hi everyone. I need your help. Let me tell you what my problem is. I have a text file as follows:

Music 3.6G
Other 254.5M
Videos 4.6G
Games 1.3G
Apps 10.1G

As you can see, the file has two columns: directory names and their corresponding sizes.

What I want to do is sort this file by directory size in decreasing order, like this:

Apps 10.1G
Videos 4.6G
Music 3.6G
Games 1.3G
Other 254.5M

Is there a way to achieve this? Is there a one-liner command for this?

THANK YOU.

A: 

Check the sort man page.

To sort the file below on the third field (area code):
Jim Alchin 212121 Seattle
Bill Gates 404404 Seattle
Steve Jobs 246810 Nevada
Scott Neally 212277 Los Angeles
$ sort -k 3,3 people.txt > sorted.txt

Sort in descending (reverse) numeric order:
$ sort -nr
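
The two options can be combined. The following is only a sketch on the sample data above, written to a temporary file; the variable names are made up for illustration.

```shell
# Sample data from the example above, written to a temporary file.
people=$(mktemp)
printf '%s\n' \
  'Jim Alchin 212121 Seattle' \
  'Bill Gates 404404 Seattle' \
  'Steve Jobs 246810 Nevada' \
  'Scott Neally 212277 Los Angeles' > "$people"

# -k 3,3 restricts the sort key to field 3;
# -n compares it numerically and -r reverses the order.
sorted=$(sort -n -r -k 3,3 "$people")
printf '%s\n' "$sorted"
rm -f "$people"
```

This prints the lines with the largest value in the third field first.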

Brian
A: 

Using Perl:

perl -nle'$G{$2}=$1 if/(\w+) (\d+\.?\d*)G/;$M{$2}=$1 if/(\w+) (\d+\.?\d*)M/;$K{$2}=$1 if/(\w+) (\d+\.?\d*)K/;END{print"$G{$_} ${_}G"for sort{$b<=>$a}keys%G;print"$M{$_} ${_}M"for sort{$b<=>$a}keys%M;print"$K{$_} ${_}K"for sort{$b<=>$a}keys%K;}' filename

Here, filename is a file that contains the above data. The above one-liner takes care of units G, M and K.

Another shorter implementation using eval:

perl -nle'/(\w+) (\d+\.?\d*)(\w)/;eval"\$\$3{$2} = $1";END{for$u(qw(G M K)){eval"print\"\$\$u{$_} $_$u\""for sort{$b<=>$a}keys%{$u}}}' filename
Alan Haggai Alavi
I can't install Perl :(
GuleLim
What type of system are you using?
Alan Haggai Alavi
More seriously, the script does not handle M as a suffix.
Jonathan Leffler
@Jonathan Leffler: The original post was edited after I posted the one-liner. Anyway, thanks for letting me know.
Alan Haggai Alavi
@Alan: fair enough - changing questions are a problem.
Jonathan Leffler
A: 
sort -n -r -k 2,2 file.txt

The -k 2,2 option makes the second field the sort key. By default, sort splits fields on whitespace. Note that -n compares only the leading number and ignores the suffix, so this will not order the lines correctly if the fields carry different suffixes (G for gigabytes in your example).
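
To make the suffix problem concrete, here is a small sketch on the data from the question. It also tries the -h (human-numeric) option, which assumes a sort that supports it, such as GNU coreutils 7.5 or later; it is not part of POSIX sort. The variable names are only for illustration.

```shell
# Sample data from the question.
sizes='Music 3.6G
Other 254.5M
Videos 4.6G
Games 1.3G
Apps 10.1G'

# -n reads only the leading number, so 254.5M outranks 10.1G.
by_n=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n -r -k 2,2)

# -h also compares the K/M/G suffix (assumes GNU sort or a
# compatible implementation; check your man page).
by_h=$(printf '%s\n' "$sizes" | LC_ALL=C sort -h -r -k 2,2)

printf '%s\n' "$by_h"
```

If -h is available, this may be the shortest answer to the question; if not, the sizes have to be normalized first.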

Matt Bridges
Well, the suffixes on the fields are different :( I haven't written the full list.
GuleLim
A: 

You need to normalize the size before sorting. The easiest way to do this would be to use a programming language like Perl or Python, but you have already stated that is not an option (although I find it odd that Perl isn't already on the machine). You can use shell code to normalize that data, but it is a pain in the tuckus:

#!/bin/bash

ECHO=/bin/echo
TR=/usr/bin/tr
BC=/usr/bin/bc

# Prefix each line with its size in bytes so the output can be sorted numerically.
while read -r dir size; do
    bytes=$($ECHO "$size" | $TR -d "[A-Z]")    # strip the unit suffix
    case $size in
     *B) bytes=$bytes                                       ;;
     *K) bytes=$($ECHO "$bytes * 1024" | $BC)               ;;
     *M) bytes=$($ECHO "$bytes * 1024 * 1024" | $BC)        ;;
     *G) bytes=$($ECHO "$bytes * 1024 * 1024 * 1024" | $BC) ;;
     *) $ECHO "unknown size type: $size" >&2                ;;
    esac
    $ECHO "$bytes $dir $size"
done < "$1"

This shell script accepts a filename as an argument and prints the normalized size, the directory name, and the original size. This makes it easy to sort. To get the original fields back, you can just cut off the first field:

./mk_sortable.sh file_to_sort | sort -nr | cut -f2- -d" "

For those paying attention, yes, I did just write a Schwartzian Transform in shell.

Chas. Owens
A: 

Fundamentally, you have to dehumanize the numbers, sort on the dehumanized numbers, and then remove the dehumanized numbers from the output. While you probably can do it in one line (especially if you write a script to do it for you), I think it will need several lines to be comprehensible.
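
The three steps above can be sketched as a single pipeline using only awk, sort and cut. This is a rough sketch, not a hardened tool: the function name sort_by_size is made up, it assumes two whitespace-separated columns (no spaces in directory names), binary multiples of 1024, and only the suffixes B, K, M and G.

```shell
# Hypothetical helper: print the lines of file "$1", largest size first.
sort_by_size() {
    awk '
        BEGIN {
            mult["B"] = 1
            mult["K"] = 1024
            mult["M"] = 1024 * 1024
            mult["G"] = 1024 * 1024 * 1024
        }
        {
            unit = substr($2, length($2))          # last character of the size
            num  = substr($2, 1, length($2) - 1)   # everything before it
            if (unit in mult)
                bytes = num * mult[unit]
            else
                bytes = $2 + 0                     # no known suffix: treat as bytes
            # Step 1: dehumanize, putting the byte count in front of the line.
            printf "%.0f %s\n", bytes, $0
        }' "$1" |
    sort -n -r |          # Step 2: sort on the byte count, descending.
    cut -d " " -f 2-      # Step 3: drop the byte count again.
}
```

Calling sort_by_size file.txt on the data from the question should list Apps first and Other last.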

As noted by Drakosha, "How Can I Sort 'du -h' output by size" covers the issues quite nicely.

Jonathan Leffler