I'm looking for a column-formatting script, and I have a feeling this could be a one-line awk program. Ideally, a small shell script is all I am after.

The data is tab-separated; each cell is of variable length and may, of course, contain spaces.

So we have something like this:

dasj    dhsahdwe dhasdhajks ewqhehwq dsajkdhas
e dward das dsaw das daswf
fjdk    ewf jken dsajkw dskdw
hklt    ewq vn1 daskcn daskw

Should end up something like this:

dasj       dhsahdwe   dhasdhajks ewqhehwq   dsajkdhas 
e dward    das        dsaw       das        daswf     
fjdk       ewf        jken       dsajkw     dskdw     
hklt       ewq        vn1        daskcn     daskw     

Ideally, I'd be able to adjust the number of hard spaces between columns. Even better if the width is computed on a column-by-column basis, so that a short leading column does not get the same right padding as the wider ones.

Not ideal:

1       dhsahdwe   dhasdhajks ewqhehwq   dsajkdhas 
2       das        dsaw       das        daswf     
3       ewf        jken       dsajkw     dskdw     
4       ewq        vn1        daskcn     daskw     

Ideal:

1  dhsahdwe  dhasdhajks  ewqhehwq  dsajkdhas 
2  das       dsaw        das       daswf     
3  ewf       jken        dsajkw    dskdw     
4  ewq       vn1         daskcn    daskw     
A: 

In unobfuscated Perl:

#!/usr/bin/perl -w

use strict;

my (@data, @length);
while (<>) {
    chomp;
    my @line = split(/\t/);
    foreach my $i (0 .. $#line) {
        my $n = length($line[$i]);
        $length[$i] = $n if (!defined($length[$i]) || $n > $length[$i]);
    }
    push(@data, [ @line ]);
}

$length[$#length] = 0; # no need to pad the last column
my $fmt = join("  ", map { "%-${_}s" } @length) . "\n";
foreach my $ref (@data) {
    printf $fmt, @$ref;
}
Alnitak
+1  A: 

Here you go. Tested with "gawk".

BEGIN {
    FS = "\t";
    # max: Column width
    # fpl: Fields per line
    # data: Fields in every line
}
 { # Note the blank before this brace
    fpl[FNR] = NF;
    for (i=1; i<=NF; i++) {
        data[FNR, i] = $i;
        if (length($i) > max[i]) {
            max[i] = length($i);
        }
    }
}
END {
    for (l=1; l<=length(fpl); l++) {
        for (i=1; i<=fpl[l]; i++) {
            fmt = "%-" max[i] "s";
            if (i > 1) {
                printf " "; # This goes between columns
            }
            printf fmt, data[l, i];
        }
        printf "\n"; # End the row; a bare "print" here would output $0, the last input record
    }
}
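
The question also asks for adjustable spacing between columns. A minimal sketch of one way to get that, wrapping the same two-pass idea in a shell function: align_cols and its gap argument are made-up names for illustration, and widths are whatever awk's length() reports, so multibyte text may not line up in every awk.

```shell
# align_cols GAP [FILE...]: hypothetical helper, not part of the answer above.
# Pads every tab-separated column to its own widest cell plus GAP spaces.
align_cols() {
    gap=$1
    shift
    awk -F '\t' -v gap="$gap" '
        {
            nf[NR] = NF                          # fields in this row
            for (i = 1; i <= NF; i++) {
                cell[NR, i] = $i
                if (length($i) > width[i]) width[i] = length($i)
            }
        }
        END {
            for (r = 1; r <= NR; r++) {
                line = ""
                for (i = 1; i <= nf[r]; i++) {
                    if (i < nf[r])
                        line = line sprintf("%-" (width[i] + gap) "s", cell[r, i])
                    else
                        line = line cell[r, i]   # last field: no trailing padding
                }
                print line
            }
        }
    ' "$@"
}
```

Called as "align_cols 2 coltest" this would put two spaces after each column's widest cell; with no file arguments it reads standard input.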
Aaron Digulla
A: 

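If you're on a BSD-derived OS (including Mac OS X), column(1) and its -t option might do what you want. Note that by default it splits on any whitespace, not just tabs, so a cell containing a space ("e dward" below) is broken into two columns: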

% column -t coltest                                                               
dasj  dhsahdwe  dhasdhajks  ewqhehwq  dsajkdhas
e     dward     das         dsaw      das        daswf
fjdk  ewf       jken        dsajkw    dskdw
hklt  ewq       vn1         daskcn    daskw
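
To keep cells with embedded spaces together, column's -s option can restrict the delimiter to tabs. A small sketch, assuming a POSIX shell and that column is installed; align_tabs is a made-up wrapper name, not something column provides:

```shell
# align_tabs [FILE...]: hypothetical wrapper around column(1).
align_tabs() {
    tab=$(printf '\t')          # a literal tab character
    column -t -s "$tab" "$@"    # -s: treat only tabs as column delimiters
}
```

Run as "align_tabs coltest", this should keep "e dward" in the first column instead of splitting it at the space.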
yangyang