I have a series of numbers of different lengths (varying from 1 to 6 digits) within some text. I want to equalize the lengths of all these numbers by padding the shorter ones with leading zeros.

E.g. the following 6 lines:

A1:11
A2:112
A3:223333
A4:1333
A5:19333
A6:4

should become zero-padded integers:

A1:000011
A2:000112
A3:223333
A4:001333
A5:019333
A6:000004

I am using sed with the following cumbersome expression:

sed -e 's/:\([0-9]\{1\}\)\>/:00000\1/' \
    -e 's/:\([0-9]\{2\}\)\>/:0000\1/' \
    -e 's/:\([0-9]\{3\}\)\>/:000\1/' \
    -e 's/:\([0-9]\{4\}\)\>/:00\1/' \
    -e 's/:\([0-9]\{5\}\)\>/:0\1/'

Is there a better expression for this?

+3  A: 

You can pad it with too many zeros and then strip the surplus leading zeros, keeping at least six digits:

sed -e 's/:/:00000/;s/:0*\([0-9]\{6,\}\)$/:\1/'

Result:

A1:000011
A2:000112
A3:223333
A4:001333
A5:019333
A6:000004
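
To see why this works, it can help to run the two substitutions separately. A minimal sketch, assuming the six sample lines from the question (the variable names input, stage1 and padded are only for illustration):

```shell
# Sample input from the question, one record per line.
input='A1:11
A2:112
A3:223333
A4:1333
A5:19333
A6:4'

# Stage 1 only: insert five zeros after the colon,
# so every number is now at least six digits long.
stage1=$(printf '%s\n' "$input" | sed -e 's/:/:00000/')

# Stage 2: drop leading zeros, but never go below six digits.
padded=$(printf '%s\n' "$stage1" | sed -e 's/:0*\([0-9]\{6,\}\)$/:\1/')

printf '%s\n' "$stage1"   # first line is A1:0000011
printf '%s\n' "$padded"   # first line is A1:000011
```

The second substitution removes zeros only while six or more digits remain after them, which is why A3:223333 comes back unchanged.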

It might be better to use awk though:

awk -F: '{ printf("%s:%06d\n", $1, $2) }'
Mark Byers
Maybe use `{6,}` to avoid trimming numbers initially longer than 6?
gnarf
@gnarf: This doesn't trim numbers initially longer than 6 - it pads them, but either way your suggestion is fine so I'll update the answer. Another way to handle it might be to abort the script.
Mark Byers
That's a brilliant idea for a sed expression - pre-stuffing and then trimming!! That serves my purpose exactly. Though awk would do just as well here in the example, the real data that I am working on is not very tabular. -Thanks Mark!
Shriram V
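
For data that is not tabular, the same pad-then-trim idea might be extended to several numbers per line. A rough sketch, assuming each number still directly follows a colon (an assumption about the real data, which the question does not show): drop the `$` anchor, pad only colons that are followed by a digit, and add the g flag. The sample line and the variable names are made up for illustration.

```shell
# Hypothetical non-tabular line with three colon-prefixed numbers.
line='x A1:11 y A2:223333 z A3:4'

# Pad every ":<digit>" with five zeros, then strip surplus leading
# zeros from each number, keeping at least six digits.
result=$(printf '%s\n' "$line" |
  sed -e 's/:\([0-9]\)/:00000\1/g' \
      -e 's/:0*\([0-9]\{6,\}\)/:\1/g')

printf '%s\n' "$result"
```

Note that this would also pad any other colon followed by a digit (a time such as 12:30, for instance), so the pattern would need tightening if the text contains those.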
A: 

Here is a Perl solution:

 perl -n -e 'printf("%s:%06d\n", split /:/)'

You asked for a regular expression, so I split on the colon with a regular expression, although in this case a simple string would suffice. The result of split is passed straight to printf, because current Perl versions no longer fill @_ when split is called on its own.

[pti@os5 ~]$ cat tst.txt | perl -n -e 'printf("%s:%06d\n", split /:/)'
A1:000011
A2:000112
A3:223333
A4:001333
A5:019333
A6:000004
Peter Tillemans