I often encounter EDI messages in various plain-text formats, for example this fixed-width one:

HEAD[customer,8][date,8][reference,10]
[lineno,3][product,8][quantity,3][currency,3][price,10]...

...resulting in messages like this:

HEAD1122334420091031   LINDAHL
00100004711010USD0000234055
00200004712005USD0000004543
...

Reading the above dump obviously requires focus, and I often find myself losing track of columns and fields. It would be nice to have a way of expressing the grammar of the message and getting a marked-up file (for example in HTML).

It is of course possible to do this with custom-made scripts in any language, but I'm curious: is there a generic tool for transforming plain text, something like what XSLT does for XML?

+2  A: 

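Looks like a job for awk. It was designed for processing record-oriented text files like that. It's rule-based (each rule is a pattern plus an action), much like XSLT's template rules. By default awk splits fields on whitespace, so for fixed-width fields like yours you would cut them out with substr(). It's already installed on your Unix box - just run man awk.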
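
As a rough sketch of what that could look like for the sample message in the question, assuming POSIX awk: one rule matches the HEAD record, another handles the order lines, and each field is wrapped in an HTML span whose title is the field name. The field names and widths are taken from the grammar in the question; the function name edi_to_html, the "name:width" spec strings and the HTML layout are illustrative choices of mine, and field values are not HTML-escaped.

```shell
#!/bin/sh
# Sketch only: mark up the fixed-width sample message as HTML with POSIX awk.
# edi_to_html is a hypothetical helper name; it reads the message on stdin.
edi_to_html() {
  awk '
    # Cut consecutive fixed-width fields out of rec, starting at column pos.
    # spec is a list of "name:width" pairs separated by spaces (my own
    # convention); each field is wrapped in a span titled with its name.
    function emit(rec, pos, spec,    n, i, parts, kv, out) {
      n = split(spec, parts, " ")
      out = ""
      for (i = 1; i <= n; i++) {
        split(parts[i], kv, ":")
        out = out "<span title=\"" kv[1] "\">" substr(rec, pos, kv[2]) "</span>"
        pos += kv[2]
      }
      return out
    }

    # Rule 1: the header record starts with the literal tag HEAD.
    /^HEAD/ {
      print "<p>HEAD" emit($0, 5, "customer:8 date:8 reference:10") "</p>"
      next
    }

    # Rule 2: any other non-empty record is treated as an order line.
    length($0) > 0 {
      print "<p>" emit($0, 1, "lineno:3 product:8 quantity:3 currency:3 price:10") "</p>"
    }
  '
}
```

With the message saved in a file (say message.txt, a made-up name), something like edi_to_html < message.txt > message.html should give a page where hovering over a value shows which field it belongs to. Supporting another message type would mean adding another pattern and spec string.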

alex tingle