Hi there!

I'm processing a manpage in nroff format with awk to extract the options to each command... I figured out that the options start with \fB, followed by the actual option, and maybe \fP and option arguments and so on...

Example:

\fB\-\-author\fR

I started writing an awk script, specifying FS = "\fB" ... well, it didn't work... I tried to escape the \, switching to FS = "\\fB", but that didn't work either... What am I doing wrong?

Thx, Oliver

+2  A: 

It looks like you can accomplish this with 4 backslashes:

$ echo "1\z2\z3" | awk 'BEGIN { FS = "\\\\z" } ; {print $3 $1}'
31

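Because the awk program is inside single quotes, bash passes all 4 backslashes through untouched. awk then unescapes twice: first when it parses the string literal (4 backslashes become 2), and again when it uses that string as a regular expression (2 backslashes match a single literal backslash).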
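If you want to see where the unescaping happens, something like the following should show it. This is only a quick sketch; the variable names `fs_value` and `result` are made up for the illustration.

```shell
# The single quotes hand all four backslashes to awk unchanged.
# awk's string-literal parsing halves them, so FS ends up holding \\z.
fs_value=$(awk 'BEGIN { FS = "\\\\z"; print FS }')
printf '%s\n' "$fs_value"    # prints \\z

# Used as a field-separator regex, \\z matches one literal backslash plus z.
result=$(printf '%s\n' '1\z2\z3' | awk 'BEGIN { FS = "\\\\z" } { print $3 $1 }')
printf '%s\n' "$result"      # prints 31
```

Using `printf '%s\n'` instead of `echo` sidesteps any differences in how shells treat backslashes in `echo` arguments.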

Mark Rushakoff
Correct, you need to escape the backslash twice since the quotes ("") remove one escape.
Aaron Digulla
A: 

The field separator FS is meant for CSV-like data. In your case, use a pattern to select the lines that contain options, then strip the parts you don't want:

/\\fB/ { ... process option ...}
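One way the "process option" part could look is sketched below. It assumes the font escapes are always `\f` followed by a single uppercase letter and that hyphens are written as `\-`; the three sample lines and the variable name `options` are made up for the illustration.

```shell
# Select lines containing \fB, delete every \fX font escape,
# and turn nroff's \- back into a plain hyphen.
options=$(printf '%s\n' '.TP' '\fB\-\-author\fR' 'Some description text.' |
    awk '/\\fB/ { gsub(/\\f[A-Z]/, ""); gsub(/\\-/, "-"); print }')
printf '%s\n' "$options"    # prints --author
```

Inside a `/.../` regex literal only one level of unescaping applies, so two backslashes are enough here.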
Aaron Digulla
Selecting the correct line is only part of the story. Getting the interesting field is what the OP wants.
A: 

This is my script:

BEGIN {
    FS = "\\f." # "\\\\f." didn't work either
}

{
    print $2
}

This is the input:

\fB-o\fP

I want $2 to be -o, but it just won't work.
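For comparison, here is a sketch of the same split run from the shell with the program in single quotes. With two backslashes, awk's string parsing leaves `\f.`, which the regex engine reads as a form feed plus any character, so nothing matches. It is not clear from the above why the four-backslash variant failed; in the awk implementations I know of, the form below should give -o. The variable names `input` and `second` are made up for the illustration.

```shell
# Sample input from the post.
input='\fB-o\fP'

# "\\\\f." is the awk string \\f. which, used as a regex, matches
# a literal backslash, the letter f, and any one character.
second=$(printf '%s\n' "$input" | awk 'BEGIN { FS = "\\\\f." } { print $2 }')
printf '%s\n' "$second"    # prints -o
```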

Trollhorn
A: 

I think I remember running into this once.

The real problem was that some versions of awk insist on FS being a single character.

The way around it, as I recall, was to manually pull the file into GNU Emacs, edit the multicharacter FS down to one character that wasn't used anywhere else in the file, awk that with the appropriate FS, then manually repair it afterwards.

You MIGHT be able to automate this with a couple of sed scripts, one to do the initial recoding, and one to repair it, with the awk step in the middle.
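A rough sketch of what that automation might look like follows. It assumes the control character Ctrl-A (octal 001) never occurs in the input, and the function name `extract_option` is made up. The repair step is left out because only the extracted field is printed, so no recoded separators reach the output.

```shell
# Single character used as a stand-in separator (assumed unused in the input).
soh=$(printf '\001')

extract_option() {
    # Step 1: recode every \fX font escape to the stand-in character.
    # Step 2: split on that one character and print the second field.
    sed "s/\\\\f[A-Z]/$soh/g" | awk -F "$soh" '{ print $2 }'
}

printf '%s\n' '\fB-o\fP' | extract_option    # prints -o
```

If the whole recoded line had to be kept, a second sed would be needed afterwards to turn the stand-in character back into the original escapes, which this sketch cannot do because it discards the font letter.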

John R. Strohm