I am using

awk '{ printf "%s", $3 }'

to extract a field from a space-delimited line. Of course, I get partial results when the field is quoted and contains spaces. Can anybody suggest a solution, please?

+1  A: 

This is actually quite difficult. I came up with the following awk script that splits the line manually and stores all fields in an array.

{
    s = $0
    i = 0
    split("", a)
    while ((m = match(s, /"[^"]*"/)) > 0) {
        # Add all unquoted fields before this field
        n = split(substr(s, 1, m - 1), t)
        for (j = 1; j <= n; j++)
            a[++i] = t[j]
        # Add this quoted field
        a[++i] = substr(s, RSTART + 1, RLENGTH - 2)
        s = substr(s, RSTART + RLENGTH)
        if (i >= 3) # We can stop once we have field 3
            break
    }
    # Process the remaining unquoted fields after the last quoted field
    n = split(s, t)
    for (j = 1; j <= n; j++)
        a[++i] = t[j]
    print a[3]
}
schot
It is quite a complex solution. If there is no simple *one-line* solution, I'd go for Perl.
mmonem
+2  A: 

Show your input file and desired output next time. To get the quoted fields:

$ cat file
field1 field2 "field 3" field4 "field5"

$ awk -F'"' '{for(i=2;i<=NF;i+=2) print $i}' file
field 3
field5
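
The same -F'"' trick can also be pushed a little further to pick field N while counting a quoted string as a single field. The following is only a sketch: the function name nth_field is made up for illustration, and it assumes quotes are balanced, are never escaped inside a field, and are separated from neighbouring fields by whitespace. Pieces with an even index were inside quotes; pieces with an odd index are ordinary whitespace-separated text.

```shell
# nth_field N: print field N of every input line, treating "quoted text" as one field.
# Hypothetical helper name; needs only a POSIX awk.
nth_field() {
    awk -F'"' -v want="$1" '{
        n = 0                               # fields seen so far on this line
        for (i = 1; i <= NF; i++) {
            if (i % 2 == 0) {               # even piece: was inside quotes
                if (++n == want) { print $i; next }
            } else {                        # odd piece: plain text, split on whitespace
                k = split($i, t, " ")
                for (j = 1; j <= k; j++)
                    if (++n == want) { print t[j]; next }
            }
        }
    }'
}

printf '%s\n' 'field1 field2 "field 3" field4 "field5"' | nth_field 3
# prints: field 3
```

Lines with fewer than N fields print nothing.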
ghostdog74
Actually, it is an Apache web server log. It seems that awk can't do it easily.
mmonem
@mmonem Then this might be useful: http://serverfault.com/questions/11028/do-you-have-any-useful-awk-and-grep-scripts-for-parsing-apache-logs
schot
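
For the Apache log case mentioned in the comments, splitting on the double quote may already be enough. This is a sketch that assumes the log uses the combined format (request, referer and user agent are the three quoted fields) and that none of those fields contains an escaped quote. The function name and the sample line are for illustration only.

```shell
# Hypothetical helper: print the request line and the user agent of each log entry.
# With FS set to a double quote, combined-format entries split as:
#   $2 = request, $4 = referer, $6 = user agent
log_request_and_agent() {
    awk -F'"' '{ print $2 " | " $6 }' "$@"
}

# Illustrative sample entry in the combined log format
line='127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"'
printf '%s\n' "$line" | log_request_and_agent
# prints: GET /apache_pb.gif HTTP/1.0 | Mozilla/4.08 [en] (Win98; I ;Nav)
```

If a user agent or referer contains an escaped quote, the field positions shift, so a real parser (or the Perl route) is safer for untrusted logs.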