Hi, I have data like this:

uniqname1.foo.bar IN A 10.0.0.1  
uniqname1.foo.bar IN TXT "abcdefg"  
uniqname2.foo.bar IN A 10.0.0.2  
uniqname2.foo.bar IN TXT "xyz"  
uniqname3.foo.bar IN A 10.0.0.3  
uniqname4.foo.bar IN A 10.0.0.4

You get the picture: not every host has a TXT record, only some do. I'm trying to write a regex that prints three values for each host that has a TXT record. For the data above, the output would be:

uniqname1.foo.bar 10.0.0.1 abcdefg  
uniqname2.foo.bar 10.0.0.2 xyz
A: 

I wouldn't use a regular expression for this. You're likely going to run into some files that have things in a different order, completely screwing up your pattern. Instead, create a data structure to hold the records, select the ones with TXT entries, and from the TXT entries look up the A data. Although regular expressions are fun and powerful, hashes are sometimes even more powerful:

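use 5.010;
use strict;
use warnings;

my %records;

while( <DATA> ) {
    chomp;
    # maybe another normalization step here
    # limit the split to 4 fields so TXT data containing spaces stays whole
    my( $name, undef, $type, $data ) = split ' ', $_, 4;
    next unless defined $data;      # skip blank or malformed lines

    $data =~ s/\s+$//;              # a limited split keeps trailing whitespace
    $data =~ s/^"(.*)"$/$1/;        # drop the quotes around TXT data

    $records{$type}{$name} = $data;
    }

foreach my $txt_record ( sort keys %{ $records{'TXT'} } ) {
    my $txt_data = $records{'TXT'}{$txt_record};
    my $a_data   = $records{'A'}{$txt_record};
    next unless defined $a_data;    # TXT record without a matching A record

    say join ' ', $txt_record, $a_data, $txt_data;
    }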

__DATA__
uniqname1.foo.bar IN A 10.0.0.1
uniqname1.foo.bar IN TXT "abcdefg"
uniqname2.foo.bar IN A 10.0.0.2
uniqname2.foo.bar IN TXT "xyz"
uniqname3.foo.bar IN A 10.0.0.3
uniqname4.foo.bar IN A 10.0.0.4
brian d foy
Seems silly to "use 5.010" and use "say" (which I assume is a perl 5.10ism) when you could easily just use print... unless there is something more complicated that I'm missing?
xyld
Generally I agree that it's silly to use 5.010 just to get say, but in this case it's handy to get the newline at the end of the print when I use join. I could have also set $\, had another print, etc. It hardly matters to the answer though.
brian d foy
Awesome, you're right, I can't always predict the order. That makes a lot more sense.
Felix007
Odd that folks make a point that order cannot be predicted (in this case it can), yet order is implied during the split variable assignment. The only issue with presuming order here is where you presume it... in the regex (easy to miss) or in the split (easier to see).
ericslaw
@ericslaw: Order of lines and the structure of a particular line are different things.
brian d foy
@brian: I agree with your comment to my finally deleted answer (I thought I had deleted that after voting your answer up but apparently I had forgotten).
Sinan Ünür
A: 

In Perl:

while ($s =~ m/^([\w.]+) IN A ([\d.]+)[ \t]*(?:\r\n|\r|\n)\1 IN TXT "([^"]*)"/mg) {
    print "$1 $2 $3\n";
}

Where $s is your data blob you have above.

I haven't tested it, but the above is close.

xyld
Perl 5.10 has the \R character class to mean "generalized line ending" so you don't have to worry about \r and \n. :)
brian d foy
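A minimal sketch of how `\R` could be used in the pattern above, assuming Perl 5.10 or later. The helper name `adjacent_pairs` and the sample string are made up for illustration. It still depends on each TXT line directly following the A line for the same name, so it shares the fragility of any adjacent-line regex.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: returns "name address text" for each A line that is
# immediately followed by a TXT line for the same name.
sub adjacent_pairs {
    my ($s) = @_;
    my @out;
    # \h* allows trailing spaces or tabs, \R matches \n, \r\n or \r,
    # and \1 refers back to the name captured on the A line.
    while ( $s =~ m/^([\w.]+) IN A ([\d.]+)\h*\R\1 IN TXT "([^"]*)"/mg ) {
        push @out, "$1 $2 $3";
    }
    return @out;
}

# Sample data with mixed line endings (an assumption, for illustration only)
my $s = join '',
    qq{uniqname1.foo.bar IN A 10.0.0.1\r\n},
    qq{uniqname1.foo.bar IN TXT "abcdefg"\r\n},
    qq{uniqname2.foo.bar IN A 10.0.0.2\n},
    qq{uniqname2.foo.bar IN TXT "xyz"\n},
    qq{uniqname3.foo.bar IN A 10.0.0.3\n};

say for adjacent_pairs($s);
```

The `/g` flag is what lets the `while` loop advance from one match to the next; without it the loop would keep matching the first pair.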
This is a pretty fragile regex that depends on the input being exactly a TXT record following the A record for the same entry. It's bound to break on some zone files because of that.
brian d foy
Completely agree. To be honest, I'm not 100% familiar with the context. Assuming response from DNS server, but that's all I know.
xyld
Btw, I up'd your comment on \R, that's pretty convenient.
xyld
I always assume things will get messed up. I'm almost never wrong in that assumption, but I'm assuming that sometimes it is. :)
brian d foy
A: 

Do you really need to parse the textual zone data? Why not query for that data programmatically with Net::DNS?

For example:

use Net::DNS;
my $res = Net::DNS::Resolver->new;
my $txtquery = $res->query("example.com", "TXT");
my $aquery = $res->query("example.com", "A");

if ($txtquery and $aquery) {
    ($txtquery->answer)[0]->print;
} else {
    print "query failed: ", $res->errorstring, "\n";
}
Ether
I can't; I need a consistent state of the zone. That might not be the case with serialized queries, since they are load-balanced between a couple of resolvers. I trigger a `dig AXFR @dns foo.bar.` to get the data in a single query.
Felix007
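For zone-transfer output, a rough sketch of the hash approach could look like the following. It assumes each record line has the shape `name [ttl] IN type data`, as `dig` normally prints it, and that lines starting with `;` are comments. The helper name `hosts_with_txt` is hypothetical, and only one A and one TXT record per name is kept.

```perl
use strict;
use warnings;

# Hypothetical helper: takes zone-transfer lines and returns "name address text"
# strings for every name that has both an A and a TXT record.
sub hosts_with_txt {
    my @lines = @_;
    my %records;

    for my $line (@lines) {
        next if $line =~ /^\s*(?:;|$)/;    # skip comment lines and blank lines

        # name, optional TTL, class IN, type, then the rest of the line as data
        my ($name, $type, $data) =
            $line =~ /^(\S+)\s+(?:\d+\s+)?IN\s+(\S+)\s+(.*?)\s*$/
            or next;

        $name =~ s/\.$//;                            # drop the trailing root dot
        $data =~ s/^"(.*)"$/$1/ if $type eq 'TXT';   # drop the surrounding quotes
        $records{$type}{$name} = $data;              # last record wins per name and type
    }

    # Start from the TXT names and look up the matching A data
    return map  { "$_ $records{A}{$_} $records{TXT}{$_}" }
           grep { exists $records{A}{$_} }
           sort keys %{ $records{TXT} || {} };
}
```

It could be fed from the command in the comment above, for example with `open my $dig, '-|', 'dig', 'AXFR', '@dns', 'foo.bar.' or die "dig: $!";` followed by `print "$_\n" for hosts_with_txt(<$dig>);`. Because the records are collected before anything is printed, the order of the lines in the transfer does not matter.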
A: 

Another way:

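#!/usr/bin/perl
while (<>) {
    @F = split ' ', $_ ;
    # remember the address of every A record, keyed by host name
    if ($F[2] eq 'A') {
        $a{$F[0]} = $a{$F[0]} . ' ' . $F[$#F];
    }
    # on a TXT record, print the host, its stored address(es), and the text
    # (this assumes the A line for the host has already been read)
    if ($F[2] eq 'TXT') {
        print "$F[0] $a{$F[0]} $F[$#F] \n";
    }
}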

output

$ perl test.pl file
uniqname1.foo.bar  10.0.0.1 "abcdefg"
uniqname2.foo.bar  10.0.0.2 "xyz"
ghostdog74