In general, you don't. XML/HTML parsing is hard enough without trying to do it concisely, and while you may be able to hack together a solution that succeeds with a limited subset of XML, eventually it will break.
Besides, there are many great languages with great XML parsers already written, so why not use one of them and make your life easier?
I don't know whether or not there's an XML parser built for awk, but I'm afraid that if you want to parse XML with awk you're going to get a lot of "hammers are for nails, screwdrivers are for screws" answers. I'm sure it can be done, but it's probably going to be easier for you to write something quick in Perl that uses XML::Simple (my personal favorite) or some other XML parsing module.
Just for completeness, I'd like to note that if your snippet is an example of the entire file, it is not valid XML. Valid XML should have start and end tags, like so:
<netlist>
<net NetName="abc" attr1="123" attr2="234" attr3="345".../>
<net NetName="cde" attr1="456" attr2="567" attr3="678".../>
....
</netlist>
I'm sure invalid XML has its uses, but some XML parsers may whine about it, so unless you're dead set on using an awk one-liner to try to half-ass "parse" your "XML," you may want to consider making your XML valid.
In response to your edits, I still won't do it as a one-liner, but here's a Perl script that you can use:
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;
sub usage {
die "Usage: $0 [NetName] ([attr])\n";
}
my $file = XMLin("file.xml", KeyAttr => { net => 'NetName' });
usage() if @ARGV == 0;
exists $file->{net}{$ARGV[0]}
or die "$ARGV[0] does not exist.\n";
if(@ARGV == 2) {
exists $file->{net}{$ARGV[0]}{$ARGV[1]}
or die "NetName $ARGV[0] does not have attribute $ARGV[1].\n";
print "$file->{net}{$ARGV[0]}{$ARGV[1]}.\n";
} elsif(@ARGV == 1) {
print "$ARGV[0]:\n";
print " $_ = $file->{net}{$ARGV[0]}{$_}\n"
for keys %{ $file->{net}{$ARGV[0]} };
} else {
usage();
}
Run this script from the command line with 1 or 2 arguments. The first argument is the 'NetName'
you want to look up, and the second is the attribute you want to look up. If no attribute is given, it should just list all the attributes for that 'NetName'
.