Hello.

I did an exercise like the one below. How do I calculate the number of XML elements that XML::Simple collapses into an array, so that I don't have to hard-code it? I plan to use the code to parse a bigger XML file, and I don't want to count the elements manually.

Can I use some kind of count to replace the magic numbers, something like person.count or hobbie.length? As far as I know, this kind of statement is convenient to use in C#.

#!/usr/bin/perl -w
use strict;
use XML::Simple;
use Data::Dumper;

my $tree = XMLin('./t1.xml');

print Dumper($tree);
print "\n";
for (my $i = 0; $i < 2; $i++) # magic number '2'
{
    print "$tree->{person}->[$i]->{first_name} $tree->{person}->[$i]->{last_name}\n";
    print "\n";
    for (my $j = 0; $j < 3; $j++) # magic number '3'
    {
        print $tree->{person}->[$i]->{hobbie}->[$j], "\n";
    }
    print "\n";
}

Output:

could not find ParserDetails.ini in C:/Perl/site/lib/XML/SAX
$VAR1 = {
          'person' => [
                      {
                        'hobbie' => [
                                    'bungy jumping',
                                    'sky diving',
                                    'knitting'
                                  ],
                        'last_name' => 'Bloggs',
                        'first_name' => 'Joe'
                      },
                      {
                        'hobbie' => [
                                    'Swim',
                                    'bike',
                                    'run'
                                  ],
                        'last_name' => 'LIU',
                        'first_name' => 'Jack'
                      }
                    ]
        };

Joe Bloggs

bungy jumping
sky diving
knitting

Jack LIU

Swim
bike
run

My XML source file is below:

<Document>
  <person>
    <first_name>Joe</first_name>
    <last_name>Bloggs</last_name>
    <hobbie>bungy jumping</hobbie>
    <hobbie>sky diving</hobbie>
    <hobbie>knitting</hobbie>
  </person>
  <person>
    <first_name>Jack</first_name>
    <last_name>LIU</last_name>
    <hobbie>Swim</hobbie>
    <hobbie>bike</hobbie>
    <hobbie>run</hobbie>
  </person>
</Document>
+5  A: 

Since XML::Simple will produce an array for you, it's easy to count its length.

E.g. $tree->{person} is an array, or rather an array reference (make sure it is one by using the ForceArray option of XML::Simple, even if there is only one person).

  • You can get its length by first de-referencing it into an array (using the @{} array de-reference): @{ $tree->{person} }

  • Then you use the resulting array in a scalar context, which evaluates to the number of elements in the array. In other words, the a.length/a.count functions of other languages translate to the Perl idiom scalar(@a), where the scalar() function is optional if scalar context already applies.

    In this case, the numeric comparison operator "<" forces scalar context, but if that were not the case you could use the scalar() function.

Example:

# Don't forget ForceArray option of XML::Simple to ensure person and hobbie are array refs
for (my $i = 0; $i < scalar( @{ $tree->{person} } ); $i++) { # scalar() is optional here
    print "$tree->{person}->[$i]->{first_name} $tree->{person}->[$i]->{last_name}\n";
    print "\n";
    for (my $j = 0; $j < @{ $tree->{person}->[$i]->{hobbie} }; $j++) {
        print $tree->{person}->[$i]->{hobbie}->[$j], "\n";
    }
    print "\n";
}

As a note, a somewhat different way of getting at the length of a Perl array is the $#a construct, which returns the index of the last element of the array, i.e. one less than the number of elements in the array. I'm not aware of any performance difference between the two approaches, so if you find both equally readable, use whichever fits: $#a if you need the index of the last element, and @a or scalar(@a) if you need the number of elements.
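To make the difference between the two constructs concrete, here is a minimal sketch. It assumes a hand-built copy of the structure from the question's Dumper output in place of a real XMLin call, so it runs without XML::Simple or t1.xml; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-built copy of the structure shown in the question's Dumper output
# (an assumption made so the sketch runs without XML::Simple or t1.xml).
my $tree = {
    person => [
        { first_name => 'Joe',  last_name => 'Bloggs',
          hobbie => [ 'bungy jumping', 'sky diving', 'knitting' ] },
        { first_name => 'Jack', last_name => 'LIU',
          hobbie => [ 'Swim', 'bike', 'run' ] },
    ],
};

# Assigning a de-referenced array to a scalar applies scalar context,
# which yields the number of elements.
my $person_count = @{ $tree->{person} };

# $#{ ... } is the index of the last element, one less than the count.
my $last_index = $#{ $tree->{person} };

# scalar() is redundant in a scalar assignment but makes the intent explicit;
# it is needed in list context, such as print's argument list below.
my $hobby_count = scalar @{ $tree->{person}->[0]->{hobbie} };

print "persons: $person_count, last index: $last_index\n";
print "hobbies of first person: ", scalar @{ $tree->{person}->[0]->{hobbie} }, "\n";
```

Without scalar() in the last print, the array would be interpolated into print's list and the hobbies themselves would be printed instead of their count.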

A good reference is the Perl Data Structures Cookbook (perldsc) at perldoc.

DVK
+3  A: 
for my $person (@{ $tree->{person} }) {
    print "$person->{first_name} $person->{last_name}\n\n";
    for my $hobby (@{ $person->{hobbie} }) {
      print $hobby, "\n";
    }
    print "\n";
}

And as DVK says, make sure you have ForceArray => [qw/person hobbie/] in your XMLin options (the names are case-sensitive and must match the element names in the XML), or else things won't work out if you only have one person or if any person has only one hobby.
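As a rough illustration of why ForceArray matters: without it, XML::Simple folds a lone <person> into a plain hash and a lone <hobbie> into a plain string, so @{ ... } no longer applies. The sketch below assumes a hand-built structure standing in for that XMLin result, and as_list is a hypothetical helper (not part of XML::Simple) that normalises such values if you cannot control the XMLin options.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hand-built stand-in for XMLin output WITHOUT ForceArray when the file has
# one <person> with one <hobbie>: neither is wrapped in an array ref.
my $tree = {
    person => {
        first_name => 'Joe',
        last_name  => 'Bloggs',
        hobbie     => 'knitting',
    },
};

# as_list is a hypothetical helper: it returns the elements of an array ref,
# a lone value as a one-item list, or an empty list for undef.
sub as_list {
    my ($value) = @_;
    return ()        unless defined $value;
    return @{$value} if ref($value) eq 'ARRAY';
    return ($value);
}

my @lines;
for my $person ( as_list( $tree->{person} ) ) {
    push @lines, "$person->{first_name} $person->{last_name}";
    push @lines, $_ for as_list( $person->{hobbie} );
}
print "$_\n" for @lines;
```

Passing ForceArray to XMLin is still the simpler fix, since the structure then has the same shape no matter how many elements the file contains.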

hobbs
+1  A: 

You only need to know the number of items in the array if you are using the C-style for loop. Instead, you could use the more Perlish version: foreach my $val ( @list )

#!/usr/bin/perl

use strict;
use warnings;

use XML::Simple qw(:strict XMLin);
use Data::Dumper;

my $tree = XMLin('./t1.xml', KeyAttr => { }, ForceArray => [ 'person', 'hobbie' ]);

foreach my $person ( @{ $tree->{person} } ) {
    print "$person->{first_name} $person->{last_name}\n";
    foreach my $hobbie ( @{ $person->{hobbie} } ) {
        print "$hobbie\n";
    }
}

To be safer (and arguably more readable), you might want to check that a <person> has any <hobbie> elements at all, before trying to loop through them:

foreach my $person ( @{ $tree->{person} } ) {
    print "$person->{first_name} $person->{last_name}\n";
    if(my $hobbies = $person->{hobbie}) {
        foreach my $hobbie ( @$hobbies ) {
            print "$hobbie\n";
        }
    }
}
Grant McLean