I need to count the number of words, including special characters such as % and $, in a particular section of an XML document.

I need to write this as a Perl script using regular expressions.

Does anyone have suggestions on where I can look for more information? This is my first Perl script.

  1. I need help with isolating the section and its data.
  2. I will probably use that data to pass to a subroutine to count words.
name
desc
address
    line1
    line2
    line3

In the example above, I need to capture the address element and all the lines inside it, and build a string whose words will then be counted.

+1  A: 

Try this...

my $counter = 0;
$counter++ while $string =~ /\S+/g;

This gives you the number of words, where a word is any run of non-whitespace characters. A special character such as % or $ counts as its own word only when whitespace separates it from its neighbours; otherwise it is counted as part of the word it is attached to.
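Since the question mentions passing the data to a subroutine, the same match can be wrapped in one. This is only a minimal sketch; the name count_words and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: counts whitespace-separated tokens in a string.
sub count_words {
    my ($string) = @_;
    # Assigning the list of matches to () in scalar context yields the match count.
    my $count = () = $string =~ /\S+/g;
    return $count;
}

print count_words('Price is 50 % or $ 5'), "\n";   # % and $ stand alone: 7
print count_words('Price is 50% or $5'), "\n";     # % and $ are attached: 5
```

The two calls show the difference described above: a standalone % or $ adds to the count, while an attached one does not.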

Ryan Berger
A: 

Provided you already have the text in a string, you can try this, which counts only the words that contain at least one non-word character:

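my $counter = 0;
# split " " splits on runs of whitespace and ignores leading whitespace
my @words = split " ", $string;
for my $word (@words) {
    # \W matches any non-word character, e.g. % or $
    $counter++ if $word =~ /\W/;
}
print "$counter\n";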
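To make clear what this counts, here is a small sketch that reports both the total number of words and the number of words containing a non-word character. The sample string is invented, and grep in scalar context is used here as a shorter equivalent of the loop.

```perl
use strict;
use warnings;

my $string = 'rate is 5 % of $100';   # hypothetical sample text

my @words   = split ' ', $string;
my $total   = scalar @words;          # every whitespace-separated word
my $special = grep { /\W/ } @words;   # grep in scalar context returns the number of matches

print "total: $total, with special characters: $special\n";
# prints "total: 6, with special characters: 2"
```

If the goal is the total word count, $total is the number to use; $special only covers words such as % and $100.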
Zhang18
+3  A: 

Aha. You want to parse XML. Use an XML parser, for example XML::Twig. Here is an introduction.
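As a rough sketch of how that might look with XML::Twig (which has to be installed from CPAN): the XML below is an invented sample that only borrows the element names from the question, and the record root element is an assumption.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample document; only the element names come from the question.
my $xml = <<'XML';
<record>
  <name>Acme Corp</name>
  <desc>Sells widgets at 50 % off</desc>
  <address>
    <line1>12 Main St</line1>
    <line2>Suite # 4</line2>
    <line3>Springfield $ zone</line3>
  </address>
</record>
XML

my $twig = XML::Twig->new;
$twig->parse($xml);

# Isolate the address element and join the text of its children into one string.
my $address = $twig->root->first_child('address');
my $string  = join ' ', map { $_->text } $address->children;

# Count whitespace-separated words, special characters included.
my $count = () = $string =~ /\S+/g;
print "$count\n";   # prints 9
```

Joining the children with a space keeps the last word of one line from running into the first word of the next.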

Svante