I have a program-generated HTML file, and this tag is repeating:

<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:5:pgl3">
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:4:pgl3">
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:3:pgl3">

How do I get only the first number (5) with a regular expression and ignore other indexes?

A: 

Assuming you want to extract the actual index value from the tag (the question is not quite clear), try:

$index =~ s/.*?resId1:(\d+):pgl3.*/$1/s;
Adrian Regan
I would tend to agree with @cjac if it is XML parsing you are in fact doing. Again your question is not that clear.
Adrian Regan
A: 

Try this:

my ($value) = $index =~ /resId1:(\d+):pg/;

This puts the value in a scalar without modifying your line; $value is undef if the pattern does not match.
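
To pick up only the first index when the whole file is held in one string, a minimal sketch along these lines may help. It assumes the file contents are already in a variable (here the hypothetical `$html`, filled with the sample tags from the question) and relies on the fact that a match without `/g` in list context returns the captures of the leftmost match only.

```perl
use strict;
use warnings;

# Sample input copied from the question; $html is a hypothetical variable
# standing in for the contents of the generated file.
my $html = <<'HTML';
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:5:pgl3">
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:4:pgl3">
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:3:pgl3">
HTML

# No /g modifier: in list context the match returns the captures of the
# first (leftmost) match, so later indexes are ignored.
my ($first) = $html =~ /resId1:(\d+):pgl3/;

if (defined $first) {
    print "First index: $first\n";   # prints "First index: 5"
}
else {
    print "No matching id found\n";
}
```

Checking `defined $first` avoids silently using a stale or empty value when the pattern is absent from the input.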

benzebuth
+2  A: 

You probably shouldn't be using regular expressions to parse HTML. Take a look at HTML::TreeBuilder::XPath.

use strict;
use warnings;
use HTML::TreeBuilder::XPath;

my $tree = HTML::TreeBuilder::XPath->new_from_content(q{
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:5:pgl3">
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:4:pgl3">
<table cellspacing="0" cellpadding="0" border="0" id="pt1:pt_region0:0:resId1:3:pgl3">
});

# Collect the id attribute of every <table>, in document order.
my @id = $tree->findvalues('//table/@id');

# The id is colon-separated; the index is the fifth field of the first id.
my @part = split(/:/, $id[0]);
my $number = $part[4];

print("The number I'm looking for is [$number]\n");
cjac
Thank you for not using an actual regex (/:/ excluded).
bowenl2
@cjac. Agreed... however if he is just lifting the values from the XML treated as plain text then a regular expression might be more efficient.
Adrian Regan