I have a string variable that contains a lot of HTML markup and I want to get the last <li> element from it. I'm using something like:

$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

preg_match('#<li(.*?)>(.*)</li>#ims', $markup, $matches);
$lis = "<li ".$matches[1].">".$matches[2]."</li>";
$total = explode("</li>",$lis);
$num = count($total)-2;
echo $total[$num]."</li>";

This works and the last <li> element is printed. But I can't understand why I have to subtract 2 from the size of the $total array. Normally I would subtract only 1, since indexing starts at 0. What am I missing?

Is there a better way of getting the last <li> element from the string?

+6  A: 

HTML is not regular, and so can't be parsed with a regular expression. Use a proper HTML parser.
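As a rough sketch of the parser route, PHP's bundled DOM extension can load the string and hand back the last <li>. This assumes the DOM extension is enabled; the variable names $doc, $items and $lastLi are only illustrative, and parser warnings are suppressed because the sample markup is a fragment rather than a complete document.

```php
<?php
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

// The fragment is not a complete, valid document, so collect parser
// warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($markup);
libxml_clear_errors();

// getElementsByTagName() returns the <li> nodes in document order,
// so the last item in the list is the last <li> in the markup.
$items = $doc->getElementsByTagName('li');
$lastLi = $items->length > 0 ? $items->item($items->length - 1) : null;

if ($lastLi !== null) {
    echo $lastLi->getAttribute('id'), "\n"; // id attribute of the last <li>
    echo $lastLi->textContent, "\n";        // text inside the last <li>
    echo $doc->saveHTML($lastLi), "\n";     // the element serialized back to markup
}
```

Unlike the regex, this keeps working when the <li> elements carry other attributes or contain nested tags.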

Ignacio Vazquez-Abrams
This is the right approach. Saves a lot of pain.
Max
cool tip. thanks!
JoaoPedro
A: 

From the PHP.net documentation:

If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.

$matches[0] is the complete match (not just the captured bits)
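A small sketch to check this against the markup from the question. It also suggests where the extra index comes from: the matched text ends with "</li>", so explode() appears to return an empty string as its final element. The variable names $parts and $lastLi are only illustrative.

```php
<?php
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

preg_match('#<li(.*?)>(.*)</li>#ims', $markup, $matches);

// $matches[0] already holds the whole matched text, so it does not need
// to be rebuilt from $matches[1] and $matches[2].
$parts = explode("</li>", $matches[0]);

// Because the matched text ends with "</li>", the final element of
// $parts is an empty string; the last <li> sits one position before it.
var_dump(count($parts));
var_dump(end($parts));

$lastLi = $parts[count($parts) - 2] . "</li>";
echo $lastLi, "\n";
```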

Powertieke
A: 

You have to extract the second index because you have 2 capturing groups:

$matches[0]; // Contains the text that matched the full pattern
$matches[1]; // Contains the attributes of the LI start tag (.*?)
$matches[2]; // Contains the string enclosed by the LI tags (.*)

'Parsing' (X)HTML strings with regular expressions is hard and can be full of unexpected problems. Parsing anything more than simply tagged strings is not possible, because (X)HTML is not a regular language.

You could improve your regex by using (not tested):

 #<li([^>]*)>(.+?)</li>#ims
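A sketch of how a pattern along these lines could be used, written with # as the only delimiter so that the / in </li> needs no escaping. Pairing it with preg_match_all() and end() is an assumption about how one would pick the last element, and the names $all and $last are only illustrative. It still assumes flat, non-nested <li> elements.

```php
<?php
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

// PREG_SET_ORDER groups the results per match: each entry of $all is
// an array of [full match, attributes, inner text] for one <li>.
$count = preg_match_all('#<li([^>]*)>(.+?)</li>#ims', $markup, $all, PREG_SET_ORDER);

if ($count) {
    $last = end($all);   // the final match is the last <li> in the string
    echo $last[0], "\n"; // the whole element
    echo $last[1], "\n"; // the attribute text of the start tag
    echo $last[2], "\n"; // the text between the tags
}
```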
Jacco
That explains the weird result. I didn't quite understand the PHP documentation for it. But I'll use the HTML parser, since this is something I'll need to use a few times. Thanks.
JoaoPedro
A: 

strrpos — Find position of last occurrence of a char in a string
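One way this might be applied, as a minimal sketch: find where the last "<li" starts, then look for the first "</li>" after it. It assumes the <li> elements are not nested and that no other tag name beginning with "li" (such as <link>) appears later in the string; $start, $end and $lastLi are illustrative names.

```php
<?php
$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";

// Position of the last "<li" in the string, or false if there is none.
$start = strrpos($markup, '<li');

// Position of the first closing tag at or after that point.
$end = ($start === false) ? false : strpos($markup, '</li>', $start);

if ($start !== false && $end !== false) {
    // Cut from the start tag through the end of the closing tag.
    $lastLi = substr($markup, $start, $end - $start + strlen('</li>'));
    echo $lastLi, "\n";
}
```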

Zanthrax
A: 

@OP, your requirement looks simple, so no need for parsers or regex.

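$markup = "<body><div><li id='first'>One</li><li id='second'>Two</li><li id='third'>Three</li></div></body>";
// A negative limit drops the final piece, here the trailing "</div></body>".
$s = explode("</li>",$markup,-1);
// Split the last remaining piece at ">" to separate the start tag from its text.
$t = explode(">",end($s));
print end($t); // prints only the text of the last <li>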

output

$ php test.php
Three
ghostdog74
+1  A: 

If you already know how to use jQuery, you could also take a look at phpQuery. It's a PHP library that lets you access DOM elements easily, much as you would in jQuery.

Fred