Hi all,

I need help with regular expressions (preg_match), as I am not very experienced with them yet. Here is my problem.

I need to extract the value "get me", but I think my function has an error. The number of HTML tags is dynamic: the string can contain many nested HTML tags, such as a bold tag. The "get me" value is also dynamic.

<?php
function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname>(.*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>

Please help me thanks :)

+1  A: 
<?php
function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>

That should do the trick

takete.dk
The opening tag should be matched using <$tagname.*?> or <$tagname[^>]*>, not <$tagname ?.*>. As it is, it's greedy and will match a lot further than you hoped if there's more than one closing tag in the string.
Samir Talwar
This one worked. Thanks a lot takete.dk :D
marknt15
@Samir: Thanks for the tip, I will try it.
marknt15
Note that attribute values may contain a plain `>`.
Gumbo
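
The fix suggested in the comments above can be sketched roughly as follows. This is only an illustration, not the answerer's code: the function name `getTextBetweenTagsLazy` is made up here, and the sketch assumes that attribute values never contain a `>`.

```php
<?php
// Hypothetical variant of getTextBetweenTags() using the non-greedy forms
// suggested in the comments. Assumes attribute values contain no ">".
function getTextBetweenTagsLazy($string, $tagname) {
    $tag = preg_quote($tagname, '/');
    // \b keeps "font" from matching "<fontface>"; [^>]* skips the attributes;
    // (.*?) is lazy, so it stops at the first closing tag.
    $pattern = "/<$tag\\b[^>]*>(.*?)<\\/$tag>/s";
    if (preg_match($pattern, $string, $matches)) {
        return $matches[1];
    }
    return null; // no such tag in the string
}

$str = '<p><font size="10">get me</font> and <font size="12">not me</font></p>';
echo getTextBetweenTagsLazy($str, "font"), "\n"; // get me
```

With two `font` elements in the string, the lazy quantifiers keep the match inside the first element instead of running on to the last closing tag.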
A: 

In your pattern, you simply want to match all text between the two tags. For example, you could use [\w\W], which matches every character, including newlines (which . does not match by default).

function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname>([\w\W]*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}
Tomas Lycken
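
To illustrate the difference between `.` and `[\w\W]` mentioned in the answer above, here is a small sketch. The sample string `$html` is made up for the demonstration; it uses a `font` tag without attributes so that the answer's pattern applies as written.

```php
<?php
// Made-up sample: the text between the tags spans two lines.
$html = "<font>get\nme</font>";

// Without the s modifier, "." stops at the newline, so nothing matches.
var_dump(preg_match('/<font>(.*?)<\/font>/', $html)); // int(0)

// [\w\W] matches any character, including newlines.
preg_match('/<font>([\w\W]*?)<\/font>/', $html, $m1);

// The s (PCRE_DOTALL) modifier makes "." behave the same way.
preg_match('/<font>(.*?)<\/font>/s', $html, $m2);

var_dump($m1[1] === $m2[1]); // bool(true)
```

Either form works for multi-line content; the `s` modifier is usually the more readable of the two.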
A: 

The following PHP snippet returns the text between two tags/elements.

A regex of the form "/tagname(.*)endtag/" captures the text between the tags.

For example:

$regex = '/\[start_tag_name\](.*?)\[\/end_tag_name\]/';
$content = "[start_tag_name]SOME TEXT[/end_tag_name]";
// Square brackets must be escaped, otherwise they form a character class.
preg_match($regex, $content, $matches);
echo $matches[1];

It will print "SOME TEXT".

Regards,

Web-Farmer @letsnurture.com

A: 

excellent. Nice post. [link text][1]

[1]: http://mycodings.blogspot.com . I'll post this one with little modification to my blog.

Sathish Kumar
A: 

Since attribute values may contain a plain > character, try this regular expression:

$pattern = '/<'.preg_quote($tagname, '/').'(?:[^"\'>]*|"[^"]*"|\'[^\']*\')*>(.*?)<\/'.preg_quote($tagname, '/').'>/s';

But regular expressions are not suitable for parsing non-regular languages like HTML. You would be better off using a parser such as SimpleXML or DOMDocument.
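
A minimal sketch of the DOMDocument approach might look like this. The helper name `getTextByTagName` is hypothetical, the sample string is adapted from the question (with an added `title` attribute containing a `>` and a nested bold tag), and the sketch assumes the DOM extension is enabled, as it is in most PHP builds.

```php
<?php
// Hypothetical helper: returns the text content of the first <$tagname>
// element, or null if the document has no such element.
function getTextByTagName($html, $tagname) {
    $doc = new DOMDocument();
    // Collect parser warnings (e.g. for the non-standard <textformat> tag)
    // instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = $doc->getElementsByTagName($tagname);
    if ($nodes->length === 0) {
        return null;
    }
    // textContent also includes the text of nested elements such as <b>.
    return $nodes->item(0)->textContent;
}

$str = '<textformat leading="2"><p align="left"><font size="10" title="a > b">get <b>me</b></font></p></textformat>';
echo getTextByTagName($str, "font"), "\n";
```

Because the parser understands quoted attribute values and nesting, neither the `>` inside the attribute nor the nested bold tag needs special handling.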

Gumbo