Hi

$str = 'some text <MY_TAG> tag <em>contents </em> </MY_TAG> more text ';

I have two questions:

How can I retrieve the content (tag <em>contents </em>) that sits between <MY_TAG> and </MY_TAG>?

And

How can I remove <MY_TAG> and its contents from $str?

I am using PHP.

Thank you.

+1  A: 

If MY_TAG cannot be nested, try this to get the matches:

preg_match_all('/<MY_TAG>(.*?)<\/MY_TAG>/s', $str, $matches);

To remove them, use preg_replace with the same pattern instead.
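As a minimal sketch of both steps, assuming the sample string from the question and no nested MY_TAG elements (the variable names $inner and $stripped are only illustrative):

```php
<?php
// Sample input taken from the question.
$str = 'some text <MY_TAG> tag <em>contents </em> </MY_TAG> more text ';

// Capture the inner content of every <MY_TAG>...</MY_TAG> pair.
// $matches[0] holds the full matches, $matches[1] the first capture group.
preg_match_all('/<MY_TAG>(.*?)<\/MY_TAG>/s', $str, $matches);
$inner = $matches[1];

// Remove each <MY_TAG> element together with its contents.
// preg_replace returns the new string; it does not modify $str in place.
$stripped = preg_replace('/<MY_TAG>.*?<\/MY_TAG>/s', '', $str);
```

Note that the spaces on both sides of the removed element stay in $stripped, so you may want to normalize whitespace afterwards.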

Gumbo
Hi, what is the /s for? Thanks for the answer.
@user187580: The *s* flag makes the `.` match line breaks. See http://php.net/manual/en/reference.pcre.pattern.modifiers.php
Gumbo
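To see the effect of the s flag, here is a small sketch with a hypothetical two-line input (not from the question); the same pattern is run once without and once with the flag:

```php
<?php
// Hypothetical multi-line input: the tag content spans two lines.
$multiline = "<MY_TAG>line one\nline two</MY_TAG>";

// Without the s flag, '.' does not match "\n", so the pattern cannot
// reach the closing tag and preg_match reports no match.
$withoutS = preg_match('/<MY_TAG>(.*?)<\/MY_TAG>/', $multiline, $m1);

// With the s flag, '.' also matches "\n", so the whole content is captured.
$withS = preg_match('/<MY_TAG>(.*?)<\/MY_TAG>/s', $multiline, $m2);
```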
A: 

Although the only fully correct way to do this is not to use regular expressions, you can get what you want if you accept it won't handle all special cases:

preg_match('/<em[^>]*?>.*?<\/em>/i', $str, $match);
// Use this only if you aren't worried about nested tags.
// It will handle tags with attributes.

And

$str = preg_replace('/<MY_TAG[^>]*?>.*?<\/MY_TAG>/i', '', $str);
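A hedged sketch of how these patterns behave on a tag that carries an attribute and uses mixed case. The input string and the variable names are made up for illustration, and the added s flag and \b word boundary (so that <em does not also match a tag like <embed>) are my own assumptions, not part of the patterns above:

```php
<?php
// Hypothetical input: the tag carries an attribute and uses mixed case.
$html = 'before <My_Tag class="note">tag <em>contents</em></My_Tag> after';

// i: case-insensitive tag names; s: '.' also matches line breaks.
// [^>]* skips over any attributes inside the opening tag.
preg_match('/<em\b[^>]*>(.*?)<\/em>/is', $html, $match);
$emText = $match[1];

// Remove the whole element, attributes and contents included.
$cleaned = preg_replace('/<MY_TAG\b[^>]*>.*?<\/MY_TAG>/is', '', $html);
```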
Renesis
A: 

You do not want to use regular expressions for this. A much better solution would be to load your contents into a DOMDocument and work on it using the DOM tree and standard DOM methods:

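$document = new DOMDocument();
$document->loadXML('<root/>');
// Build a fragment from the raw markup, then attach it under the root element.
$fragment = $document->createDocumentFragment();
$fragment->appendXML($myTextWithTags);
$document->documentElement->appendChild($fragment);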

$MY_TAGs = $document->getElementsByTagName('MY_TAG');
foreach($MY_TAGs as $MY_TAG)
{
    $xmlContent = $document->saveXML($MY_TAG);
    /* work on $xmlContent here */

    /* as a further example: */
    $ems = $MY_TAG->getElementsByTagName('em');
    foreach($ems as $em)
    {
        $emphasizedText = $em->nodeValue;
        /* do your operations here */
    }
}
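
For the second part of the question (removing MY_TAG and its contents), a possible DOM-based sketch follows. It assumes the input is well-formed XML markup, uses the sample string from the question, and the names $nodes and $result are only illustrative:

```php
<?php
// Sample input from the question; assumed to be well-formed XML markup.
$str = 'some text <MY_TAG> tag <em>contents </em> </MY_TAG> more text ';

$document = new DOMDocument();
$document->loadXML('<root/>');
$fragment = $document->createDocumentFragment();
$fragment->appendXML($str);
$document->documentElement->appendChild($fragment);

// Copy the live node list into an array first: removing nodes while
// iterating the live list can skip elements.
$nodes = iterator_to_array($document->getElementsByTagName('MY_TAG'));
foreach ($nodes as $node) {
    $node->parentNode->removeChild($node);
}

// Serialize the remaining children of the wrapper element back to a string.
$result = '';
foreach ($document->documentElement->childNodes as $child) {
    $result .= $document->saveXML($child);
}
```

Unlike the regular-expression approach, this also copes with nested MY_TAG elements, but appendXML will fail on markup that is not well-formed.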
Kris