If I had the following text in a string:

<h4>Tom</h4>
<p>One Paragraph</p>
<p>Two Paragraph</p>

What code would I need to run on that string to get output like this, assuming I didn't know what was inside the <h4> tag?

 <p>One Paragraph</p>
 <p>Two Paragraph</p>

Thanks!

+3  A: 

Use stripos to find the start of </h4>. Add the length of </h4> to that offset, then use substr to get all the text after it.

Example:

$str = '....Your string...';
$offset = stripos($str, '</h4>');
if ( $offset === false ){
    //error, end of h4 tag wasn't found
}
$offset += strlen('</h4>');
$newStr = substr($str, $offset);
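The same steps can be wrapped in a small function so the "not found" case is handled explicitly. This is only a sketch: the function name textAfterH4 and the choice to return null on failure are assumptions for illustration, and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns everything after the first closing </h4> tag,
// or null when no such tag is present.
function textAfterH4(string $html): ?string
{
    $needle = '</h4>';
    $offset = stripos($html, $needle); // case-insensitive search
    if ($offset === false) {
        return null; // end of h4 tag wasn't found
    }
    // Skip past the closing tag itself and keep the remainder.
    return substr($html, $offset + strlen($needle));
}

$str = "<h4>Tom</h4>\n<p>One Paragraph</p>\n<p>Two Paragraph</p>";
$newStr = textAfterH4($str);
if ($newStr !== null) {
    echo ltrim($newStr); // ltrim drops the newline left behind after </h4>
}
```

Returning null rather than an empty string lets the caller tell "no heading found" apart from "heading found, nothing after it".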

I should point out that if the HTML gets any more complex, or you don't control the HTML, you may want to use an HTML parser. A parser is much more robust and less likely to fail if it encounters, for example, < /h4 > rather than </h4>. However, in this case it is overkill.

Yacoby
+1 if the task is really that limited, that's the best way to go.
Pekka
+3  A: 

You can use strstr to get the substring starting with </h4> and remove the </h4> with substr:

$needle = '</h4>';
$rest = substr(strstr($string, $needle), strlen($needle));

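Note that since PHP 5.3 strstr also accepts a third parameter, before_needle. Passing true returns the part of the string before the needle rather than the rest, so it gives you the heading, not the paragraphs:

$before = strstr($string, $needle, true); // '<h4>Tom' for the string in the question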

Another way would be to use explode:

list(,$rest) = explode($needle, $string, 2);
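One thing to watch with the explode variant: if the needle is missing, explode returns a single-element array and the list assignment has nothing to put in $rest. A minimal sketch of a guarded version follows; the function name restAfter and the null fallback are assumptions for illustration, and the needle is assumed to be non-empty.

```php
<?php
// Hypothetical helper built on explode(): returns the text after the first
// occurrence of $needle, or null when $needle does not occur.
// Assumes $needle is a non-empty string.
function restAfter(string $string, string $needle): ?string
{
    // A limit of 2 splits on the first occurrence only.
    $parts = explode($needle, $string, 2);
    // explode() yields a single element when the needle is absent.
    return count($parts) === 2 ? $parts[1] : null;
}

$string = "<h4>Tom</h4>\n<p>One Paragraph</p>\n<p>Two Paragraph</p>";
$rest = restAfter($string, '</h4>');
```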
Gumbo
+1 for using strstr
Yacoby
+1  A: 

Unless the sample HTML you've posted is particularly representative of your data set, you may find it easier and more reliable to use an HTML parser such as this one.

HTML is notoriously difficult to parse reliably (technically, impossible) with regular expressions, and the parser will give you a very simple means to find the nodes you're interested in.

If the above HTML is all you're interested in then you can craft an appropriate regexp. For anything more general I'd explore the parser route.
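For comparison, here is roughly what the parser route could look like with PHP's bundled DOM extension. Treat it as a sketch under assumptions: the function name paragraphsAfterHeading is made up for illustration, the DOM extension is assumed to be enabled, the input is assumed to be ASCII or otherwise safe for loadHTML's default encoding handling, and "the paragraphs" is taken to mean the <p> siblings that follow the first <h4>.

```php
<?php
// Hypothetical helper: collects the <p> elements that follow the first <h4>
// as siblings and returns their markup, one per line.
function paragraphsAfterHeading(string $html): string
{
    if ($html === '') {
        return ''; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Suppress libxml complaints about the fragment not being a full document.
    $doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

    $xpath = new DOMXPath($doc);
    $out = '';
    // (//h4)[1] is the first h4 in document order.
    foreach ($xpath->query('(//h4)[1]/following-sibling::p') as $p) {
        $out .= $doc->saveHTML($p) . "\n";
    }
    return $out;
}

$html = "<h4>Tom</h4>\n<p>One Paragraph</p>\n<p>Two Paragraph</p>";
echo paragraphsAfterHeading($html);
```

Because the query works on the parsed tree, it does not depend on how the closing tag is spelled or on what the heading contains.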

Brian Agnew
For all HTML parsing issues I would recommend using a DOM class. There is a good one built in, but it is probably way too powerful for this particular problem. The problem with using something like strstr or the other string routines is that they will fail as soon as the HTML structure changes. +1 for the suggestion to use an HTML parser / DOM class.
TheGrandWazoo
You mean I didn't get the +1 for the Zappa Gravatar?
Brian Agnew