Hey so what I want to do is snag the content for the first paragraph. The string $blog_post contains a lot of paragraphs in the format:

<p>Paragraph 1</p><p>Paragraph 2</p><p>Paragraph 3</p>

The problem I'm running into is that I am writing a regex to grab everything between the first <p> tag and the first closing </p> tag. However, it matches from the first <p> tag to the LAST closing </p> tag, so I end up grabbing everything.

Here is my current code:

if (preg_match("/[\\s]*<p>[\\s]*(?<firstparagraph>[\\s\\S]+)[\\s]*<\\/p>[\\s\\S]*/",$blog_post,$blog_paragraph))
     echo "<p>" . $blog_paragraph["firstparagraph"] . "</p>";
else
     echo $blog_post;
A: 

It would probably be easier and faster to use strpos() to find the position of the first <p> and the first </p>, then use substr() to extract the paragraph.

 $paragraph_start = strpos($blog_post, '<p>');
 $paragraph_end = strpos($blog_post, '</p>', $paragraph_start);
 $paragraph = substr($blog_post, $paragraph_start + strlen('<p>'), $paragraph_end - $paragraph_start - strlen('<p>'));
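
The snippet above assumes both tags are present; strpos() returns false when they are not, and that false would silently be treated as offset 0. A minimal sketch of the same idea with those cases guarded might look like the following. The function name first_paragraph_substr is hypothetical (not from the answer), and the type declarations assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (name is an illustration, not from the original answer):
// returns the text inside the first <p>...</p> pair, or null if there is
// no complete pair in the input.
function first_paragraph_substr(string $html): ?string
{
    $open = '<p>';
    $close = '</p>';

    $start = strpos($html, $open);
    if ($start === false) {
        return null; // no opening tag at all
    }
    $start += strlen($open); // skip past the opening tag itself

    // Search for the closing tag only after the opening tag.
    $end = strpos($html, $close, $start);
    if ($end === false) {
        return null; // opening tag is never closed
    }

    return substr($html, $start, $end - $start);
}

$blog_post = '<p>Paragraph 1</p><p>Paragraph 2</p><p>Paragraph 3</p>';
echo first_paragraph_substr($blog_post), "\n"; // prints "Paragraph 1"
```

Note that this only matches a bare <p>; a tag with attributes such as <p class="intro"> would not be found.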

Edit: Actually the regex in others' answers will be easier and faster... your big complex regex in the question confused me...

yjerem
+13  A: 

Well, sysrqb will let you match anything in the first paragraph, assuming there's no other HTML inside the paragraph. You might want something more like this:

<p>.*?</p>

Placing the ? after your * makes it non-greedy, meaning it will match as little text as possible before reaching the first </p>.
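
To see the difference with the sample string from the question, here is a small sketch using preg_match(). The capture group, the # delimiter, and the s modifier are additions for illustration and not part of the pattern above; the s modifier is there only in case a paragraph spans several lines.

```php
<?php
$blog_post = '<p>Paragraph 1</p><p>Paragraph 2</p><p>Paragraph 3</p>';

// '#' is used as the delimiter so the '/' in '</p>' needs no escaping.
// Greedy: .* runs to the end and backtracks to the LAST </p>.
preg_match('#<p>(.*)</p>#s', $blog_post, $greedy);

// Non-greedy: .*? stops at the FIRST </p> it can reach.
if (preg_match('#<p>(.*?)</p>#s', $blog_post, $matches)) {
    $first_paragraph = $matches[1];
} else {
    $first_paragraph = $blog_post; // same fallback as the question's else branch
}

echo $greedy[1], "\n";        // prints "Paragraph 1</p><p>Paragraph 2</p><p>Paragraph 3"
echo $first_paragraph, "\n";  // prints "Paragraph 1"
```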

Kibbee
+2  A: 

If you use preg_match, use the U flag to make it un-greedy.

preg_match("/<p>(.*)<\/p>/U", $blog_post, $matches);

$matches[1] will then contain the first paragraph.
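
One thing to keep in mind with the U flag is that it inverts greediness for the whole pattern, so a quantifier followed by ? becomes greedy again. A short sketch of that behaviour, using a shortened sample string and # as the delimiter (both are choices made for this illustration):

```php
<?php
$blog_post = '<p>Paragraph 1</p><p>Paragraph 2</p>';

// With U, a plain quantifier is lazy...
preg_match('#<p>(.*)</p>#U', $blog_post, $lazy);

// ...and adding ? flips it back to greedy.
preg_match('#<p>(.*?)</p>#U', $blog_post, $greedy);

echo $lazy[1], "\n";   // prints "Paragraph 1"
echo $greedy[1], "\n"; // prints "Paragraph 1</p><p>Paragraph 2"
```

So it is probably best to pick either the U flag or the ? suffix, not both.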

Erik Öjebo