Hello, I need to pull the content out of two paragraph tags and separate it with a <br /> tag. The input looks like this:

<p>
Yay
</p>
<p>
StackOverFlow
</p>

It needs to be like

<p>
Yay <br />
StackOverFlow
</p>

What I have so far is:

<p><?php preg_match('/<p>(.*)<\/p>/', $content, $match); echo($match[1])."..."; ?></p>

which pulls the first paragraph tag only:

<p>
Yay...
</p>

Also, is it possible to set a character limit, for example a maximum of 40 characters across both paragraphs, or would I have to use substr?

Thanks!

So it turned out to be:

<?php $content = preg_replace('/<\/p>\s*<p>/', '<br/>', $content); echo substr($content, 0, 180) . "..."; ?>
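
One caveat with calling substr on the merged markup is that the limit also counts the tag characters, and a cut can land inside a tag. The following is only a sketch of one way around that, not the asker's code: the function name excerpt is hypothetical, and it assumes each paragraph holds a single line of plain text. It truncates the visible text first and rebuilds the markup afterwards.

```php
<?php
// Hypothetical helper: builds a single <p> excerpt limited to $limit bytes of visible text.
function excerpt(string $html, int $limit): string
{
    // Turn each paragraph boundary (with surrounding whitespace) into one newline.
    $text = preg_replace('/\s*<\/p>\s*<p>\s*/', "\n", $html);
    // Drop the remaining tags and decode entities so they count as one character each.
    $text = html_entity_decode(trim(strip_tags($text)));
    // Append "..." only when something was actually cut off.
    if (strlen($text) > $limit) {
        $text = rtrim(substr($text, 0, $limit)) . '...';
    }
    // Re-escape the text, then restore the paragraph boundaries as <br />.
    return '<p>' . str_replace("\n", '<br />', htmlspecialchars($text)) . '</p>';
}

$content = "<p>\nYay\n</p>\n<p>\nStackOverFlow\n</p>";
echo excerpt($content, 40), "\n"; // prints <p>Yay<br />StackOverFlow</p>
echo excerpt($content, 8), "\n";  // prints <p>Yay<br />Stac...</p>
```

The newline that stands in for the paragraph break counts toward the limit here. substr works on bytes, so for multibyte text mb_substr and mb_strlen (if the mbstring extension is available) would be the closer fit.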
A: 

My advice: regex can only go so far. See one of my posts here: http://stackoverflow.com/questions/1236915/extracting-text-fragment-from-a-html-body-in-net

It includes a regex for string truncation too.

o.k.w
+6  A: 

Do yourself a favor and use an HTML parser (DOMDocument::loadHTML, for example). It's easier and less fragile.
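
As a rough illustration of the parser route, here is a minimal sketch using DOMDocument. The helper name join_paragraphs is hypothetical, and the sketch assumes the paragraphs contain plain text only (textContent drops any inline markup inside them).

```php
<?php
// Hypothetical helper: collects the text of every <p> and joins the pieces with <br />.
function join_paragraphs(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them; restore the old setting after.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $parts = [];
    foreach ($doc->getElementsByTagName('p') as $p) {
        // textContent is the paragraph's text without tags; escape it again for output.
        $parts[] = htmlspecialchars(trim($p->textContent));
    }
    return '<p>' . implode('<br />', $parts) . '</p>';
}

echo join_paragraphs("<p>\nYay\n</p>\n<p>\nStackOverFlow\n</p>");
// prints <p>Yay<br />StackOverFlow</p>
```

Because the parser finds the paragraphs itself, this version does not care about whitespace between the tags or attributes on them. For non-ASCII input, loadHTML may need an explicit encoding declaration in the markup to read the text correctly.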

Lukáš Lalinský
+4  A: 

I think you're making it more complicated than it needs to be. Given that you want to collapse:

<p>Yay</p><p>StackOverFlow</p>

into:

<p>Yay<br />StackOverFlow</p>

Then just replace each instance of </p><p> with <br/>: preg_replace('/<\/p>\s*<p>/', '<br/>', $input).
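
To make the substitution concrete, here is a small sketch run against the multi-line input from the question. The wrapper name merge_paragraphs is hypothetical; the pattern is the one given above.

```php
<?php
// Hypothetical wrapper around the substitution: merges adjacent <p> blocks into one.
function merge_paragraphs(string $html): string
{
    // \s* allows any whitespace, including newlines, between </p> and <p>.
    return preg_replace('/<\/p>\s*<p>/', '<br/>', $html);
}

$input = "<p>\nYay\n</p>\n<p>\nStackOverFlow\n</p>";
echo merge_paragraphs($input);
// prints:
// <p>
// Yay
// <br/>
// StackOverFlow
// </p>
```

The pattern only matches a bare, lowercase <p>, so an opening tag with attributes (such as <p class="x">) would be left untouched.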


In general, however, note that use of regular expressions for this kind of complex parsing is fraught with peril. More succinctly:

"Some people, when faced with a problem, think, 'I know, I'll use regular expressions.' Now they have two problems." -- Jamie Zawinski

John Feminella
Maybe that should be `</p>\s*<p>`; it looks like there might be a newline between them.
Kip
See what I meant about being fraught with peril? ;) Thanks for the catch, Kip.
John Feminella
This assumes he wants to replace EVERY </p><p> with a <br>. Is that the case?
Jay
For that particular instance, yes. Since it is a small chunk of content, that's perfectly feasible.
HelpAppreciated