I want to convert

<p>Code is following</p>
<pre>
&lt;html&gt;<br>&lt;/html&gt;
</pre>

to

<p>Code is following</p>
<pre>
&lt;html&gt;
&lt;/html&gt;
</pre>

I don't know how to write a regular expression in PHP that replaces <br> tags only between <pre> tags.

I tried the code from http://stackoverflow.com/questions/1517102/replace-newlines-with-br-tags-but-only-inside-pre-tags

but it's not working for me.

A: 

You shouldn't use regex to match HTML tags: HTML is not a regular language, so a regex cannot handle it correctly in the general case.

There are some PHP libraries for HTML parsing out there; a quick Google search turned up this one: http://simplehtmldom.sourceforge.net/

Use a parser to get the content between the "pre" tags, then run a simple regex on that.
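
As a rough sketch of that idea, the example below uses PHP's bundled DOM extension (DOMDocument and DOMXPath) rather than the simplehtmldom library linked above; that is a substitution. The function name preBrToNewlineDom and the wrapper div with id "wrap" are hypothetical, chosen only for illustration. It replaces the <br> elements directly in the tree, so no regex is needed at all.

```php
<?php
// Hypothetical helper: replaces <br> elements inside <pre> with "\n".
// Uses PHP's DOM extension, not simplehtmldom (an assumption of this sketch).
function preBrToNewlineDom($html)
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    // Wrap the fragment so its contents can be pulled back out afterwards.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Copy the matches into an array before modifying the tree.
    foreach (iterator_to_array($xpath->query('//pre//br')) as $br) {
        $br->parentNode->replaceChild($doc->createTextNode("\n"), $br);
    }

    // Serialize only the children of the wrapper, not the whole document.
    $out = '';
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Keep in mind that the parser may normalize the markup it outputs, and that loadHTML treats input without a declared charset as ISO-8859-1, so non-ASCII content needs extra care.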

SchlaWiener
Come on. That simply isn't true. For something as simple as replacing a <br> tag, a quick bit of regex is just what the doctor ordered. He is not *parsing* HTML, he is just replacing a <br> tag with a new line! I challenge you to find an HTML file where you can't reliably replace <br> by a new line with regex alone.
Sylverdrag
Since the task is to replace <br> only in <pre> tags, I assume that the content comes from user input. You don't know what text the user will enter, so you cannot reliably design a regex that works 100% of the time; e.g. the user could insert some <pre> tags himself. Also have a look here: http://stackoverflow.com/questions/2171817/qt-regex-matches-html-tag-innertext
SchlaWiener
(?<=<pre[^>]*?>(?!</?pre>).*)(<br\s*/?>) - Now, show me what kind of wacko HTML will prevent me from matching <br> inside a <pre> tag with this regex. Don't tell me, show me. This expression will not break in many cases that would throw even the most flexible HTML parsers off. I keep reading the claim that regex can't work on HTML, but no evidence ever seems to be provided. Just because you can't PARSE HTML with regex doesn't mean you can't safely extract bits and pieces. If you look inside the source code of the .NET HTML parser library...it's full of regex!
Sylverdrag
I don't want to debate about this forever, but try your regex with this snippet: http://pastebin.com/nRX4rYEG Without knowing the rules of the language there are always some backdoors. A regex that works 99.9% of the time is often a good choice (and I often do it myself; for instance, if I want to match IP addresses I sometimes use (\d+\.){3}\d+ because that gets the job done even if it's wrong). But on the web you can't trust the input users provide, and a large number of XSS exploits happen because developers don't check user input properly (partly with wrong regex statements).
SchlaWiener
Oops, I forgot a non-capturing group. It should have been (?<=<pre[^>]*?>(?:(?!</?pre>).)*)(<br\s*/?>). That would work, except for the bit in the HTML comment (since the goal is to fix the display inside <pre> elements, it wouldn't matter). In the present context, replacing a <br> with a line break, regex is a pretty safe and efficient way to do the job. The consequences of missing a <br> or replacing one <br> too many are pretty much nil. Anyway, I think we agree on the main points. There is a point past which using regex is more trouble than it's worth.
Sylverdrag
+3  A: 

Which answer are you using code from?

Assuming it was the accepted answer, just reverse the preg_replace() line as follows;

$parts[$idx] = preg_replace('#<br\s*/?>#', "\n", $part);
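
For context, here is a minimal sketch of the whole split-and-replace approach that this line slots into. It assumes the linked answer splits the input on <pre> tags and walks the resulting parts; the exact code there may differ. The function name preBrToNewline and the $insidePre flag are hypothetical.

```php
<?php
// Hypothetical helper; the splitting strategy is an assumption about the linked answer.
function preBrToNewline($html)
{
    // Split on <pre ...> and </pre>, keeping the delimiters in the result.
    $parts = preg_split('#(<pre\b[^>]*>|</pre>)#i', $html, -1, PREG_SPLIT_DELIM_CAPTURE);

    $insidePre = false;
    foreach ($parts as $idx => $part) {
        if (preg_match('#^<pre\b#i', $part)) {
            $insidePre = true;              // opening tag: following text is inside <pre>
        } elseif (strcasecmp($part, '</pre>') === 0) {
            $insidePre = false;             // closing tag: back to normal markup
        } elseif ($insidePre) {
            // Only text between <pre> and </pre> gets its <br> tags replaced.
            $parts[$idx] = preg_replace('#<br\s*/?>#i', "\n", $part);
        }
    }
    return implode('', $parts);
}
```

Called on the markup from the question, this should turn the <br> inside the <pre> block into a newline and leave any <br> outside it alone. Note the double quotes around "\n": in single quotes PHP would insert a literal backslash followed by n.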
TheDeadMedic
A: 

Try this:

$fixBr = function ($m) { return preg_replace('@<br\s*/?>@i', "\n", $m[0]); };
$newtext = preg_replace_callback('@<pre\b[^>]*>.*?</pre>@si', $fixBr, $text);
turbod
+1  A: 

Oh my God. Here we go again.

Have you tried using an XML parser instead?

Time Machine
This is not an answer. You may have some valid points in there, though.
polygenelubricants
Why do you think I made it Community Wiki?
Time Machine
What's with the cool text effects at the bottom?
Pez Cuckow
Instead of copying and pasting an entire answer from elsewhere without attribution, just use a link.
Bill the Lizard
A: 
// Rewrite each <pre>...</pre> block; <br> tags elsewhere are left alone.
$str = preg_replace_callback('/<pre\b[^>]*>.*?<\/pre>/is', function ($m) {
    return preg_replace('/<br\s*\/?>/i', "\n", $m[0]);
}, $str);

Works just the same. Replaces <br>, <br/>, <br /> only when found inside <pre>...</pre>

N. Lucas