I'm looking for a regex, or a sequence of matches and replaces (preferably in PHP, but any language will do), that changes the input below into the output. The text at the start and end is just random text that needs to be preserved.

IN:

fkdshfks khh fdsfsk <!--g1--><div class='codetop'>CODE: AutoIt</div><div class='geshimain'><!--eg1--><div class="autoit" style="font-family:monospace;"><span class="kw3">msgbox</span></div><!--gc2--><!--bXNnYm94--><!--egc2--><!--g2--></div><!--eg2--> fdsfdskh

to this:

OUT:

fkdshfks khh fdsfsk <div class='codetop'>CODE: AutoIt</div><div class='geshimain'><div class="autoit" style="font-family:monospace;"><span class="kw3">msgbox</span></div></div> fdsfdskh

Thanks

+10  A: 

Are you just trying to remove the comments? How about

s/<!--[^>]*-->//g

or the slightly better non-greedy pattern (suggested by the questioner himself):

<!--(.*?)-->

But remember, HTML is not a regular language, so using regular expressions to parse it will lead you into a world of hurt when somebody throws bizarre edge cases at it.
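
As a quick check that stripping comments is really all the question needs, here is a minimal sketch in PHP that runs the non-greedy pattern over the question's IN sample and compares the result with the OUT sample. The variable names are illustrative only, and the `s` modifier is an assumption added so that `.` would also cross newlines.

```php
<?php
// The question's IN sample, held in a nowdoc so both quote styles survive verbatim.
$in = <<<'HTML'
fkdshfks khh fdsfsk <!--g1--><div class='codetop'>CODE: AutoIt</div><div class='geshimain'><!--eg1--><div class="autoit" style="font-family:monospace;"><span class="kw3">msgbox</span></div><!--gc2--><!--bXNnYm94--><!--egc2--><!--g2--></div><!--eg2--> fdsfdskh
HTML;

// The question's OUT sample.
$expected = <<<'HTML'
fkdshfks khh fdsfsk <div class='codetop'>CODE: AutoIt</div><div class='geshimain'><div class="autoit" style="font-family:monospace;"><span class="kw3">msgbox</span></div></div> fdsfdskh
HTML;

// Non-greedy match: stop at the first "-->" that follows each "<!--".
$out = preg_replace('/<!--.*?-->/s', '', $in);

// Reports whether the stripped input is identical to the expected output.
var_dump($out === $expected);
```

On this particular sample the `[^>]*` variant gives the same result, because none of the comments contain a `>` character.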

Paul Tomblin
No, I want to make the IN become the OUT, exactly how it is.
James Brooks
I don't see any differences other than the comments. Are you going to make us guess?
Paul Tomblin
@James Brooks, the only difference between the IN and the OUT is that IN has comments and OUT does not. So what else do you want besides stripping the comments?
Gamecat
Sorry, I must have been having a brain block day. This was exactly right.
James Brooks
+1  A: 

Ah, I've done it:

<!--(.*?)-->
James Brooks
Yeah, thanks now it all makes sense! </sarcasm>
shylent
That's not as good as mine.
Paul Tomblin
@Paul: It's actually better, because > not preceded by -- doesn't end an HTML comment. The important bit that changed was using a non-greedy, or shortest, match.
Novelocrat
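
To illustrate the point about `>` and non-greedy matching, here is a small PHP sketch comparing three patterns. The input string is hypothetical (it is not from the question) and was chosen so that the first comment contains a `>`.

```php
<?php
// Hypothetical input: the first comment contains a ">" character.
$s = 'a<!-- x > y -->b<!-- z -->c';

// [^>]* cannot cross the ">" inside the first comment, so that comment survives.
$charClass = preg_replace('/<!--[^>]*-->/', '', $s);

// Greedy .* runs from the first "<!--" to the last "-->", deleting the "b" in between.
$greedy = preg_replace('/<!--.*-->/', '', $s);

// Non-greedy .*? stops at the first "-->", so each comment is removed on its own.
$lazy = preg_replace('/<!--.*?-->/', '', $s);

var_dump($charClass, $greedy, $lazy);
```

Only the non-greedy pattern leaves `abc`. The character-class version keeps the first comment in place, and the greedy version removes real content along with the comments.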
Oh right, I forgot that .*? was non-greedy. Still kind of cheesy to ask a horribly vague question, complain when somebody answers with a regex that strips the comments, and then post your own "strip the comment" answer.
Paul Tomblin
Why all this downvoting? The question was badly written, but this answer seems fine.
cube
Downvoting is a bit mean... Thanks.
James Brooks
@cube: because an equivalent answer had already been posted and was rejected by the asker for no good reason.
Konrad Rudolph
I'm giving him an upvote because in spite of his stubborn refusal to explain what he wanted and his refusal to explain what he didn't like about my answer, when you come right down to it, his answer is a tiny bit better than mine.
Paul Tomblin
A: 
$html = preg_replace('/<!--(.*)-->/Uis', '', $html);

This PHP code removes every HTML comment from the $html string. The U modifier makes .* non-greedy, and the s modifier lets . match newlines.
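
A short sketch of what the modifiers contribute, using a hypothetical input whose first comment spans two lines. The variable names are illustrative only.

```php
<?php
// Hypothetical input: the first comment spans two lines.
$html = "a<!-- line 1\nline 2 -->b<!-- c -->d";

// Without the s modifier, "." does not match "\n", so the multi-line comment survives.
$noDotAll = preg_replace('/<!--.*?-->/', '', $html);

// U flips the default greediness, so (.*) behaves like (.*?); s lets "." match newlines.
$withFlags = preg_replace('/<!--(.*)-->/Uis', '', $html);

// The same behaviour written without U, using an explicit non-greedy quantifier.
$explicitLazy = preg_replace('/<!--.*?-->/s', '', $html);

var_dump($noDotAll, $withFlags, $explicitLazy);
```

Both flagged patterns reduce the input to `abd`, while the pattern without `s` leaves the two-line comment untouched. Note that `preg_replace` returns the new string rather than modifying its argument, so the result has to be assigned somewhere.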

Benoa