Hi there, I am hoping that someone can help me. I am not exactly sure how to adapt the following regex. I am using classic ASP with JavaScript.

completehtml = completehtml.replace(/\<\!-- start-code-remove --\>.*?\<\!-- start-code-end --\>/ig, '');

I have this code to remove everything between

<!-- start-code-remove --> and <!-- start-code-end -->

It works perfectly until there are line breaks in the content between the start and end markers...

How would I write the regex so that it removes everything between the start and end markers even if there are line breaks?

Thanks a million for responding...

Should I use the \n and \s characters? I am not 100% sure...

(/\<\!-- start-code-remove --\>\s\n.*?\s\n\<\!-- start-code-end --\>/ig, '');

Also, the match should not be greedy between <!-- start-code-remove --> and <!-- start-code-end -->, and it should capture the values in groups...

There could be 3 or more of these sets...

+1  A: 

Try (.|\n|\r)*.

completehtml = completehtml.replace(/\<\!-- start-code-remove --\>(.|\n|\r)*?\<\!-- start-code-end --\>/ig, '');
RightSaidFred
Wouldn't `[.\n\r]*` be better?
Billy ONeal
@Billy, Tried that. Didn't work for me for some reason.
RightSaidFred
@RightSaidFred: Ah.. now I see why. `.` isn't recognized as a metacharacter inside character classes.
Billy ONeal
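A small sketch of the point made in the comments above, using a made-up sample string: outside a character class the dot is a wildcard (minus line terminators), while inside one it is just a literal dot, which is why `[.\n\r]*` cannot stand in for `(.|\n|\r)*`.

```javascript
// Hypothetical sample: a letter, a literal dot, a letter, a newline, a letter.
var sample = "a.b\nc";

// Outside a class, `.` matches any character except line terminators.
var outside = sample.match(/./g);       // "a", ".", "b", "c"

// Inside a class, `.` is a literal dot, so this matches only ".", "\n" or "\r".
var inside = sample.match(/[.\n\r]/g);  // ".", "\n"
```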
Hi RightSaidFred, thank you very much for the answer that you have provided... I am more familiar with the [\s\S] solution below and chose to go with that answer...
Gerald Ferreira
@Gerald - You're welcome. `[\s\S]` looks like a good answer.
RightSaidFred
+2  A: 

Source

There is indeed no /s modifier to make the dot match all characters, including line breaks. To match absolutely any character, you can use a character class that contains a shorthand class and its negated version, such as [\s\S].

Billy ONeal
Thanks Billy, I appreciate your answer... I went with the answer below, since it had the whole code written out, but I really do appreciate the input.
Gerald Ferreira
+1  A: 

The dot doesn't match newlines in JavaScript, nor is there a modifier to make it do that (unlike in most modern regex engines). A common workaround is to use this character class in place of the dot: [\s\S]. So your regex becomes:

completehtml = completehtml.replace(
    /\<\!-- start-code-remove --\>[\s\S]*?\<\!-- start-code-end --\>/ig, '');
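
As a rough illustration of how this behaves with several blocks, here is a sketch on made-up input. The sample HTML, the `removed` array, and the capturing group are assumptions added for the demo, and it assumes the engine accepts a function as the second argument to `replace`. Engines that implement ES2018 also accept an `s` (dotAll) flag; the thread does not establish that the classic ASP engine does, so `[\s\S]` is the safer spelling here.

```javascript
// Hypothetical input with two removable blocks, one using \n and one using \r\n.
var completehtml =
    "<p>keep 1</p>\n" +
    "<!-- start-code-remove -->\n<p>drop A</p>\n<!-- start-code-end -->\n" +
    "<p>keep 2</p>\n" +
    "<!-- start-code-remove -->\r\n<p>drop B</p>\r\n<!-- start-code-end -->\n" +
    "<p>keep 3</p>";

// The group around the lazy [\s\S]*? captures each block's content.
var pattern = /<!-- start-code-remove -->([\s\S]*?)<!-- start-code-end -->/ig;
var removed = [];

var cleaned = completehtml.replace(pattern, function (match, inner) {
    removed.push(inner);  // text between the two markers
    return "";            // drop the whole block, markers included
});

// For contrast: a greedy [\s\S]* runs from the first start marker to the
// last end marker, so "keep 2" is lost as well.
var greedy = completehtml.replace(
    /<!-- start-code-remove -->[\s\S]*<!-- start-code-end -->/i, "");
```

With the lazy quantifier each block is matched on its own, so the text between blocks survives; the greedy variant shows why the `?` matters when there are three or more sets.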
Aillyn
A: 

Regex support in JavaScript is not very reliable across engines and versions. A split/join alternative:

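function remove_tag_from_text(text, begin_tag, end_tag) {
    var tmp = text.split(begin_tag);
    // Each pass removes the first begin_tag ... end_tag block.
    while (tmp.length > 1) {
        // Text before the first begin_tag.
        var before = tmp.shift();
        // Everything after that begin_tag, cut at each end_tag.
        var after = tmp.join(begin_tag).split(end_tag);
        // Drop the block's content (the piece up to the first end_tag).
        after.shift();
        // Stitch the remainder back together and look for the next block.
        text = before + after.join(end_tag);
        tmp = text.split(begin_tag);
    }
    return text;
}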
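
One thing to watch with the function above: if a begin_tag has no end_tag after it, `after.shift()` discards everything that follows, so the rest of the text is lost. A possible variant that scans with `indexOf` and leaves an unclosed block untouched is sketched below. `removeBetween` is a hypothetical name, not something from the original answer, and unlike the `/i` regex this comparison is case-sensitive.

```javascript
// Hypothetical helper: removes every begin_tag ... end_tag block using indexOf.
function removeBetween(text, begin_tag, end_tag) {
    var out = "";
    var pos = 0;
    while (true) {
        var start = text.indexOf(begin_tag, pos);
        if (start === -1) break;                 // no more blocks
        var end = text.indexOf(end_tag, start + begin_tag.length);
        if (end === -1) break;                   // unclosed block: keep the rest as-is
        out += text.substring(pos, start);       // copy the text before the block
        pos = end + end_tag.length;              // skip past the end marker
    }
    return out + text.substring(pos);            // copy whatever is left
}

var result = removeBetween(
    "a<!-- start-code-remove -->x\ny<!-- start-code-end -->b",
    "<!-- start-code-remove -->",
    "<!-- start-code-end -->");
```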
Paulo Scardine
Errr.. I don't know what exactly is unreliable about it. More importantly, I don't know why you would implement this kind of thing in terms of splits and joins -- that's horrendously bad.
Billy ONeal
Hi Paulo, thanks for the feedback. My code is properly formatted, so I am not afraid that the JavaScript may break, but thanks for the tip.
Gerald Ferreira
@Billy: Not all RegExp object properties are supported across the different browser brands and versions. Split/join tricks are ugly but perform very well.
Paulo Scardine