ansaurus

Question

Regular expression replacing only if contained within a regular expression match?

Answer 1

+3 A:

Given the text:

[*] test1

[list]
[*] test2
[*] test3
[*] test4
[/list]

[*] test5

the regex:

\[\*]\s*([^\r\n]+)(?=((?!\[list])[\s\S])*\[/list])

matches only [*] test2, [*] test3 and [*] test4. But if [list] blocks can be nested, or a broader subset of a BB-like language needs to be parsed, I would opt for a proper parser.
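To make the nesting caveat concrete, here is a minimal sketch (the helper name matchedItems and the nested sample input are my own, not part of the question) that lists which items the regex picks up when one [list] sits inside another. Because the lookahead gives up as soon as it sees another [list], an outer item that precedes the inner list is skipped:

```javascript
// Hypothetical helper: returns the captured text of every [*] item the regex matches.
function matchedItems(text) {
    var re = /\[\*]\s*([^\r\n]+)(?=((?!\[list])[\s\S])*\[\/list])/g;
    var items = [];
    var m;
    while ((m = re.exec(text)) !== null) {
        items.push(m[1]); // group 1 is the rest of the line after [*]
    }
    return items;
}

// Assumed input: one list nested inside another.
var nested = "[list]\n" +
    "[*] outer1\n" +
    "[list]\n" +
    "[*] inner\n" +
    "[/list]\n" +
    "[*] outer2\n" +
    "[/list]";

console.log(matchedItems(nested)); // "outer1" is missing from the result
```

So the regex is only safe when lists are known to be flat.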

To do the replacements, replace the regex I suggested with:

<li>$1</li>

and then replace [list] with <ul> and [/list] with </ul> (assuming [list] and [/list] are only used for lists and are not present in comments or string literals or something).

When running the following snippet:

var text = "[*] test1\n"+
    "\n"+
    "[list]\n"+
    "[*] test2\n"+
    "[*] test3\n"+
    "[*] test4\n"+
    "[/list]\n"+
    "\n"+
    "[*] test5\n"+
    "\n"+
    "[list]\n"+
    "[*] test6\n"+
    "[*] test7\n"+
    "[/list]\n"+
    "\n"+
    "[*] test8";

print(text + "\n============================");
text = text.replace(/\[\*]\s*([^\r\n]+)(?=((?!\[list])[\s\S])*\[\/list])/g, "<li>$1</li>");
text = text.replace(/\[list]/g, "<ul>");
text = text.replace(/\[\/list]/g, "</ul>");
print(text);

the following is printed:

[*] test1

[list]
[*] test2
[*] test3
[*] test4
[/list]

[*] test5

[list]
[*] test6
[*] test7
[/list]

[*] test8
============================
[*] test1

<ul>
<li>test2</li>
<li>test3</li>
<li>test4</li>
</ul>

[*] test5

<ul>
<li>test6</li>
<li>test7</li>
</ul>

[*] test8

A small explanation might be in order:

\[\*]\s* matches the substring [*] followed by zero or more whitespace characters;
([^\r\n]+) gobbles up the rest of the line and saves it in match group 1;
(?=((?!\[list])[\s\S])*\[/list]) is a lookahead that ensures a substring [/list] lies ahead of match group 1 without a [list] being encountered first.
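The same breakdown can be sketched in code by assembling the pattern from its three parts. The variable names below are my own, and String.raw is used only so the backslashes do not need doubling; the resulting pattern is the one explained above:

```javascript
// The three parts, kept as raw strings so the backslashes survive unchanged.
var marker = String.raw`\[\*]\s*`;                            // "[*]" plus optional whitespace
var restOfLine = String.raw`([^\r\n]+)`;                      // rest of the line, match group 1
var lookahead = String.raw`(?=((?!\[list])[\s\S])*\[/list])`; // a [/list] ahead, with no [list] before it

var itemInList = new RegExp(marker + restOfLine + lookahead, "g");

var sample = "[*] test1\n" +
    "\n" +
    "[list]\n" +
    "[*] test2\n" +
    "[*] test3\n" +
    "[*] test4\n" +
    "[/list]\n" +
    "\n" +
    "[*] test5";

// Collect match group 1 of every match.
var captured = Array.from(sample.matchAll(itemInList), function (m) {
    return m[1];
});

console.log(captured); // only the items between [list] and [/list]
```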

EDIT

Or better yet, do as Gumbo suggests in the comments on this answer: match every [list] ... [/list] block and then replace all [*] ... items inside each one.

Bart Kiers 2010-05-21 06:05:11

You should take a more systematic approach: search for every `[list] … [/list]` and, for every match, replace every occurrence of `[*]` in it.

Gumbo 2010-05-21 06:41:09

@Gumbo, agreed, edited my answer.

Bart Kiers 2010-05-21 06:46:10

Answer 2

+1 A:

Here’s a better approach than Bart K.’s suggestion:

find all [list] … [/list]
for each match, find all [*] in it

This will ensure that only [*] in [list] … [/list] will be replaced.

The code:

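// replace() returns a new string, so assign the result back.
str = str.replace(/\[list]([\s\S]*?)\[\/list]/g, function($0, $1) {
    // $1 is the text between [list] and [/list]; convert each [*] line in it.
    return "<ul>" + $1.replace(/^ *\[\*] *(.*)/gm, "<li>$1</li>") + "</ul>";
});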
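As a self-contained sketch of the same two-step approach, the snippet below wraps it in a function (the name listsToHtml and the parameter names are my own) and runs it on the sample text from the first answer. It still assumes that lists are not nested, since the lazy [\s\S]*? stops at the first [/list] it finds:

```javascript
// Hypothetical wrapper around the two-step replace.
function listsToHtml(str) {
    // Step 1: find every [list] ... [/list] block.
    return str.replace(/\[list]([\s\S]*?)\[\/list]/g, function (whole, inner) {
        // Step 2: inside that block only, turn each "[*] item" line into <li>item</li>.
        var items = inner.replace(/^ *\[\*] *(.*)/gm, "<li>$1</li>");
        return "<ul>" + items + "</ul>";
    });
}

var text = "[*] test1\n" +
    "\n" +
    "[list]\n" +
    "[*] test2\n" +
    "[*] test3\n" +
    "[*] test4\n" +
    "[/list]\n" +
    "\n" +
    "[*] test5\n" +
    "\n" +
    "[list]\n" +
    "[*] test6\n" +
    "[*] test7\n" +
    "[/list]\n" +
    "\n" +
    "[*] test8";

// Items outside any [list] block (test1, test5, test8) are left untouched.
console.log(listsToHtml(text));
```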

Gumbo 2010-05-21 07:06:33
