Hi,

I'm writing a small blog module. I want the users to be able to type BBCode. I need to convert that to XHTML to store in the DB, which I managed to do for most of the tags, except for [url].

There are two cases I want to allow:

[url=http://stackoverflow.com/]

which should be converted to

<a href="http://www.stackoverflow.com">http://www.stackoverflow.com</a>

and

[url=http://stackoverflow.com/]StackOverflow[/url]

which should be converted to

<a href="http://www.stackoverflow.com" title="StackOverflow">StackOverflow</a>

Sadly, I have been unable to do that. The results were horrible, and I'm wondering whether this can be done in one regex or whether it has to be split in two.

+2  A: 

Something like this ghastly piece of work should do it:

\[url=([^\]]+)\](?:([^\[]+)\[\/url\])?

Upon matching, this should place the url in $1 and the text in $2 if it has been specified. I didn't test this yet so it could require some tweaking.
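As a rough illustration of how the two capture groups could be used, here is a minimal Python sketch. The names `URL_TAG`, `replace_url` and `convert` are made up for this example, and Python is only one possible choice of language. Because the second group is optional, the replacement is done in a callback that checks whether any link text was captured.

```python
import re

# The pattern from above: group 1 is the URL, group 2 the optional link text.
URL_TAG = re.compile(r"\[url=([^\]]+)\](?:([^\[]+)\[/url\])?")


def replace_url(match):
    """Build the anchor tag, depending on whether link text was captured."""
    url, text = match.group(1), match.group(2)
    if text is None:
        # [url=...] without a closing tag: use the URL itself as the link text.
        return '<a href="{0}">{0}</a>'.format(url)
    return '<a href="{0}" title="{1}">{1}</a>'.format(url, text)


def convert(bbcode):
    """Replace every [url] tag in the given string."""
    return URL_TAG.sub(replace_url, bbcode)


print(convert("bla [url=http://stackoverflow.com/]StackOverflow[/url] bla"))
print(convert("bla [url=http://stackoverflow.com/] bla"))
```

Note that this sketch does no escaping or validation of the captured values.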

Kaivosukeltaja
This does indeed work, just tested it myself. If no text is specified, $2 is undef (in Perl anyways). And it matched both forms of urls fine.
Oz
This works only if you have a conditional statement afterwards; it doesn't work as a replacement pattern because of the optional BB end tag.
Lucero
+2  A: 

This should work:

\[url\s*=\s*([^\]]*)]\s*((?:(?!\s*\[/url\]).)*)\s*\[/url\]|\[url\s*=\s*([^\]]*)]

Replacement pattern:

<a href="$1$3" title="$2">$2$3</a>

Tested with this input:

bla [url=http://stackoverflow.com/]StackOverflow[/url] bla
bla [url=http://stackoverflow.com/] bla

Returns:

bla <a href="http://stackoverflow.com/" title="StackOverflow">StackOverflow</a> bla
bla <a href="http://stackoverflow.com/" title="">http://stackoverflow.com/</a> bla
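
For reference, the same pattern and replacement could be tried in Python roughly as follows. This is a sketch, not the setup the test above was run with: `convert_urls` is a made-up name, the `$n` references are written as `\g<n>`, and it assumes Python 3.5 or later, where a group that did not take part in the match is substituted as an empty string.

```python
import re

# First alternative: [url=...]text[/url] (groups 1 and 2).
# Second alternative: [url=...] on its own (group 3).
PATTERN = re.compile(
    r"\[url\s*=\s*([^\]]*)]\s*((?:(?!\s*\[/url\]).)*)\s*\[/url\]"
    r"|\[url\s*=\s*([^\]]*)]"
)

# Only one of groups 1 and 3 is ever set, so "\g<1>\g<3>" always yields the URL.
REPLACEMENT = r'<a href="\g<1>\g<3>" title="\g<2>">\g<2>\g<3></a>'


def convert_urls(bbcode):
    """Apply the single-regex replacement to a string."""
    return PATTERN.sub(REPLACEMENT, bbcode)


print(convert_urls("bla [url=http://stackoverflow.com/]StackOverflow[/url] bla"))
print(convert_urls("bla [url=http://stackoverflow.com/] bla"))
```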

Note that in any case you may have to add some validation/escaping, as invalid XML characters (", <, > etc.) may "break" the tag contents.
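
One possible way to add the escaping, sketched in Python with the standard `html.escape` function: the captured values are escaped inside a replacement callback before they are placed in the tag. The names `URL_TAG`, `render_link` and `convert_escaped` are invented for this example, and the sketch only covers escaping; it does not validate the URL itself (for instance its scheme).

```python
import html
import re

# Same pattern as above: groups 1 and 2 for the paired form, group 3 for the bare form.
URL_TAG = re.compile(
    r"\[url\s*=\s*([^\]]*)]\s*((?:(?!\s*\[/url\]).)*)\s*\[/url\]"
    r"|\[url\s*=\s*([^\]]*)]"
)


def render_link(match):
    """Escape the captured URL and text before building the anchor tag."""
    url = match.group(1) or match.group(3) or ""
    text = match.group(2) or ""
    safe_url = html.escape(url, quote=True)
    if text:
        safe_text = html.escape(text, quote=True)
        return '<a href="{0}" title="{1}">{1}</a>'.format(safe_url, safe_text)
    # No link text given: fall back to the (escaped) URL.
    return '<a href="{0}">{0}</a>'.format(safe_url)


def convert_escaped(bbcode):
    """Replace [url] tags, escaping &, <, > and quotes in the captured parts."""
    return URL_TAG.sub(render_link, bbcode)


print(convert_escaped('[url=http://example.com/?a=1&b=2]Tom & "Jerry"[/url]'))
```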

Lucero
This is indeed the way to do it with pure regex.
Kaivosukeltaja
Awesome, this is perfect. Thanks a ton.
Whistle