Hi,

What I am trying to achieve is a sound way to support BBCode while passing all other data through htmlentities(). I think this should be possible. I was thinking along the lines of exploding the input around the [ and ] characters, but there may be a better way.

Any ideas?

+1  A: 

Hi,

There is an existing question about parsing BBCode:

http://stackoverflow.com/questions/488963/best-way-to-parse-bbcode

Once you have parsed the BBCode, you can do whatever you like with the rest of the input.

MartyIX
A: 

htmlentities() does not parse. Rather, it encodes data so it can be safely displayed in an HTML document.

Your code will be structured like this:

  1. Parse the BBCode (by some mechanism). Don't do any escaping yet; just parse the input text into tags!
  2. The output of your parser step will be some tree structure, consisting of nodes that represent block tags and nodes that represent plain text (the text between the tags).
  3. Render the tree to your output format (HTML). At this point, you escape plain text in your data structure using htmlentities.
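
Steps 1 and 2 could be sketched roughly as follows. This is only one possible mechanism, not a complete BBCode parser: the function name parse_bbcode(), the nested-array node shape ('type', 'children', 'value') and the two-tag whitelist are all assumptions made for illustration. Unknown or stray tags are simply kept as plain text, and no escaping happens at this stage.

```php
<?php
// Hypothetical helper: turn BBCode text into a tree of nested arrays.
// The node shape and the tag whitelist below are assumptions.
function parse_bbcode(string $input): array
{
    $allowed = ['b', 'quote'];

    // Split on [tag] and [/tag], keeping the tags themselves as tokens.
    $tokens = preg_split('/(\[\/?[a-z]+\])/i', $input, -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);

    // The stack holds the nodes that are still open; index 0 is the root.
    $stack = [['type' => 'root', 'children' => []]];

    foreach ($tokens as $token) {
        $top = count($stack) - 1;

        if (preg_match('/^\[(\/?)([a-z]+)\]$/i', $token, $m)
            && in_array(strtolower($m[2]), $allowed, true)) {
            $name = strtolower($m[2]);

            if ($m[1] === '') {
                // Opening tag: start a new node.
                $stack[] = ['type' => $name, 'children' => []];
                continue;
            }
            if ($top > 0 && $stack[$top]['type'] === $name) {
                // Matching closing tag: attach the finished node to its parent.
                $node = array_pop($stack);
                $stack[$top - 1]['children'][] = $node;
                continue;
            }
            // A stray closing tag falls through and is kept as text.
        }

        // Plain text (or an unrecognised tag): store it unescaped.
        $stack[$top]['children'][] = ['type' => 'text', 'value' => $token];
    }

    // Close any tags the author left open.
    while (count($stack) > 1) {
        $node = array_pop($stack);
        $stack[count($stack) - 1]['children'][] = $node;
    }

    return $stack[0];
}

// The root now has two children: a text node 'Hi ' and a 'b' node whose
// only child is the text node 'there <you>' (still unescaped).
$tree = parse_bbcode('Hi [b]there <you>[/b]');
```

Note that the "<you>" in the text node is left untouched here; escaping is deliberately deferred to the rendering step.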

Your rendering function will be recursive. Some pseudo-functions that specify the relationship:

render( x : plain text ) = htmlentities(x)

render( x : bold tag )   = "<b>" . render( get_contents_of ( x )) . "</b>"

render( x : quote tag )  = "<blockquote>" . 
                           render( get_contents_of( x )) .
                           "</blockquote>"

...

render( x : anything else) = "<b>Invalid tag!</b>"

So htmlentities() only comes into play when you render your output to HTML, so that the browser does not get confused when your plain text contains special characters such as < and >. If you were rendering to plain text instead, for example, you would not call it at all.
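
In PHP, the pseudo-functions above might translate into something like the following sketch. It assumes the parsed tree is a set of nested arrays with 'type', 'children' and (for text nodes) 'value' keys; that shape, the 'root' node type and the helper render_children() are hypothetical choices for illustration, not part of any library.

```php
<?php
// Render one node of the (assumed) nested-array tree to HTML.
function render(array $node): string
{
    switch ($node['type']) {
        case 'text':
            // The only place where escaping happens.
            return htmlentities($node['value'], ENT_QUOTES, 'UTF-8');
        case 'root':
            return render_children($node);
        case 'b':
            return '<b>' . render_children($node) . '</b>';
        case 'quote':
            return '<blockquote>' . render_children($node) . '</blockquote>';
        default:
            return '<b>Invalid tag!</b>';
    }
}

// Hypothetical helper: render every child node and concatenate the results.
function render_children(array $node): string
{
    return implode('', array_map('render', $node['children']));
}

// A hand-built tree standing in for the parser's output.
$tree = [
    'type' => 'root',
    'children' => [
        ['type' => 'text', 'value' => '1 < 2 '],
        ['type' => 'b', 'children' => [
            ['type' => 'text', 'value' => '<script>'],
        ]],
    ],
];

echo render($tree); // prints: 1 &lt; 2 <b>&lt;script&gt;</b>
```

Because the tags are emitted by the renderer and only the text nodes go through htmlentities(), user-supplied markup such as <script> ends up escaped while the generated <b> and <blockquote> elements stay intact.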

rix0rrr