Hi all, I may need to implement this sometime in the future, but the trigger for the question right now is mainly curiosity.
I was thinking about how to write a text editor for a website I'll build soon, and looked at how this site (and others) do it, so I thought: isn't it a bit too complicated? If tags are going to be used in the first place, why not let users use HTML tags? The only reason I can think of is HTML injection, which I don't know much about, but it sounds like an easy issue to solve, doesn't it?

Thank you.

+2  A: 

Historically, systems like BBCode were designed to limit the available formatting elements to those that would not break the site's layout. Now, with more mature and smarter HTML parsers, it's no longer necessary to invent a new markup language just to bar certain unsafe HTML tags.

The main reason I see today is that HTML is foreign to most users, and the HTML substitutes aim to provide a simplified version of the formatting directives an everyday user would need.

Mark Trapp
+4  A: 

Simply because not all of your users will know HTML. *bold text* is a lot easier to understand (and to read in its raw form) than <b>bold text</b>, especially once you get into links.

The reason we use Markdown, Textile and the rest is to provide a nice alternative that's accessible to more users.

Of course, you can still let your users write HTML (it's in the Markdown spec), but you'll have to do a lot of checking to make sure there's nothing malicious going on: for example, blocking <script>, <iframe>, large images, JavaScript in links of the form <a href="javascript:alert('...');">, etc.
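
To give a rough idea of what that checking involves, here is a minimal sketch of an allowlist filter built on Python's standard html.parser. The tag, attribute and scheme lists, and the names AllowlistSanitizer, is_safe_url and sanitize, are assumptions made up for this illustration. It does not balance unclosed tags or limit image sizes, so for a real site a maintained sanitizer library is the safer choice.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists chosen for this sketch; adjust to your site.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
DROP_WITH_CONTENT = {"script", "style", "iframe"}


def is_safe_url(url):
    """Reject URLs whose scheme is not allowlisted (e.g. javascript:)."""
    # Browsers ignore whitespace/control characters inside a scheme.
    cleaned = "".join(ch for ch in url if ch > " ")
    scheme, sep, _ = cleaned.partition(":")
    if not sep or any(c in scheme for c in "/?#"):
        return True  # no scheme at all: a relative URL
    return scheme.lower() in ALLOWED_SCHEMES


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text is kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # drops onclick, style, etc.
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))  # text is always re-escaped


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)
```

Calling sanitize() on user input before storing or displaying it keeps the allowlisted formatting and drops everything else. Even this much code leaves open questions, which is part of why a substitute markup is attractive.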

Ross
@Ross: ++ Thanks for the specific examples for HTML injection.
Oren A
+2  A: 

There are several reasons why you should not use HTML tags in such an editor:

1) It might be less complex for the user if you introduce your own reduced tag set.

2) HTML injection: There is a big risk of dangerous HTML code getting injected.

If you really want to allow HTML code, you have to be very careful.
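
A reduced tag set can be sketched roughly like this: escape everything the user typed first, then translate a handful of BBCode-style tags into HTML. The tag mapping and the function name render_reduced_markup are hypothetical, picked only for the example, and the code assumes tags are properly nested.

```python
import html
import re

# Hypothetical reduced tag set: [b]...[/b] and [i]...[/i] only.
SIMPLE_TAGS = {"b": "strong", "i": "em"}


def render_reduced_markup(text):
    """Escape all HTML, then convert the few allowed bracket tags."""
    safe = html.escape(text)  # any raw HTML from the user becomes plain text
    for short, html_tag in SIMPLE_TAGS.items():
        pattern = re.compile(rf"\[{short}\](.*?)\[/{short}\]", re.DOTALL)
        safe = pattern.sub(rf"<{html_tag}>\1</{html_tag}>", safe)
    return safe
```

Because the only HTML in the output is what the converter itself writes, there is nothing to filter afterwards.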

Ben
+1  A: 

HTML script injection is most emphatically not an easy problem to solve. HTML is a fairly complicated, non-regular language - detecting all possible vulnerabilities is a really hard problem. Many sites have tried and failed. It's easier, from a vulnerability-prevention point of view, to just prohibit HTML entirely or to allow only a small subset of tags.
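
Prohibiting HTML entirely is the simple case: escape the user's text before putting it into the page, so any markup shows up literally instead of being interpreted. A minimal sketch using Python's standard html.escape (the function name render_plain is just for illustration):

```python
import html


def render_plain(user_text):
    """Escape &, <, > and quotes so the input is displayed as text."""
    return html.escape(user_text, quote=True)
```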

Michael Petrotta