I'm building a small CMS in PHP for a client, and one thing that comes up fairly often is that the client enters a bit of HTML in a field without closing a tag. Is there some parsing technique that prevents bad HTML from, say, rendering my whole output page in italics because the user forgot a closing </i> tag?

I'm not worried about XSS or malicious html, just a forgotten tag here and there as it's the client who is managing the content.

Forgive me if this is a duplicate question. I did some searching but could not find an appropriate answer.

-J

+1  A: 

You may want to tidy the HTML input from the user so that the dirty HTML can be fixed. Check out the PHP5 Tidy extension to achieve this.

http://devzone.zend.com/article/761
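
A minimal sketch of what that could look like, assuming the Tidy extension is installed and enabled. The helper name repair_fragment is hypothetical, and the options shown are just one reasonable choice for cleaning a fragment rather than a whole page:

```php
<?php
// Hypothetical helper: repair a user-entered HTML fragment with the Tidy extension.
function repair_fragment(string $html): string
{
    // Assumption: if Tidy is not loaded, fall back to returning the input unchanged.
    if (!function_exists('tidy_repair_string')) {
        return $html;
    }

    $config = [
        'show-body-only' => true, // return only the fragment, without an <html>/<body> wrapper
        'wrap'           => 0,    // do not hard-wrap long lines
    ];

    // tidy_repair_string() parses the markup and returns the repaired string, or false on failure.
    $fixed = tidy_repair_string($html, $config, 'utf8');

    return $fixed === false ? $html : $fixed;
}

// Example: the <i> tag is never closed in the input.
echo repair_fragment('Some <i>italic text with no closing tag');
```

Running each field through a helper like this when the content is saved (or just before output) should keep an unclosed tag from leaking into the rest of the page.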

codemeit
tidy_clean_repair($tidy); I like the sound of that.
Jascha
A: 

Why not use an IFRAME that loads the user-edited content from a separate page?

That way only that page is 'at risk', and you can use a scrollable region too.

Just a thought ...

Edelcom
Interesting thought. One thing that comes to mind right away is you're ruling out search robots crawling your content.
Jascha
I'm not sure about that. I can't think of a reason why a searchbot would not follow an iframe lead if it's a page on your site.
Edelcom
There's some info here on why it's not the greatest idea (for SEO): http://seostep.wordpress.com/category/iframe-vs-search-engines/
Jascha
It might work but it wouldn't be fixing the problem.
Rimian