I am building a forum web application in Java, and I want to know:

  1. Which tags, such as <script>, should be parsed and removed when a user submits a post?
  2. Should I remove only the tag and keep its content, or remove the tag together with its content?
  3. What regular expressions would remove them?
+5  A: 

You want to allow users to use HTML to format their posts?

  1. Don't make a list of unsafe tags; make a list of safe tags and accept only those.
  2. This is up to you.
  3. The subject of parsing HTML/XML with regular expressions has been covered before.
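
As a rough illustration of point 1, here is a minimal sketch of the allow-list idea in plain Java. It is not a full HTML parser: it escapes the entire post first and then re-enables only a fixed set of attribute-free tags, so anything not explicitly listed (including <script> and any tag carrying attributes) is displayed as text instead of being interpreted. The class name `PostSanitizer` and the particular list of tags are assumptions made for this example, and the sketch does not check that tags are balanced. For a real site, a maintained sanitizer library would likely be a safer choice than hand-rolled code.

```java
import java.util.regex.Pattern;

public class PostSanitizer {
    // Hypothetical allow-list: simple formatting tags, with no attributes permitted.
    // The pattern runs on already-escaped text, so it looks for "&lt;tag&gt;".
    private static final Pattern ALLOWED_TAG = Pattern.compile(
        "&lt;(/?)(b|i|em|strong|p|br|ul|ol|li|blockquote|code|pre)&gt;");

    // Escape every character that has a special meaning in HTML.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Escape everything, then turn the allow-listed tags back into real markup.
    static String sanitize(String userInput) {
        String escaped = escapeHtml(userInput);
        return ALLOWED_TAG.matcher(escaped).replaceAll("<$1$2>");
    }

    public static void main(String[] args) {
        System.out.println(sanitize("<b>Hello</b> <script>alert(1)</script>"));
        // prints: <b>Hello</b> &lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```

With this approach the answer to point 2 falls out naturally: a disallowed tag is neither stripped nor executed, it is simply shown to readers as literal text.
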
McDowell
+3  A: 

You would do better to ask yourself which tags to allow, not which to remove.

Keep just enough tags for users to express themselves without compromising the site. You might also look at BBCode, which was designed for this purpose.
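
To make the BBCode suggestion concrete, here is a small sketch in Java. It assumes a hypothetical `BbCodeRenderer` class and a tiny, made-up subset of BBCode ([b], [i], [u], [quote]); real BBCode dialects support more tags. The post is HTML-escaped first, so raw HTML from the user is never interpreted, and only complete [tag]...[/tag] pairs are converted to markup.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

public class BbCodeRenderer {
    // Hypothetical mapping from supported BBCode tags to HTML elements.
    private static final Map<String, String> TAGS = new LinkedHashMap<>();
    static {
        TAGS.put("b", "strong");
        TAGS.put("i", "em");
        TAGS.put("u", "u");
        TAGS.put("quote", "blockquote");
    }

    // Escape HTML special characters; '&' must be replaced first.
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;")
                   .replace("'", "&#39;");
    }

    // Escape the post, then convert each matched [tag]...[/tag] pair to HTML.
    static String render(String post) {
        String html = escapeHtml(post);
        for (Map.Entry<String, String> tag : TAGS.entrySet()) {
            Pattern pair = Pattern.compile(
                "\\[" + tag.getKey() + "\\](.*?)\\[/" + tag.getKey() + "\\]",
                Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
            String element = tag.getValue();
            html = pair.matcher(html)
                       .replaceAll("<" + element + ">$1</" + element + ">");
        }
        return html;
    }

    public static void main(String[] args) {
        System.out.println(render("[b]bold[/b] and <script>"));
        // prints: <strong>bold</strong> and &lt;script&gt;
    }
}
```

Because only matched pairs are converted, the generated HTML for each pair is always closed; an unpaired [b] is left in the post as literal text. This single-pass sketch does not handle a tag nested inside itself.
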

Gamecat