We've been looking for ways to HTML encode our JSP pages to counter XSS.
The OWASP site has an article, How_to_perform_HTML_entity_encoding_in_Java.
The article talks about entity encoding the "Big 5":
{"#39", new Integer(39)}, // ' - apostrophe
{"quot", new Integer(34)}, // " - double-quote
{"amp", new Integer(38)}, // & - ampersand
{"lt", new Integer(60)}, // < - less-than
{"gt", new Integer(62)}, // > - greater-than
For example,
<script>
is encoded as
&lt;script&gt;
but the Java code sample included in the article uses numeric character references, i.e.
<script></script>
is encoded as
&#60;script&#62;&#60;/script&#62;
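To make the two styles concrete, here is a minimal sketch of how I understand them. It is not the OWASP sample itself: the class name `Big5Encoder` and the methods `encodeNamed` and `encodeNumeric` are hypothetical names of my own, and the table simply mirrors the five entries listed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not the code from the OWASP article.
class Big5Encoder {
    // The five characters and the reference bodies from the table above.
    // The table lists the apostrophe as #39 rather than a name.
    private static final Map<Character, String> BIG_5 = new LinkedHashMap<>();
    static {
        BIG_5.put('\'', "#39");
        BIG_5.put('"', "quot");
        BIG_5.put('&', "amp");
        BIG_5.put('<', "lt");
        BIG_5.put('>', "gt");
    }

    // Entity-reference style: '<' becomes "&lt;".
    static String encodeNamed(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String body = BIG_5.get(c);
            if (body != null) {
                out.append('&').append(body).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Numeric character-reference style: '<' becomes "&#60;".
    static String encodeNumeric(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (BIG_5.containsKey(c)) {
                out.append("&#").append((int) c).append(';');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeNamed("<script>"));
        System.out.println(encodeNumeric("<script></script>"));
    }
}
```

Both methods escape the same five characters and leave everything else untouched; only the spelling of the reference differs.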
Is there a reason for using numeric character references over named entity references? Which is best, and why?