XML spec defines a subset of Unicode characters which are allowed in XML documents: http://www.w3.org/TR/REC-xml/#charsets.
How do I filter out these characters from a String in Java?
simple test case:
Assert.equals("", filterIllegalXML(""+Character.valueOf((char) 2)))