Is there a way to convert a string to a string that will display properly in a web document? For example, changing the string

"<Hello>"

to

"&lt;Hello&gt;"
+9  A: 

StringEscapeUtils from Apache Commons Lang has methods designed exactly for this:

http://commons.apache.org/lang/api-2.4/org/apache/commons/lang/StringEscapeUtils.html

Amber
+1  A: 

That's usually called "HTML escaping". I'm not aware of anything in the standard libraries for doing this (though you can approximate it by using XML escaping). There are lots of third-party libraries that can do this, however. StringEscapeUtils from org.apache.commons.lang has an escapeHtml method that does it.
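
To make the idea concrete, here is a minimal sketch of HTML escaping using only the standard library. It is not the Commons Lang implementation: the class name HtmlEscapeSketch and the method escapeBasic are hypothetical names chosen for illustration, and it assumes that replacing the five markup-significant characters is enough for your case (it does not produce named entities for accented letters and the like).

```java
// Minimal sketch: escapes only the five characters that are special in HTML/XML.
// HtmlEscapeSketch and escapeBasic are hypothetical names, not part of any library.
final class HtmlEscapeSketch {

    static String escapeBasic(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);        // everything else passes through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeBasic("<Hello>")); // prints &lt;Hello&gt;
    }
}
```

For anything beyond simple text, a maintained library is probably the safer choice, since it is likely to cover more cases than this sketch.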

Laurence Gonsalves
+1  A: 
// Escapes a string for display in HTML, keeping runs of blanks visible.
public static String stringToHTMLString(String string) {
    StringBuilder sb = new StringBuilder(string.length());
    // true if last char was blank
    boolean lastWasBlankChar = false;
    int len = string.length();
    char c;

    for (int i = 0; i < len; i++)
        {
        c = string.charAt(i);
        if (c == ' ') {
            // A blank needs extra work: replacing every blank with
            // &nbsp; would lose word breaking, so only every second
            // blank in a run becomes &nbsp;.
            if (lastWasBlankChar) {
                lastWasBlankChar = false;
                sb.append("&nbsp;");
                }
            else {
                lastWasBlankChar = true;
                sb.append(' ');
                }
            }
        else {
            lastWasBlankChar = false;
            //
            // HTML Special Chars
            if (c == '"')
                sb.append("&quot;");
            else if (c == '&')
                sb.append("&amp;");
            else if (c == '<')
                sb.append("&lt;");
            else if (c == '>')
                sb.append("&gt;");
            else if (c == '\n')
                // Handle newline: emit a line-break tag
                sb.append("<br/>");
            else {
                int ci = 0xffff & c;
                if (ci < 160)
                    // below 160 (ASCII and C1 controls): append unchanged
                    sb.append(c);
                else {
                    // 160 and above: emit a numeric character reference
                    sb.append("&#");
                    sb.append(ci);
                    sb.append(';');
                    }
                }
            }
        }
    return sb.toString();
}
Sorantis