Basically, I would like to decode a given HTML document and replace all special characters, such as "&nbsp;" -> " " and "&gt;" -> ">".
In .NET we can make use of HttpUtility.HtmlDecode.
What's the equivalent function in Java?
I have used the Apache Commons StringEscapeUtils.unescapeHtml4() for this (it was named unescapeHtml() in Commons Lang 2.x):
Unescapes a string containing entity escapes to a string containing the actual Unicode characters corresponding to the escapes. Supports HTML 4.0 entities.
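If adding a dependency is not an option, the same idea can be illustrated with a small hand-rolled decoder. This is only a sketch under stated assumptions, not how Commons implements it: the class name HtmlDecodeSketch, the method decode, and the six-entry entity table are all hypothetical and chosen for illustration. It handles numeric references plus a handful of named entities and leaves anything it does not recognize untouched. It assumes Java 9 or later.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
final class HtmlDecodeSketch {
    // Deliberately tiny entity table (an assumption); HTML defines many more names.
    private static final Map<String, String> NAMED = Map.of(
            "nbsp", "\u00A0",
            "lt", "<",
            "gt", ">",
            "amp", "&",
            "quot", "\"",
            "apos", "'");

    // Matches &name; as well as decimal (&#160;) and hex (&#xA0;) references.
    private static final Pattern ENTITY =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    static String decode(String html) {
        Matcher m = ENTITY.matcher(html);
        StringBuilder out = new StringBuilder();
        // Single pass: text produced by a replacement is never decoded again.
        while (m.find()) {
            String body = m.group(1);
            String replacement;
            if (body.startsWith("#x") || body.startsWith("#X")) {
                replacement = fromCodePoint(body.substring(2), 16, m.group());
            } else if (body.startsWith("#")) {
                replacement = fromCodePoint(body.substring(1), 10, m.group());
            } else {
                // Unknown names are kept exactly as they appeared.
                replacement = NAMED.getOrDefault(body, m.group());
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String fromCodePoint(String digits, int radix, String fallback) {
        try {
            int cp = Integer.parseInt(digits, radix);
            return Character.isValidCodePoint(cp)
                    ? new String(Character.toChars(cp))
                    : fallback;
        } catch (NumberFormatException e) {
            return fallback; // too many digits to be a valid code point
        }
    }

    public static void main(String[] args) {
        System.out.println(decode("1 &lt; 2 &amp;&amp; 3 &gt; 2")); // prints: 1 < 2 && 3 > 2
    }
}
```

Note that "&nbsp;" decodes to the non-breaking space U+00A0 rather than an ordinary space, so a follow-up replace('\u00A0', ' ') is needed if plain spaces are wanted. For real documents the library method is the safer choice, since it covers the full HTML 4.0 entity set.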
StringEscapeUtils.unescapeHtml4() also unescapes escaped HTML that is already present in the text. For example, &lt;br/&gt; becomes <br/>. Is there any way to prevent this?