Hi,
I am working with XPath and Java, and I want to extract some text from an HTML page.
The text is located under a div, with some whitespace markup in between, such as &nbsp; and <br>.
I want these to be converted into a space and a newline, respectively, during extraction.
The method I am using to extract the text is Element.getTextContent(), which does not respect these whitespace characters.
Could somebody tell me if there is a way to either extract the text with whitespace normalization, or extract the whole HTML markup under the Node so that I can do the replacement myself?
Thanks,
Nayn
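
For illustration, here is a rough sketch of both options asked about above. It assumes the page has already been parsed into an org.w3c.dom tree (which the use of Element.getTextContent() suggests) and that the parser delivers &nbsp; as the character U+00A0. The class name NodeText and the methods textWithBreaks and outerMarkup are made-up names for this example, not part of any library.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Node;

// Hypothetical helper class; names are illustrative only.
final class NodeText {

    // Option 1: walk the subtree and collect the text, turning each
    // <br> element into '\n' and each U+00A0 (nbsp) into a plain space.
    static String textWithBreaks(Node node) {
        StringBuilder sb = new StringBuilder();
        append(node, sb);
        return sb.toString();
    }

    private static void append(Node node, StringBuilder sb) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            sb.append(node.getNodeValue().replace('\u00A0', ' '));
        } else if (type == Node.ELEMENT_NODE
                && "br".equalsIgnoreCase(node.getNodeName())) {
            // Assumes unprefixed element names (no "html:br").
            sb.append('\n');
        } else {
            // Recurse into children; comments and similar nodes have none.
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                append(c, sb);
            }
        }
    }

    // Option 2: serialize the node and everything under it back to markup,
    // using an identity transform from the standard javax.xml.transform API.
    static String outerMarkup(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }
}
```

Either method would be called with the Node returned by the XPath evaluation. The first one probably needs extending if other tags (for example p or div) should also produce newlines.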