ansaurus

Question

getTextContent from Node with whitespace character normalization

Answer 1

+1 A:

XPath cannot replace nodes with strings.

A simple XSLT transformation can carry out this task.

For example:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="text()">
   <xsl:value-of select="translate(.,'&#xA0;', ' ')"/>
 </xsl:template>

 <xsl:template match="br">
   <xsl:text>&#10;</xsl:text>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document:

<p>&#xA0;<br/></p>

the desired result is produced:

<p> 

</p>
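
Since the question mentions getTextContent, the caller is presumably working with the Java DOM API (an assumption). A minimal sketch of running such a stylesheet through the JDK's javax.xml.transform API might look like the following. The class name ApplyStylesheet and the helper transform are hypothetical, and the embedded stylesheet is a copy of the one above with version set to "1.0", on the assumption that the JDK's built-in processor is used and only XSLT 1.0 features are needed.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; not part of the original answer.
class ApplyStylesheet {

    // Copy of the stylesheet from the answer, with version="1.0" (assumption).
    static final String XSLT =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output omit-xml-declaration=\"yes\" indent=\"yes\"/>"
        + "<xsl:template match=\"node()|@*\">"
        + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"text()\">"
        + "<xsl:value-of select=\"translate(.,'&#xA0;', ' ')\"/>"
        + "</xsl:template>"
        + "<xsl:template match=\"br\">"
        + "<xsl:text>&#10;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to an XML string and returns the serialized result.
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Wrapped so callers need not declare a checked exception.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same input document as in the answer.
        System.out.println(transform("<p>&#xA0;<br/></p>"));
    }
}
```

The result is still markup (a p element), so a caller who only wants the text would read getTextContent from the transformed tree or switch the output method to text.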

Dimitre Novatchev 2010-05-21 13:45:30

This is useful for my future needs. Thanks.

Nayn 2010-05-21 14:36:39

Answer 2

+1 A:

<br> isn't text content; it's an element. I'm not sure what you're looking for. Try visiting all the text nodes underneath the element (remembering to recurse into child elements) and calling getNodeValue() on each.
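
The traversal described above could be sketched roughly as follows, using the standard org.w3c.dom API. This is only an illustration under some assumptions: the class name NormalizedText and the methods collect, normalizedText and fromXml are hypothetical, and the choice to turn a non-breaking space into a plain space and a br element into a newline is borrowed from the XSLT answer rather than stated in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; names are illustrative only.
class NormalizedText {

    // Recursively appends the text under 'node' to 'out'.
    static void collect(Node node, StringBuilder out) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            // Replace non-breaking spaces with ordinary spaces (assumed rule).
            out.append(node.getNodeValue().replace('\u00A0', ' '));
        } else if (type == Node.ELEMENT_NODE
                && "br".equalsIgnoreCase(node.getNodeName())) {
            // Treat <br> as a line break (assumed rule).
            out.append('\n');
        } else {
            // Any other node: descend into its children, if it has any.
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                collect(c, out);
            }
        }
    }

    static String normalizedText(Node root) {
        StringBuilder out = new StringBuilder();
        collect(root, out);
        return out.toString();
    }

    // Convenience wrapper: parse an XML string and extract its text.
    static String fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return normalizedText(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromXml("<p>one&#xA0;two<br/>three</p>"));
    }
}
```

Appending a space instead of '\n' in the br branch gives the "insert spaces between texts" behaviour, if that is what is wanted.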

Adrian Mouat 2010-05-21 13:46:15

This one was simple. The problem was that getTextContent concatenates all the strings, ignoring <br>. I wrote a small recursive method that inserts spaces between the text pieces. Thanks.

Nayn 2010-05-21 14:35:49
