There are three reasons for getting unwanted whitespace in the result of an XSLT transformation:
- whitespace that comes from between nodes in the source document
- whitespace that comes from within nodes in the source document
- whitespace that comes from the stylesheet
I'm going to cover all three, because it can be hard to tell where whitespace comes from, so you might need to combine several strategies.
To address the whitespace between nodes in your source document, use <xsl:strip-space> to strip out any whitespace that appears between two nodes, and then use <xsl:preserve-space> to preserve the significant whitespace that might appear within mixed content. For example, if your source document looks like:
<ul>
<li>This is an <strong>important</strong> <em>point</em></li>
</ul>
then you will want to ignore the whitespace between the <ul> and the <li> and between the </li> and the </ul>, which is not significant, but preserve the whitespace between the <strong> and <em> elements, which is significant (otherwise you'd get "This is an importantpoint"). To do this use
<xsl:strip-space elements="*" />
<xsl:preserve-space elements="li" />
The elements attribute on <xsl:preserve-space> should basically list all the elements in your document that have mixed content.
Aside: using <xsl:strip-space> also reduces the size of the source tree in memory, which can make your stylesheet more efficient, so it's worth doing even if you don't have whitespace problems of this sort.
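To see the two declarations working together, here is a minimal sketch of a complete stylesheet. The identity template and the <xsl:output> settings are assumptions added for illustration; they are not part of the original example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption for the demo: no re-indentation, no XML declaration -->
  <xsl:output method="xml" indent="no" omit-xml-declaration="yes" />

  <!-- Strip whitespace-only text nodes from every element... -->
  <xsl:strip-space elements="*" />
  <!-- ...except li, which holds mixed content -->
  <xsl:preserve-space elements="li" />

  <!-- Identity template: copy whatever is left in the source tree -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run against the <ul> document above, this should copy the list without the line breaks around the <li>, while keeping the single space between </strong> and <em>. The more specific name (li) takes precedence over the wildcard (*), so the order of the two declarations does not matter.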
To address the whitespace that appears within nodes in your source document, use normalize-space(). For example, if you have:
<dt>
a definition
</dt>
and you can be sure that the <dt> element won't hold any elements that you want to do something with, then you can do:
<xsl:template match="dt">
...
<xsl:value-of select="normalize-space(.)" />
...
</xsl:template>
The leading and trailing whitespace will be stripped from the value of the <dt> element, any runs of whitespace inside it will be collapsed to a single space, and you will just get the string "a definition".
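As a rough, self-contained illustration of what normalize-space() does, the sketch below wraps the normalized value in square brackets so that any leftover whitespace would be visible. The brackets and the text output method are assumptions made for the demo only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption for the demo: plain text output -->
  <xsl:output method="text" />

  <xsl:template match="dt">
    <!-- Brackets are only here to make stray whitespace visible -->
    <xsl:text>[</xsl:text>
    <!-- Trims both ends and collapses internal runs of whitespace -->
    <xsl:value-of select="normalize-space(.)" />
    <xsl:text>]</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

For a source document consisting of the <dt> element above, this should produce [a definition], with no line breaks inside the brackets. Bear in mind that normalize-space(.) works on the string value of the element, so any markup inside <dt> would be flattened to text.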
Whitespace coming from the stylesheet, which is perhaps what you're experiencing, shows up when you have text within a template like this:
<xsl:template match="name">
Name:
<xsl:value-of select="." />
</xsl:template>
XSLT stylesheets are parsed in the same way as the source documents that they process, so the above XSLT is interpreted as a tree that holds an <xsl:template> element with a match attribute, whose first child is a text node and whose second child is an <xsl:value-of> element with a select attribute. The text node has leading and trailing whitespace (including line breaks); since it's literal text in the stylesheet, it gets copied into the result verbatim, with all the leading and trailing whitespace.
But some whitespace in XSLT stylesheets gets stripped automatically, namely text nodes that consist only of whitespace. That is why the line break between the <xsl:value-of> and the close of the <xsl:template> does not produce a line break in your result.
To get only the text you want in the result, use the <xsl:text> element like this:
<xsl:template match="name">
<xsl:text>Name: </xsl:text>
<xsl:value-of select="." />
</xsl:template>
The XSLT processor will ignore the line breaks and indentation that appear between nodes, and only output the text within the <xsl:text> element.
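The same technique works in reverse: if you do want a line break in the result, you can ask for it explicitly rather than relying on the layout of the stylesheet. The following is a minimal sketch, assuming text output and one line per <name> element; the &#10; character reference (a line feed) is the only line break this template emits.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption for the demo: plain text output -->
  <xsl:output method="text" />

  <xsl:template match="name">
    <xsl:text>Name: </xsl:text>
    <xsl:value-of select="." />
    <!-- Whitespace inside xsl:text is never stripped, so this line feed is kept -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Whitespace-only text is preserved inside <xsl:text>, which is what makes it a reliable way to emit exactly the spaces and line breaks you intend.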