I'm writing an XSLT template that needs to output a valid XML file for an XML Sitemap.

<url>
<loc>
    <xsl:value-of select="umbraco.library:NiceUrl($node/@id)"/>
</loc>
<lastmod>
 <xsl:value-of select="concat($node/@updateDate,'+00:00')"/>
</lastmod>
</url>

Unfortunately, the URL that is output contains an apostrophe: /what's-new.aspx

I need to escape the ' to &apos; for the Google Sitemap. Unfortunately, every attempt I've tried treats the string '&apos;' as if it were ''', which is invalid. Frustrating. XSLT can drive me mad sometimes.

Any ideas for a technique? (Assume I can find my way around XSLT 1.0 templates and functions)

A: 

Have you tried setting disable-output-escaping to yes for your xsl:value-of element:

<xsl:value-of disable-output-escaping="yes" select="umbraco.library:NiceUrl($node/@id)"/>

Actually - this is probably the opposite of what you want.

How about wrapping the xsl:value-of in an xsl:text element?

<xsl:text><xsl:value-of select="umbraco.library:NiceUrl($node/@id)"/></xsl:text>

Perhaps you should try to translate ' to &amp;apos;

Stephen Denne
Thanks, I tried that, and also in combination with declaring a variable called $apos with a value of ' and then using a replace function to replace $apos with '&apos;'. Unfortunately XSLT interprets '&apos;' as ''', which is an unclosed quote again. Grrr.
Neil Fenwick
+4  A: 

So you have ' in your input, but you need the string &apos; in your output?

In your XSL file, replace &apos; with &amp;apos;, using this find/replace implementation (unless you are using XSLT 2.0, which has a built-in replace() function):

<xsl:template name="string-replace-all">
  <xsl:param name="text"/>
  <xsl:param name="replace"/>
  <xsl:param name="by"/>
  <xsl:choose>
    <xsl:when test="contains($text,$replace)">
      <xsl:value-of select="substring-before($text,$replace)"/>
      <xsl:value-of select="$by"/>
      <xsl:call-template name="string-replace-all">
        <xsl:with-param name="text" select="substring-after($text,$replace)"/>
        <xsl:with-param name="replace" select="$replace"/>
        <xsl:with-param name="by" select="$by"/>
      </xsl:call-template>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of select="$text"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

Call it this way:

<loc>
  <xsl:call-template name="string-replace-all">
    <xsl:with-param name="text" select="umbraco.library:NiceUrl($node/@id)"/>
    <!-- select takes an XPath expression, so each string literal needs its own quotes -->
    <xsl:with-param name="replace" select='"&apos;"'/>
    <xsl:with-param name="by" select="'&amp;apos;'"/>
  </xsl:call-template>
</loc>

The problem is that &apos; is decoded by the XML parser to ' before XSLT ever sees it, whereas &amp;apos; is decoded to the literal string &apos;.
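
One caveat: with the xml output method, the serializer escapes a literal & in text as &amp;, so the replacement only reaches the file as &apos; if escaping is switched off for it. The sketch below is a minimal, self-contained XSLT 1.0 variant of the template above that does this. The template name escape-apos and the hard-coded $url (standing in for umbraco.library:NiceUrl($node/@id)) are made up for illustration, and disable-output-escaping is an optional feature that a processor may ignore.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical stand-in for umbraco.library:NiceUrl($node/@id) -->
  <xsl:variable name="url">/what's-new.aspx</xsl:variable>

  <!-- Holding the apostrophe in a variable avoids quoting problems in XPath -->
  <xsl:variable name="apos">'</xsl:variable>

  <!-- Writes $text, emitting each apostrophe as the literal characters &apos; -->
  <xsl:template name="escape-apos">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains($text, $apos)">
        <xsl:value-of select="substring-before($text, $apos)"/>
        <!-- Escaping is disabled so the serializer does not turn & into &amp; -->
        <xsl:value-of select="'&amp;apos;'" disable-output-escaping="yes"/>
        <!-- Recurse on the remainder of the string -->
        <xsl:call-template name="escape-apos">
          <xsl:with-param name="text" select="substring-after($text, $apos)"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/">
    <loc>
      <xsl:call-template name="escape-apos">
        <xsl:with-param name="text" select="$url"/>
      </xsl:call-template>
    </loc>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document with a processor that honours disable-output-escaping, this is intended to write the loc element with the apostrophe as &apos;. Only the replacement text has escaping disabled; the rest of the URL is still escaped normally.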

Welbog
Thanks - this worked a charm
Neil Fenwick
A: 

Following something along the lines of 2 separate ideas that spdenne gave me, this seems to work:

    <xsl:variable name="apos">'</xsl:variable>
    <url>
        <loc>
            <xsl:value-of select="umbraco.library:Replace(umbraco.library:NiceUrl($node/@id),$apos,'&amp;apos;')" disable-output-escaping="yes"/>
        </loc>
        <lastmod>
            <xsl:value-of select="concat($node/@updateDate,'+00:00')"/>
        </lastmod>
    </url>

Outputs /what&apos;s-new.aspx

(I had to use the umbraco.library:Replace() function for lack of a replace() in the version of XSLT I'm using)
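
For comparison, here is a rough sketch of how the same replacement might look in XSLT 2.0, where the standard replace() function is available and no extension library is needed. It assumes an XSLT 2.0 processor that still honours disable-output-escaping; the $url parameter is a hypothetical stand-in for the NiceUrl call.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical stand-in for umbraco.library:NiceUrl($node/@id);
       in XPath 2.0 a doubled '' inside a single-quoted string is one apostrophe -->
  <xsl:param name="url" select="'/what''s-new.aspx'"/>

  <xsl:template match="/">
    <loc>
      <!-- After XML parsing, the XPath expression is: replace($url, "'", '&apos;') -->
      <xsl:value-of select="replace($url, &quot;'&quot;, '&amp;apos;')"
                    disable-output-escaping="yes"/>
    </loc>
  </xsl:template>
</xsl:stylesheet>
```

The pattern argument is a regular expression, but a lone apostrophe has no special meaning there, and the replacement string contains neither $ nor \, so nothing needs extra escaping.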

Neil Fenwick
That's exactly the solution I proposed, albeit with a different library. Good job coming up with it on your own. The explanation for why this needs to be done this way is in my answer.
Welbog
A: 

The simple way to remove unwanted characters from your URL is to change the rules umbraco uses when it generates the NiceUrl.

Edit config/umbracoSettings.config and add a rule that removes all apostrophes from NiceUrls, like so:

<urlReplacing>
    ...
    <char org="'"></char>     <!-- replace ' with nothing -->
    ...
</urlReplacing>

Note: The contents of the "org" attribute are replaced with the contents of the element. Here's another example:

<char org="+">plus</char> <!-- replace + with the word plus -->
Myster
Thanks for the tip. I gave that a try on a local dev box and found a small side effect. The replacing only applies to nodes that are published after the config change. Any existing nodes with those characters need to be republished. It's a big, complicated site with lots of links in it, so I would have to add loads of URL rewrite rules :(
Neil Fenwick
You can 'republish entire site' (right-click the root node in the tree view). Of course, any bookmarked URLs will be wrong, but the internal links should be updated. You MAY have to do a "HARD republish" if that doesn't work.
Myster