You could write a recursive template to do this, working through the characters in the string one by one, testing them and changing them if necessary. Something like:
<xsl:template name="normalizeName">
  <xsl:param name="name" />
  <xsl:param name="isFirst" select="true()" />
  <xsl:if test="$name != ''">
    <xsl:variable name="first" select="substring($name, 1, 1)" />
    <xsl:variable name="rest" select="substring($name, 2)" />
    <xsl:choose>
      <!-- digits, '.' and '-' are only legal after the first character -->
      <xsl:when test="contains('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ:_', $first) or
                      (not($isFirst) and contains('0123456789.-', $first))">
        <xsl:value-of select="$first" />
      </xsl:when>
      <xsl:otherwise>
        <xsl:text>_</xsl:text>
      </xsl:otherwise>
    </xsl:choose>
    <!-- recurse on the remainder, which is no longer at the start of the name -->
    <xsl:call-template name="normalizeName">
      <xsl:with-param name="name" select="$rest" />
      <xsl:with-param name="isFirst" select="false()" />
    </xsl:call-template>
  </xsl:if>
</xsl:template>
However, there is a shorter way of doing this if you're prepared for some hackery. First declare some variables:
<xsl:variable name="underscores"
              select="'_______________________________________________________'" />
<xsl:variable name="initialNameChars"
              select="'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ:_'" />
<xsl:variable name="nameChars"
              select="concat($initialNameChars, '0123456789.-')" />
Now the technique is to take the name and identify the characters in it that aren't legal by deleting every character that is legal. You can do this with the translate() function, passing an empty string as its third argument. Once you've got the set of illegal characters that appear in the string, you can replace them with underscores using the translate() function again. The first character is handled separately from the rest, because digits, '.' and '-' aren't allowed at the start of a name. Here's the template:
<xsl:template name="normalizeName">
  <xsl:param name="name" />
  <xsl:variable name="first" select="substring($name, 1, 1)" />
  <xsl:variable name="rest" select="substring($name, 2)" />
  <xsl:variable name="illegalFirst"
                select="translate($first, $initialNameChars, '')" />
  <xsl:variable name="illegalRest"
                select="translate($rest, $nameChars, '')" />
  <xsl:value-of select="concat(translate($first, $illegalFirst, $underscores),
                               translate($rest, $illegalRest, $underscores))" />
</xsl:template>
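As a rough illustration, the template might be called like this from inside another template. The variable name safeName and the sample value '1st item!' are made up for this example, and it assumes the three variables above are declared at the top level of the same stylesheet:

```xml
<!-- Hypothetical call; 'safeName' and '1st item!' are invented for illustration. -->
<xsl:variable name="safeName">
  <xsl:call-template name="normalizeName">
    <xsl:with-param name="name" select="'1st item!'" />
  </xsl:call-template>
</xsl:variable>
<!-- Inside the template:
       first = '1'          illegalFirst = '1'
       rest  = 'st item!'   illegalRest  = ' !'
     so $safeName holds '_st_item_' -->
```

The leading digit is replaced because it is legal only after the first character, while the space and the exclamation mark are illegal anywhere in a name.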
The only thing you have to watch out for is that the string of underscores needs to be long enough to cover all the illegal characters that might appear within a single name. If it's too short, translate() deletes the illegal characters it has no underscore for, rather than replacing them. Making it the same length as the longest name you're likely to encounter will do the trick (though you could probably get away with it being a lot shorter).
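To see what happens when the replacement string runs out, here is a small sketch with a deliberately short third argument. The input string is invented for illustration:

```xml
<!-- Illustration only: two illegal characters, but just one underscore. -->
<xsl:value-of select="translate('a!b@c', '!@', '_')" />
<!-- gives 'a_bc': '!' maps to '_', but '@' has no counterpart and is removed -->
```

Bear in mind that $illegalRest keeps repeated characters, so a name like 'a!!@' yields '!!@' and the '@' sits in third position. It's the count of illegal characters in the name that matters, not the number of different ones.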