I am looking for suggestions on how to validate an XML element that holds a date. I need a function in XSLT that checks whether the value is in YYYYMMDD format.
The EXSLT Dates and Times functions should be useful here, provided your processor implements them. I believe that the date:parse-date function returns an empty string on failure. It might not get you 100% of what you want, but it should solve some of the harder problems of date parsing (figuring out whether a day is valid for a particular month, dealing with leap-year rules, etc.).
The easiest way to do it is to use the extension functions of your XSLT processor.
It depends a bit on how far you want to push on "validation".
You could do this:
<xsl:if test="
string-length(date) = 8
and translate(date, '0123456789', '') = ''
">
<!-- date looks like it could be valid -->
</xsl:if>
You could also make a more thorough check:
<xsl:if test="
string-length(date) = 8
and translate(date, '0123456789', '') = ''
and number(substring(date, 1, 4)) >= 1970
and number(substring(date, 5, 2)) >= 1
and number(substring(date, 5, 2)) &lt;= 12
and number(substring(date, 7, 2)) >= 1
and number(substring(date, 7, 2)) &lt;= 31
">
<!-- date looks like it could really be valid -->
</xsl:if>
However, the latter would still allow 20090231. If you want to rule that out, invocation of an extension function of some sort might become inevitable.
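If no suitable extension function is available, the day-of-month check can also be sketched in plain XSLT 1.0. The following is only a minimal sketch: the template name is-valid-yyyymmdd, the parameter s, and the date element selected in the driver template are illustrative assumptions, and the year lower bound from the check above is left out.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical named template: writes 'true' or 'false' for a candidate YYYYMMDD string -->
  <xsl:template name="is-valid-yyyymmdd">
    <xsl:param name="s"/>
    <xsl:variable name="y" select="number(substring($s, 1, 4))"/>
    <xsl:variable name="m" select="number(substring($s, 5, 2))"/>
    <xsl:variable name="d" select="number(substring($s, 7, 2))"/>
    <!-- Gregorian leap-year rule -->
    <xsl:variable name="leap"
        select="($y mod 4 = 0 and $y mod 100 != 0) or $y mod 400 = 0"/>
    <!-- Number of days in month $m -->
    <xsl:variable name="days">
      <xsl:choose>
        <xsl:when test="$m = 2 and $leap">29</xsl:when>
        <xsl:when test="$m = 2">28</xsl:when>
        <xsl:when test="$m = 4 or $m = 6 or $m = 9 or $m = 11">30</xsl:when>
        <xsl:otherwise>31</xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <!-- Eight digits, month 1..12, day 1..days-in-month -->
    <xsl:value-of select="string-length($s) = 8
        and translate($s, '0123456789', '') = ''
        and $m >= 1 and $m &lt;= 12
        and $d >= 1 and $d &lt;= number($days)"/>
  </xsl:template>

  <!-- Driver: assumes the input document contains a date element -->
  <xsl:template match="/">
    <xsl:call-template name="is-valid-yyyymmdd">
      <xsl:with-param name="s" select="string(//date)"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

With this sketch, 20090231 gives false and 20080229 gives true.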
You say that you're using Saxon. If it is a recent version (8.x, 9.x), then it's an XSLT 2.0 processor, and as such it supports the xsl:analyze-string instruction for parsing strings using regular expressions, as well as XML Schema primitive datatypes, including xs:date. So you can use a regex to split the date into components, build an ISO 8601 date string from them, and then try to convert that to xs:date to validate the month and day (this should handle leap years etc. correctly):
<xsl:variable name="date-string" select="..."/>
...
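<!-- The regex attribute is an attribute value template, so literal curly braces must be doubled -->
<xsl:analyze-string select="$date-string" regex="^(\d{{4}})(\d{{2}})(\d{{2}})$">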
<xsl:matching-substring>
<!-- Keep the groups as strings: casting to xs:integer would drop leading zeros ("02" becomes 2), and "2009-2-3" is not a valid xs:date lexical form -->
<xsl:variable name="year" select="regex-group(1)"/>
<xsl:variable name="month" select="regex-group(2)"/>
<xsl:variable name="day" select="regex-group(3)"/>
<xsl:variable name="date-iso" select="concat($year, '-', $month, '-', $day)" />
<xsl:choose>
<xsl:when test="$date-iso castable as xs:date">
<xsl:variable name="date" select="$date-iso cast as xs:date" />
<!-- $date now contains an xs:date value, which you can work with using XPath 2.0 date functions -->
...
</xsl:when>
<xsl:otherwise>
<!-- $date-string was in YYYYMMDD format, but values for some of components were incorrect (e.g. February 31). -->
...
</xsl:otherwise>
</xsl:choose>
</xsl:matching-substring>
<xsl:non-matching-substring>
<!-- $date-string wasn't in YYYYMMDD format at all -->
...
</xsl:non-matching-substring>
</xsl:analyze-string>
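Since the question asks for a function, the same idea can be wrapped in an xsl:function. This is only a sketch: the my prefix, its namespace URI, and the function name is-yyyymmdd are made up for illustration, and the date element in the driver template is an assumption about the input.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:date-check"
    exclude-result-prefixes="xs my">
  <xsl:output method="text"/>

  <!-- Hypothetical function: true if $s is eight ASCII digits that form a real calendar date -->
  <xsl:function name="my:is-yyyymmdd" as="xs:boolean">
    <xsl:param name="s" as="xs:string"/>
    <!-- select is not an attribute value template, so the braces stay single here -->
    <xsl:sequence select="
        matches($s, '^[0-9]{8}$')
        and concat(substring($s, 1, 4), '-',
                   substring($s, 5, 2), '-',
                   substring($s, 7, 2)) castable as xs:date"/>
  </xsl:function>

  <!-- Driver: assumes the input document contains a date element -->
  <xsl:template match="/">
    <xsl:value-of select="my:is-yyyymmdd(string((//date)[1]))"/>
  </xsl:template>
</xsl:stylesheet>
```

The function can then be used in any test, for example test="my:is-yyyymmdd(string(date))". It uses [0-9] rather than \d because \d also matches non-ASCII digits in XPath regular expressions.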