I’m processing an XML file that, simplified, looks something like this:
<resources>
  <resource id="a">
    <dependency idref="b"/>
    <!-- some other stuff -->
  </resource>
  <resource id="b">
    <!-- some other stuff -->
  </resource>
</resources>
The XSLT stylesheet must process one particular resource that we’re interested in, which I will call the root resource, and all of its recursive dependencies. Dependencies are other resources, uniquely identified by their id attribute.
It doesn’t matter if a resource is processed twice, although it’s preferable to process each required resource only once. It also doesn’t matter what order the resources are processed in.
It’s important that only the root resource and its recursive dependencies are processed. We can’t just process all the resources and be done with it.
A naïve implementation is as follows:
<xsl:key name="resource-id" match="resource" use="@id"/>

<xsl:template match="resource">
  <!-- do whatever is required to process the resource. -->
  <!-- then handle any dependencies -->
  <xsl:apply-templates select="key('resource-id', dependency/@idref)"/>
</xsl:template>
This implementation works fine for the example above, as well as in many real-world cases. It does have the disadvantage that it often processes the same resource more than once, but as stated above that’s not hugely important.
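For context, this is roughly how the traversal gets started from the root resource rather than from every resource. It is only a minimal sketch: the `root-id` parameter, its default value, and the placeholder processing (which just prints the id) are made up for illustration, and the key and the dependency handling are the same as above.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: the id of the root resource, supplied by the caller. -->
  <xsl:param name="root-id" select="'a'"/>

  <xsl:key name="resource-id" match="resource" use="@id"/>

  <!-- Start from the root resource only, not from every resource in the file. -->
  <xsl:template match="/">
    <xsl:apply-templates select="key('resource-id', $root-id)"/>
  </xsl:template>

  <xsl:template match="resource">
    <!-- Placeholder for the real processing: emit the resource id on its own line. -->
    <xsl:value-of select="@id"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Naive step: follow every dependency, with no check for repeats or cycles. -->
    <xsl:apply-templates select="key('resource-id', dependency/@idref)"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against the first example with `root-id` set to a, this visits a and then b and stops, because b has no dependencies.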
The problem is that sometimes resources have cyclic dependencies:
<resources>
  <resource id="a">
    <dependency idref="b"/>
    <dependency idref="d"/>
  </resource>
  <resource id="b">
    <dependency idref="c"/>
  </resource>
  <resource id="c">
    <dependency idref="a"/>
  </resource>
  <resource id="d"/>
</resources>
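If you use the naïve implementation to process this example, and you start by processing a, b or c, you get infinite recursion: a depends on b, b depends on c, and c depends on a again, so the template keeps re-applying itself around that loop.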
Unfortunately, I can’t control the input data, and in any case cyclic dependencies are perfectly valid and allowed by the relevant specification.
I’ve come up with various partial solutions, but nothing that works in all cases.
The ideal solution would be a general approach to preventing a node from being processed more than once, but I don’t think that’s possible. In fact, I suspect this whole problem is impossible to solve.
If it helps, I have most of EXSLT available (including functions). If necessary I can also pre-process the input with any number of other XSLT scripts, although it’s preferable not to do excessive pre-processing of resources that won’t end up in the output.
What I can’t do is switch to processing this with another language (at least not without substantial re-engineering). I also can’t use XSLT 2.0.
Any ideas?