I've got a system that allows the user the option of providing their own XSLT to apply to some data that's been retrieved, as a means of specifying how that data should be presented. However, if the user includes code in the XSLT equivalent to:

<xsl:template match="/">
  <xsl:element name="data">
    <xsl:apply-templates select="." />
  </xsl:element>
</xsl:template>

this causes .NET to infinitely recurse trying to process it, and produces a stack overflow error. I need to be able to trap this before it crashes the app, as the data that's been retrieved is occasionally quite time-consuming to obtain, and the data is lost when this happens.

Is there any way of doing this? I know it's theoretically possible to identify any occurrences of xsl:apply-templates with "." in the select attribute, but that isn't the only way an infinite recursion could happen; I need a way of trapping it generically.

+4  A: 

You're talking about the halting problem. It is undecidable: it is impossible to create an algorithm that can determine whether or not any program (or, in your case, any XSLT script) will halt or continue processing indefinitely for its possible inputs.

Your best bet is to impose time and/or memory limits on the execution of user-provided XSLT scripts. That's how browsers typically handle out-of-control JavaScript, for example.

The drawback is that you'll occasionally reject legitimate scripts that simply need more time and/or memory, so you'll have to tune your limits carefully.
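
A minimal sketch of the time-limit approach, written in Python for brevity rather than .NET. It assumes the transform can be launched as a separate process; `run_with_limits` is an illustrative name, and the `GOOD`, `CRASHING`, and `RUNAWAY` commands are hypothetical stand-ins for a real XSLT processor invocation. The process boundary matters on .NET in particular: from .NET Framework 2.0 onward a `StackOverflowException` cannot be caught with try/catch and terminates the process, so the already-retrieved data is only safe if the transform runs somewhere else.

```python
import subprocess
import sys


def run_with_limits(argv, timeout_s):
    """Run argv in a child process and return (ok, output_or_reason).

    A hang is cut off after timeout_s seconds; a crash (for example a
    stack overflow in the child) shows up as a non-zero exit code.
    Either way the parent process and its data survive.
    """
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return False, "timed out"
    if proc.returncode != 0:
        return False, f"child exited with code {proc.returncode}"
    return True, proc.stdout


# Hypothetical stand-ins for "run the user's XSLT over the data".
GOOD = [sys.executable, "-c", "print('<data/>')"]
CRASHING = [sys.executable, "-c", "import sys; sys.exit(3)"]
RUNAWAY = [sys.executable, "-c", "while True: pass"]
```

Calling `run_with_limits(RUNAWAY, 1)` gives up after about a second, while `run_with_limits(GOOD, 10)` returns the child's output. A memory cap would be added at the same point, but how to set one is platform-specific, so it is left out here.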

Welbog
I don't think the halting problem's an issue here; it's a theoretical result about the general case, whereas most real cases can only recurse in specific circumstances (template calls, in this case). At least that's all I can think of offhand. Timing isn't an issue: it usually hits a stack overflow exception within a few seconds. I just need a way to trap the stack overflowing; I was hoping there's a way of limiting the stack size and catching the overflow exception.
Flynn1179
@Flynn1179: Halting problem aside, you can always spawn a new thread to deal with the XSLT and trap any exceptions it generates safely without causing harm to your main thread. My point is that you can't do this detection in advance because of the nature of the halting problem, but there are many ways to deal with run-away processes.
Welbog
Could do. Firefox seems to handle it OK though, which is why I thought there was a known way of doing it. I tried it here: http://www.flynn1179.net/xml/ and Firefox almost instantly threw back `Component returned failure code: 0x80600006`, which is the `NS_ERROR_XSLT_BAD_RECURSION` error.
Flynn1179
@Flynn1179: They're likely doing simple common case detection, like what Dimitre suggests, on top of the more robust wait-and-see solution.
Welbog
I did wonder that, but if they did, it's very clever; it even spotted a chain of three templates calling each other in an infinite recursion. A 'wait and see' solution would probably do for now, but I'm still hoping there's a robust way of identifying where recursion can occur, as I'd like to give meaningful feedback on exactly why the XSLT provided doesn't work, rather than simply 'it's infinitely recursing somewhere', or words to that effect.
Flynn1179
@Flynn: Here's a quick test: write a recursive function that checks the time on each call and stops once ten minutes have passed since it started; otherwise it calls itself again. That is a program that will halt, but it will probably be trapped by Firefox's detection anyway.
Welbog
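
Welbog's quick test can be sketched as follows, in Python for brevity. The depth cap `max_depth` and the names `run_guarded` and `DepthLimitExceeded` are assumptions made for illustration; whether Firefox's check is really a depth counter is not confirmed anywhere in this thread. The recursion below is written to stop by itself once `deadline_s` seconds have passed, yet a depth guard rejects it long before that.

```python
import time


class DepthLimitExceeded(Exception):
    """Raised when the recursion goes deeper than the guard allows."""


def run_guarded(max_depth, deadline_s):
    """Recurse until deadline_s seconds have elapsed, under a depth cap.

    Returns the depth at which the recursion stopped on its own, or
    raises DepthLimitExceeded if the (hypothetical) guard fires first.
    """
    start = time.monotonic()

    def step(depth):
        if depth > max_depth:
            # The guard cannot tell "deep but finite" from "infinite".
            raise DepthLimitExceeded(depth)
        if time.monotonic() - start >= deadline_s:
            return depth  # the program halts by itself here
        return step(depth + 1)

    return step(1)
```

With `run_guarded(max_depth=100, deadline_s=600)` the guard fires almost immediately even though the function is designed to terminate, which is the false positive Welbog describes. With a deadline of `0.0` the very first call returns normally.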
+2  A: 

We "solved" this problem by creating a separate web service that handles XSLT transformations (we used Saxon for the transformations because it also supports XSLT 2). If an XSLT script crashes, it only does harm inside the XSLT transformer service (and if the problem is really severe, the service will be restarted by IIS/mod_mono anyway). And because this service runs in a separate, controlled environment, this also solved some of the security risks that might arise with user-made XSLTs.

SztupY
A: 

As @welbog correctly notes, the halting problem is undecidable!

One practical measure is to give each transformation a "time-to-live" duration and to consider any transformation that exceeds this duration as "looping excessively". Then take an action (e.g. kill it).

This shouldn't be difficult to implement on a centralized hosting platform.

Dimitre Novatchev