Hi everyone,

I'm fishing for approaches to a problem with XSLT processing.

Is it possible to use parallel processing to speed up an XSLT processor? Or are XSLT processors inherently serial?

My hunch is that XML can be partitioned into chunks which could be processed by different threads, but since I'm not really finding any documentation of such a feat, I'm getting skeptical. Is it possible to use StAX to chunk XML concurrently?

It seems that most XSLT processors are implemented in Java or C/C++, but I really don't have a target language. I just want to know if a multi-threaded XSLT processor is conceivable.

What are your thoughts?

+2  A: 

As in most programming languages, looping is inherently parallelizable as long as you follow a couple of rules. This is known as data parallelism.

  • No mutation of shared state in the loop
  • One iteration of the loop cannot depend on the outcome of another iteration

Any looping construct in XSLT, such as `xsl:for-each`, could be parallelized fairly easily under these rules.
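To make the two rules concrete, here is a minimal Java sketch of a data-parallel loop. It is not XSLT and not how any particular processor works; the `DataParallelLoop` class and its `render` function are hypothetical names made up for illustration. The point is only that `render` reads its own argument and writes no shared state, so the iterations can run in any order on any thread.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

class DataParallelLoop {
    // Pure function (hypothetical): depends only on its argument,
    // mutates nothing shared, so iterations are independent.
    static String render(String item) {
        return "<li>" + item.toUpperCase(Locale.ROOT) + "</li>";
    }

    // Each element may be processed on a different thread. Collecting
    // from an ordered stream keeps the results in input order.
    static List<String> renderAll(List<String> items) {
        return items.parallelStream()
                    .map(DataParallelLoop::render)
                    .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(renderAll(List.of("a", "b", "c")));
    }
}
```

If `render` appended to a shared list or read the result of a previous iteration, either rule would be broken and the parallel version could no longer be trusted to match the serial one.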

With similar rules against mutation and dependencies, you could parallelize most of an XSLT transformation using a kind of task-based parallelism. You could break the document into pieces, segmented at XSLT command and text node boundaries, assign an index to each piece, and execute each transformation in parallel. Once all the transformations are done, you gather the results in index order and assemble them into the finished document.
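The split, transform, and gather idea above can be sketched with the standard JAXP API (`javax.xml.transform`). This is only an illustration under some loud assumptions: the document has already been cut into independent, well-formed fragments, the stylesheet in `XSL` is a hypothetical one that touches nothing outside its own fragment, and `ChunkedTransform` is a made-up name. It parallelizes around the processor rather than inside it. A compiled `Templates` object is safe to share between threads, while each task creates its own `Transformer`, which is not.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ChunkedTransform {
    // Hypothetical stylesheet: rewrites each <item> element as <li>.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='item'><li><xsl:value-of select='.'/></li></xsl:template>"
      + "</xsl:stylesheet>";

    // Transforms every chunk on a thread pool and returns the outputs
    // in the same order as the input chunks.
    static List<String> transformAll(List<String> chunks, int threads) throws Exception {
        // Compile once; the Templates object is shared by all tasks.
        Templates templates = TransformerFactory.newInstance()
            .newTemplates(new StreamSource(new StringReader(XSL)));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // The position in this list acts as the chunk's index.
            List<Future<String>> pending = new ArrayList<>();
            for (String chunk : chunks) {
                pending.add(pool.submit(() -> {
                    // One Transformer per task: Transformer is not thread-safe.
                    Transformer t = templates.newTransformer();
                    StringWriter out = new StringWriter();
                    t.transform(new StreamSource(new StringReader(chunk)),
                                new StreamResult(out));
                    return out.toString();
                }));
            }
            // Gather in index order, regardless of completion order.
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> chunks = List.of("<item>a</item>", "<item>b</item>", "<item>c</item>");
        System.out.println(String.join("", transformAll(chunks, 3)));
    }
}
```

The hard part this sketch skips is the splitting itself: a template that looks at ancestors, siblings, or document-wide keys breaks the independence assumption, so real chunking has to follow what the stylesheet actually reads.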

joshperry
`<xsl:param>` and `<xsl:variable>` are immutable. Is there anything else in XSLT that could be mutable? I can't think of anything offhand.
ndim
ah yes, true; my XSL is a bit rusty. So XSLT should be quite a good target for parallelism using document fragments and loop parallelism.
joshperry
+4  A: 

"Saxon: Anatomy of an XSLT Processor" is an excellent article about XSLT processors, and Saxon in particular. It covers multithreading.

Saxon, by the way, is available for both .NET and Java and is one of the best processors available.

Peter Lindqvist