Hi, I have a small problem: I have some XML with a CDATA section. This CDATA section contains fragments of HTML. I would like to extract some of the data inside this CDATA element. Right now I have an XSLT transformation that outputs the rest of the document as HTML, but I need only a small part of the CDATA HTML, not all of it, e.g. the title tag. How can I do this?

+1  A: 

XSLT won't read the CDATA section as anything other than text. You'll need to pre-parse your data before handing it over to XSLT. You could use a pre-parsing script (written in Python, PHP, Perl, VB, whatever) and then do one of the following (the list is not exhaustive):

  • remove the CDATA tags and allow XSLT to handle the undesired content
  • move the <title> tag to an XSLT-accessible place outside the CDATA tags
  • get the desired value out of the CDATA section, maybe using Beautiful Soup in Python (or a Cthulhu-inducing regex), and pass that value as a parameter to XSLT
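
As a rough sketch of the third option, something like the following could work. It uses only the Python standard library (`html.parser` standing in for Beautiful Soup), and the sample document, the `body` element name, and the `extract_title` helper are all made up for illustration; adjust them to your actual XML.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample input; the element names are assumptions, not from the question.
SAMPLE_XML = """<item>
  <body><![CDATA[<html><head><title>My Title</title></head>
  <body><p>Other content</p></body></html>]]></body>
</item>"""


class TitleExtractor(HTMLParser):
    """Collects the text inside the first <title> element it sees."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        return "".join(self._parts).strip()


def extract_title(xml_text, cdata_path="body"):
    """Return the <title> text from the HTML held in the CDATA section.

    cdata_path is the (assumed) path to the element containing the CDATA.
    """
    root = ET.fromstring(xml_text)
    node = root.find(cdata_path)
    if node is None or node.text is None:
        return ""
    # The XML parser exposes the CDATA content as plain text, so it has
    # to be parsed a second time, this time as HTML.
    parser = TitleExtractor()
    parser.feed(node.text)
    parser.close()
    return parser.title


if __name__ == "__main__":
    print(extract_title(SAMPLE_XML))
```

The returned string can then be handed to whichever XSLT processor you use as a stylesheet parameter, so the stylesheet itself never has to look inside the CDATA.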
Jweede
@Jweede: Srsly. The regex did not deserve mentioning.
Tomalak