Hello there,
we currently generate all our official documents via XSL-FO transformation, taking .xml files as input and producing .pdf files. Basically all the content within these .xml files is either plain text or XHTML. This works perfectly fine for everyday use cases, but some of our users reference Microsoft Excel files, which our XSL-FO processor (Antenna House) cannot handle natively (and AFAIK no other one really does either).
So, as an intermediate, short-term solution, we create images from the print areas defined by the users and embed these images in the .pdf files.
However, since the content of these images is obviously not searchable, we were looking into OCR'ing the .pdf files as a post-processing step, but to my mind this all goes too deep down the workaround hole.
I had the idea of converting these .xls files to SpreadsheetML and handling that with our XSL-FO stylesheet, but after looking at the SpreadsheetML specs I kind of gave up that hope, too, at least without throwing several dozen man-months at the implementation.
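To illustrate the kind of output such a conversion would have to produce, here is a rough Python sketch. It is not something we have built. It assumes the cell values of a print area have already been extracted into plain rows, and it ignores everything that makes the real problem hard: cell formatting, merged cells, formulas, and column widths. The function name rows_to_fo_table is made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of XSL-FO formatting objects.
FO_NS = "http://www.w3.org/1999/XSL/Format"


def rows_to_fo_table(rows):
    """Build an fo:table fragment from a non-empty list of rows of cell values.

    Hypothetical helper: `rows` is assumed to be a list of lists holding
    already-extracted cell values (strings, numbers, or None for empty cells).
    """
    ET.register_namespace("fo", FO_NS)
    table = ET.Element(f"{{{FO_NS}}}table")
    body = ET.SubElement(table, f"{{{FO_NS}}}table-body")
    for row in rows:
        fo_row = ET.SubElement(body, f"{{{FO_NS}}}table-row")
        for value in row:
            # Each cell wraps its text in an fo:block, as XSL-FO requires.
            cell = ET.SubElement(fo_row, f"{{{FO_NS}}}table-cell")
            block = ET.SubElement(cell, f"{{{FO_NS}}}block")
            block.text = "" if value is None else str(value)
    # ElementTree escapes special characters such as "<" in the cell text.
    return ET.tostring(table, encoding="unicode")


if __name__ == "__main__":
    print(rows_to_fo_table([["Item", "Qty"], ["Widget", 3]]))
```

Because the result is real text inside fo:block elements rather than an image, it would stay searchable in the generated .pdf.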
So, to come to my actual question: how would you, or do you, handle Microsoft Excel files within your XSL-FO-driven document generation?
Cheers & thanks, -J