Given a PDF document, is it possible to generate an XSL-FO (FOP) template from it?

Obviously, this would be a one-time thing - the generated template would just be a starting point for creating a proper template that pulls in the appropriate data.

For me, the ideal tool would be Java-based and executable from the command line or through an Ant task. Failing that, it should at least run on Linux and Mac OS X.

+1  A: 

I know of no such tool. A PDF without document structure information (Tagged PDF) is much like a scanned page: it carries no semantics, only positioned text and graphics. You can't even reliably guess where a paragraph begins or ends. If you have Tagged PDF, you can probably get somewhat further, depending on the level of detail in the document structure, but I'm pretty sure you'd never get a satisfying result that way. IMO you'd be much faster learning XSLT and recreating the document template (i.e. the stylesheet) by hand. That gets you readable code, better semantics, and better opportunities for factoring out common elements between similar document types.
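To give a rough idea of what "by hand" means here, below is a minimal sketch of a hand-written stylesheet that turns XML data into XSL-FO, applied with the JDK's built-in `javax.xml.transform` API. The `<letter>`/`<para>` input format, the class name `FoTemplateSketch` and the page geometry are assumptions made up for illustration; a real template would mirror the layout of your PDF and your own data.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class FoTemplateSketch {

    // Hypothetical stylesheet: maps an invented <letter> input format to XSL-FO.
    // The page master (A4, 20mm margins) is an assumption, not taken from any PDF.
    static final String STYLESHEET = String.join("\n",
        "<xsl:stylesheet version=\"1.0\"",
        "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"",
        "    xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">",
        "  <xsl:template match=\"/letter\">",
        "    <fo:root>",
        "      <fo:layout-master-set>",
        "        <fo:simple-page-master master-name=\"A4\"",
        "            page-width=\"210mm\" page-height=\"297mm\" margin=\"20mm\">",
        "          <fo:region-body/>",
        "        </fo:simple-page-master>",
        "      </fo:layout-master-set>",
        "      <fo:page-sequence master-reference=\"A4\">",
        "        <fo:flow flow-name=\"xsl-region-body\">",
        "          <xsl:apply-templates select=\"para\"/>",
        "        </fo:flow>",
        "      </fo:page-sequence>",
        "    </fo:root>",
        "  </xsl:template>",
        "  <!-- One shared rule for every paragraph: this is the kind of -->",
        "  <!-- factoring a generated template would not give you. -->",
        "  <xsl:template match=\"para\">",
        "    <fo:block space-after=\"6pt\"><xsl:value-of select=\".\"/></fo:block>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Applies the stylesheet to the given XML and returns the XSL-FO as a string.
    static String toFo(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<letter><para>Dear customer,</para>"
                   + "<para>Thank you for your order.</para></letter>";
        System.out.println(toFo(xml));
    }
}
```

This only produces the FO document; rendering is a separate step. If you save the stylesheet and the data to files, FOP's command line should be able to do both in one go, along the lines of `fop -xml letter.xml -xsl letter.xsl -pdf letter.pdf` (file names are placeholders).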

Jeremias Märki