answers:

4

I have a large database of XHTML that I wish to render as PDF and/or RTF, styled with CSS. Is there an off-the-shelf or cheap solution that can do this at scale? By large I mean terabytes, so I need something robust that handles very large numbers of files well.

+1  A: 

Would Prince XML be what you're looking for?

Prince is a computer program that converts XML and HTML into PDF documents. Prince can read many XML formats, including XHTML and SVG. Prince formats documents according to style sheets written in CSS.

Phoexo
+1  A: 

The usual technology for this is XSL: an XSLT stylesheet transforms the source into XSL-FO, which a formatter then renders to PDF. The source is normally XML, but well-formed XHTML is XML, so it qualifies. There are different ways to run this (for example, on a Cocoon server).

If you are looking for cheap or free software, you can search here: http://www.w3.org/Style/XSL/

Alex Lawrence
+1  A: 

Given that XHTML is an application of XML, I'd recommend Apache FOP. It is one of the best PDF conversion tools I have used so far. Note that FOP takes XSL-FO as input, so the XHTML has to be passed through an XSLT stylesheet first.
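
As a rough illustration of what a single FOP run could look like, here is a minimal Python sketch that wraps FOP's command-line launcher. It assumes the `fop` script is on the PATH and that you already have an XSLT stylesheet that turns XHTML into XSL-FO; the file name `xhtml2fo.xsl` and the function names are hypothetical. FOP does not read CSS, so any CSS styling would have to be re-expressed in that stylesheet.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def fop_command(xhtml: Path, stylesheet: Path, pdf: Path) -> list[str]:
    """Build the argument list for one FOP run.

    -xml names the source document, -xsl the XSLT stylesheet that
    produces XSL-FO, and -pdf the output file.
    """
    return ["fop", "-xml", str(xhtml), "-xsl", str(stylesheet), "-pdf", str(pdf)]


def render(xhtml: Path, stylesheet: Path, pdf: Path) -> bool:
    """Run FOP once; return True if it exited with status 0."""
    result = subprocess.run(fop_command(xhtml, stylesheet, pdf), capture_output=True)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    ok = render(Path("doc.xhtml"), Path("xhtml2fo.xsl"), Path("doc.pdf"))
    print("converted" if ok else "failed")
```

Starting a JVM per document gets expensive at terabyte scale, so for real volumes you would more likely embed FOP in a long-running Java process; the sketch only shows the shape of one conversion.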

+1  A: 

This is a difficult problem at the scale you're talking about. I suggest looking at http://code.google.com/p/wkhtmltopdf/ for ideas on how to do an individual conversion. However, exec'ing a shell script each time you want to convert a document is probably inadequate for your needs, so my suggestion is to build this into some sort of daemon or mass-conversion utility.
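
To make the mass-conversion idea a bit more concrete, here is a minimal sketch of a batch driver in Python. It is only one possible shape, under these assumptions: the sources are `*.xhtml` files in a directory tree, `wkhtmltopdf` is on the PATH, and the function names, the 300-second timeout and the worker count are all made up for illustration. It still starts one converter process per document; what it adds is a worker pool, the ability to resume after an interruption, and a list of failures to retry.

```python
from __future__ import annotations

import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src: Path, dst: Path) -> list[str]:
    # Basic wkhtmltopdf invocation: input document, then output PDF.
    return ["wkhtmltopdf", "--quiet", str(src), str(dst)]


def convert_one(src: Path, in_dir: Path, out_dir: Path, run=subprocess.run):
    """Convert one file; return (source path, success flag)."""
    # Mirror the input tree so files with the same name do not collide.
    dst = out_dir / src.relative_to(in_dir).with_suffix(".pdf")
    if dst.exists():
        return src, True  # already converted in an earlier run
    dst.parent.mkdir(parents=True, exist_ok=True)
    try:
        result = run(build_command(src, dst), capture_output=True, timeout=300)
        return src, result.returncode == 0
    except (OSError, subprocess.TimeoutExpired):
        return src, False


def convert_tree(in_dir: Path, out_dir: Path, workers: int = 4, run=subprocess.run):
    """Convert every *.xhtml under in_dir; return the sources that failed."""
    sources = sorted(in_dir.rglob("*.xhtml"))
    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda s: convert_one(s, in_dir, out_dir, run), sources))
    return [src for src, ok in results if not ok]


if __name__ == "__main__":
    failed = convert_tree(Path("xhtml"), Path("pdf"), workers=8)
    print(f"{len(failed)} document(s) failed")
```

The `run` parameter exists only so the driver can be exercised without the converter installed. For terabytes of input you would probably shard the tree across machines and run one such driver per shard.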

McPherrinM