I'm working with an ASP.NET application that produces large PDF documents from HTML. The content is probably more complex than typical usage (detailed grid-type listings, CSS-styled, running to 40+ pages). None of the libraries we've tried perform adequately. A 40-page document typically takes upwards of a minute to render on a powerful multi-core machine.

We are able to decouple the generation from the web application and also pre-generate documents in some cases. Still, the frequency with which content changes requires a faster solution.

So, does anyone have experience with a PDF generation component that can output a content-heavy 40-page document in seconds rather than minutes? Or are our expectations unrealistic?

NB: I'd rather not "out" the poorly performing components here as we are seeking support from vendors to make improvements. I've reviewed previous questions posted on StackOverflow and none appear to deal with this type or size of document.

+1  A: 

An option might be to not convert HTML to PDF and take another approach. We use the ActiveReports reporting tool, which generates PDF. It's pretty powerful when using sub-reports for multi-dataset reports, and it integrates completely with Visual Studio.

This means that you would need to rebuild the report to produce the same data that you see on-screen. This is sometimes not such a bad thing as you can style up the report specifically for printing.

PDFs can be generated via a back-end service and/or emailed or produced on the fly to the browser.

Mark Redman
I am sure there are other reporting tools out there and it's worth checking those out. Even SQL Server Reporting Services may be of help, as that can output PDF too.
Mark Redman
Thanks for the suggestions. Have you tried either option with output of the sort and size we need, i.e. styled, text-heavy, nested tables/grids running to 40+ pages?
Mark Storey-Smith
Banded reporting tools are built for tabular data, and sub-reports are just reports within reports. Given a data source or two, 40+ pages wouldn't be an issue at all, although, as with any data retrieval, optimisation is required. Rules for grouping, page numbering, page sizing and report/group/page headers and footers are available. You can add images and various other components to the report, with styling. It's definitely better than HTML to PDF conversion.
Mark Redman
We've already profiled the process and have isolated the final conversion as being the only "unsolvable" problem. The figures I'm quoting are for the final call to the PDF component, excluding the data retrieval and HTML generation. +1 for the suggestions and mentioning sub-reports... made me realise that in some cases, the larger PDFs are a combination of smaller PDFs. That gives us the option of pre-generating the sub-reports and merging PDFs to create the larger composite docs.
Mark Storey-Smith
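The pre-generate-and-merge idea in the comment above can be sketched roughly as follows. This is only an illustration of the caching pattern, not the poster's implementation: `SubReportCache`, `build_composite`, `fake_render` and `fake_merge` are hypothetical names, and the render and merge steps are stand-ins for whatever PDF component and merge library are actually in use.

```python
import hashlib
from typing import Callable, Dict, List


class SubReportCache:
    """Caches rendered sub-report bytes, keyed by a hash of the source content."""

    def __init__(self, render: Callable[[str], bytes]) -> None:
        self._render = render            # the slow rendering step (assumed)
        self._cache: Dict[str, bytes] = {}
        self.render_calls = 0            # counts how often the slow step ran

    def get(self, content: str) -> bytes:
        key = hashlib.sha256(content.encode("utf-8")).hexdigest()
        if key not in self._cache:
            # Only content that has not been seen before is rendered.
            self.render_calls += 1
            self._cache[key] = self._render(content)
        return self._cache[key]


def build_composite(
    sections: List[str],
    cache: SubReportCache,
    merge: Callable[[List[bytes]], bytes],
) -> bytes:
    """Fetch each sub-report from the cache and merge them in order."""
    return merge([cache.get(section) for section in sections])


def fake_render(content: str) -> bytes:
    # Stand-in for the real PDF component; it does not produce a real PDF.
    return ("<" + content + ">").encode("utf-8")


def fake_merge(parts: List[bytes]) -> bytes:
    # Stand-in for a real PDF merge; simple concatenation for illustration.
    return b"".join(parts)
```

With this shape, changing one section of a composite document costs one slow render rather than a full re-render; the unchanged sections come straight from the cache.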
Yes, generating separate reports and merging them is also a good way of building complex reports (no reporting tool is without its foibles), and there are many libraries and tools that can do the merging. Libraries like iTextSharp are great for what I call single-page composition, that is, dropping lines of text and images onto new or existing PDF pages, but not that good for tabular data or data that runs across multiple pages.
Mark Redman
Will revisit this to pad out our findings, but basically your answer is correct: we need to switch to native PDF generation, not conversion. SSRS, for example, produces ~50 pages/sec compared to the 1 page/sec we were getting with conversion.
Mark Storey-Smith
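To put the two rates quoted in the thread side by side for the 40-page document in the question, here is a quick back-of-the-envelope calculation. The rates are the approximate figures reported above (about 1 page/sec for conversion, about 50 pages/sec for native generation), and `render_seconds` is just a hypothetical helper name.

```python
def render_seconds(pages: int, pages_per_second: float) -> float:
    """Estimated render time for a document at a given throughput."""
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return pages / pages_per_second


# Approximate figures from the thread, applied to a 40-page document.
conversion_time = render_seconds(40, 1.0)   # HTML-to-PDF conversion
native_time = render_seconds(40, 50.0)      # native PDF generation

print(f"conversion: {conversion_time:.1f} s, native: {native_time:.1f} s")
```

At those rates the same document goes from roughly 40 seconds to under a second, which is the "seconds rather than minutes" target the question asks about.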