ansaurus

Question

Convert style-laden HTML tables to PDF, in .NET 1.1

Answer 1

A:

I don't have any solid answers, but I'll give you two directions to explore, both of which I have used before.

1 - Use something like HtmlAgilityPack to cleanse your HTML: you can traverse the DOM and remove styles and classes, though that will likely disturb the layout to some degree. It is not clear to me whether you need to retain this styling. You could then use iTextSharp, or an alternative program like HtmlDoc (which also does not support CSS), to render to PDF. We wrote a simple wrapper with a method that takes a URL and then calls HtmlDoc to generate the PDF.

2 - Render the HTML server-side using a WebBrowser control, generate an image from that, then convert the image to PDF using PDFsharp or the library of your choice. Because the page ends up as an image, you will not be able to search or copy text in the resulting PDFs. There is some pretty good sample code here for converting the rendered page to an image (note: you can get full-height images, not just what you can see without scrolling).
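The attribute-stripping step from option 1 can be sketched roughly as follows. This is an illustration only: it uses Python's standard-library `html.parser` rather than HtmlAgilityPack, and the names `StyleStripper`, `strip_styles` and `STRIPPED_ATTRS` are made up for this example. It removes only `style` and `class` attributes and leaves everything else (including any `<style>` elements) alone.

```python
from html import escape
from html.parser import HTMLParser

# Attributes treated as "styling" in this sketch (an assumption; adjust as needed).
STRIPPED_ATTRS = {"style", "class"}


class StyleStripper(HTMLParser):
    """Re-emits the parsed HTML, dropping style/class attributes."""

    def __init__(self):
        # Keep entity references as-is instead of decoding them in text.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _open_tag(self, tag, attrs, closer=">"):
        kept = [(k, v) for k, v in attrs if k not in STRIPPED_ATTRS]
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in kept
        )
        return f"<{tag}{rendered}{closer}"

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._open_tag(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._open_tag(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def strip_styles(html):
    """Return html with style and class attributes removed from every tag."""
    parser = StyleStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)
```

Calling `strip_styles` on a table fragment keeps structural attributes such as `border` but drops the styling, which is exactly why the layout can shift afterwards.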

Edit: I don't think the WebBrowser control is available in .NET 1.1.

RedFilter 2008-12-17 16:43:48

Yes, I have been trying out the .NET 1.1 edition of HtmlAgilityPack as well, but it has some bugs that remove sections of paragraph content, which I will need to debug another day.

icelava 2008-12-17 16:50:04

Yes, the styling has to be retained for the HTML tags - those are the ones keeping the tables aligned properly in the first place. So removing them is somewhat the same as the current situation, where they are being ignored.

icelava 2008-12-17 16:51:13

Well, if you can live with images in your PDF, I suggest option 2.

RedFilter 2008-12-17 18:09:44

Answer 2

+1 A:

I have found that .NET 2.0-based components like ExpertPDF and ABCpdf do a fairly good job of interpreting the CSS styles and aligning the tables properly in the PDF. Right now I am suggesting to my colleagues that we use a separate .NET 2.0 web service that can host such components. The ASP.NET 1.1 web application would tell that service to go ahead and scrape a generated web page that is essentially the report in HTML view.
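The hand-off between the two applications could look something like the sketch below, shown in Python for brevity. The endpoint, the `source` parameter name and the function `build_conversion_request` are all hypothetical. The only point is that the 1.1 application passes the URL of the HTML report to the separate service, which then fetches and converts it.

```python
from urllib.parse import urlencode


def build_conversion_request(service_url, report_url):
    """Build the URL the legacy app would call to request a PDF.

    service_url: hypothetical endpoint of the separate conversion service.
    report_url:  address of the HTML report page the service should scrape.
    """
    # URL-encode the report address so its own query string survives intact.
    return f"{service_url}?{urlencode({'source': report_url})}"
```

Encoding the report URL matters here because report pages usually carry their own query string (report id, date range and so on), which would otherwise be confused with the service's parameters.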

UPDATE:

I am marking this as the answer, as it is the approach we recommended to the application team.

icelava 2008-12-18 04:00:34
