views:

3841

answers:

8

I have a report written in MS Word .doc format, 40+ pages long. Based on user input, I have to change some lines and output it as a PDF. I am on a Windows hosting server, which means ASP.NET and C#. Therefore I cannot use the Word application or printer drivers.

Basically I have two ideas for how to get the PDF:

  1. Save the .doc as .xml before it goes on the server, so I can use it as a template. I can then load it and modify it according to the user's input, but I still need to figure out a way to get from a Word XML file to PDF.

  2. Save it as an HTML document and use that as the template. I can then load and modify it, and try to convert it to PDF with iTextSharp. Although it's a rather simple report with a few images and a few pages of text, iTextSharp has a lot of problems with it: it just produces a 0-byte PDF file, even after I cleaned up and simplified the whole HTML code.

Since the second solution is a little tricky because of the footers and the page numbers that this .doc report has, I guess I would need to go back to the first one.
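
The modification step of the first idea (a Word XML template that is changed after the user's input) could be sketched roughly as below. This is only a minimal sketch under assumptions that are not in the original post: the class name `XmlTemplate`, the `{{Name}}`-style placeholder token, and the premise that each placeholder sits inside a single Word text element (Word sometimes splits text across several runs, which this does not handle). It uses only `System.Xml.Linq` and does not address the XML-to-PDF conversion.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

static class XmlTemplate
{
    // Replaces a placeholder token inside every text element (local name "t")
    // of a Word XML document and returns the modified XML as a string.
    // Matching on the local name avoids hard-coding a namespace URI.
    public static string Fill(string templateXml, string placeholder, string value)
    {
        XDocument doc = XDocument.Parse(templateXml, LoadOptions.PreserveWhitespace);

        // ToList() takes a snapshot so the tree can be edited while looping.
        foreach (XElement t in doc.Descendants()
                                  .Where(e => e.Name.LocalName == "t")
                                  .ToList())
        {
            if (t.Value.Contains(placeholder))
                t.Value = t.Value.Replace(placeholder, value);
        }

        // Note: XDocument.ToString() does not emit the XML declaration.
        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

A call might look like `XmlTemplate.Fill(File.ReadAllText(templatePath), "{{Name}}", userInput)`, where `templatePath` and `userInput` are hypothetical names.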

I am aware of the related posts like: http://stackoverflow.com/questions/159744/converting-ms-word-documents-to-pdf-in-asp-net or http://stackoverflow.com/questions/85404/generating-a-pdf-document-based-on-a-microsoft-word-template but they don't seem to provide much help.

Does anyone have any ideas besides commercial products (of which I have tried Winnovative's library, which seemed to work OK)?

+1  A: 

Perhaps you might have some luck doing something like the following:

  1. If you're using Office 2007, you might be able to use the OpenXML C# library to do the document manipulations.

  2. Once the document manipulations are complete, you could write a small console app that interfaces with ABCpdf and just handles the conversion of the documents to PDF. You can also get a free version of ABCpdf if you link back to their page. You can easily install the console app on the server and call it from your web application.

I've used this method before with a good deal of success.
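
For step 1, a minimal sketch of what the manipulation might look like with the Open XML SDK (the `DocumentFormat.OpenXml` package). It assumes the report has first been saved as .docx (the SDK does not read the binary .doc format) and that each placeholder sits inside a single text run; the class name `ReportTemplate` and the placeholder convention are hypothetical. The ABCpdf conversion in step 2 is not shown.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

static class ReportTemplate
{
    // Opens a .docx for editing and replaces a placeholder token
    // in every text run of the main document body.
    public static void FillPlaceholder(string docxPath, string placeholder, string value)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(docxPath, true))
        {
            Body body = doc.MainDocumentPart.Document.Body;

            foreach (Text text in body.Descendants<Text>().ToList())
            {
                if (text.Text.Contains(placeholder))
                    text.Text = text.Text.Replace(placeholder, value);
            }

            // Write the modified body back into the package.
            doc.MainDocumentPart.Document.Save();
        }
    }
}
```

Headers and footers live in separate parts of the package, so placeholders there would need the same loop over `HeaderParts` and `FooterParts`.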

lomaxx
Yes, there are lots of programs that basically print to PDF, but I doubt that the hosting company will let me run console apps on their server :)
Ivan
+1  A: 

Since there is really nothing to do here if you don't have the Word application installed, I was faced with the options of either buying commercial components, from Aspose for example, or purchasing virtual or dedicated hosting and having Microsoft Office 2007 and the Microsoft Office add-in "Microsoft Save as PDF" installed on my server in order to use the Microsoft Word 12.0 Object Library. A third solution is to buy a web service that does the conversion for me, or to use some free ideas like generating an HTML page and then letting users save it as PDF, with this for example. In the end I chose to hardcode the report and do all the formatting manually with iTextSharp.
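
To give an idea of what composing the report by hand looks like, here is a minimal sketch, assuming the iTextSharp 5.x API. It is not the actual report code: `ReportBuilder`, `PageNumberFooter`, the sample text, and the margins are made up for illustration. It writes one paragraph and a small table, and stamps a page number into the footer of every page through a page event.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Draws "Page N" centered below the bottom margin of every page.
class PageNumberFooter : PdfPageEventHelper
{
    public override void OnEndPage(PdfWriter writer, Document document)
    {
        Phrase footer = new Phrase("Page " + writer.PageNumber);
        ColumnText.ShowTextAligned(
            writer.DirectContent, Element.ALIGN_CENTER, footer,
            (document.Left + document.Right) / 2, document.Bottom - 20, 0);
    }
}

static class ReportBuilder
{
    // Builds the PDF in memory so it can be written to the HTTP response.
    public static byte[] Build(string customerName)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            Document doc = new Document(PageSize.A4, 50, 50, 50, 50);
            PdfWriter writer = PdfWriter.GetInstance(doc, ms);
            writer.PageEvent = new PageNumberFooter();

            doc.Open();
            doc.Add(new Paragraph("Report for " + customerName));

            PdfPTable table = new PdfPTable(2); // two columns
            table.AddCell("Item");
            table.AddCell("Value");
            doc.Add(table);

            // Close() flushes the PDF to the stream; without it the output is incomplete.
            doc.Close();
            return ms.ToArray();
        }
    }
}
```

In an ASP.NET page, the returned bytes could then be sent with `Response.ContentType = "application/pdf"` followed by `Response.BinaryWrite(...)`.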

Ivan
A: 

I once wrote my own PDF library (it only merged files and embedded TIFF files), and I can tell you it's not something you'd want to do. Isn't there any way to avoid Word altogether? Where does the Word report come from?

Jonathan van de Veen
The client delivered the .doc file to me. Since I composed the PDF manually, the formatting may not look exactly the same, but everything (the tables, images, horizontal lines, footers and headers, texts) is there.
Ivan
If you can already interpret these objects from the document, then I don't see the problem. You could use something like iTextSharp to generate a PDF file.
Jonathan van de Veen
A: 

It is also possible to print whatever you can see on screen to PDF, using screen capture and iTextSharp for example. It can be made to work the same way as the ABCpdf software. If the web site has more pages, we use AddImageToChain, for example. There was a slight mishap with this function because it was also capturing the vertical scroll bar.

Ivan
A: 

We use the Aspose Words and Aspose PDF components. You just have to create a document in Word that contains a bunch of mail merge fields, then merge your data and optionally export it as a PDF file. I realise that it is a commercial product with a cost attached, but for us it has worked like a charm for 4 years.

Junto
A: 

So essentially it is impossible to convert a Word document without having Word installed on the machine?

pwee167
A: 

Hi, I'm using Xml2PDF Workstation, and I think it is very good. I've read through all your problems; Xml2PDF copes with all of them and converts my documents created in .doc format to PDF without any problems. Footers, page numbers, tables, and images are all converted very well. You may want to try it. Good luck.

Simon
A: 

I think you should check out PDFTechLib from PDF Technologies Inc. Their PDF component works in restricted environments. We found that it even allows pulling fonts from an alternate directory, because restricted environments do not allow access to the fonts directory. Check out PDFTechLib at www.pdf-technologies.com

Sally