Hey all,

Just after a bit of advice on people's preferences for DOC to PDF libraries for .NET. At the moment we piggyback on OpenOffice, but this isn't really ideal. What I'm after is a library that will let me read a .DOC and write out a .PDF in code.

The library doesn't have to be free, but it must work with ASP.NET.

Thanks

+1  A: 

Hi Chalkey,

I'm using the ExpertPDF component. It's a paid component, but so far it has paid for itself.

It's very easy to use, and its support for exporting from HTML to PDF is excellent.

Cheers

Daniel
Just had a quick glance - appears to be HTML to PDF not DOC to PDF?
Chalkey
You're right... There is only an option for RTF to PDF. My bad.
Daniel
A: 

Of course, the easiest way to do it is to funnel it through OpenOffice, since OO has knowledge about the source file's format. Without that knowledge, you need something that can interpret the file format, so you are essentially reinventing OpenOffice.

Are you using a Print to PDF printer driver, such as CutePDF? The nice thing about them is that you can use them with any application. But of course, if you're trying to do this on a server you still need to use OpenOffice to print.

CutePDF (and others) have a programmer API so that you can automate the process. You should be cautious with some of them though; there can be concurrency issues if the API uses the registry to manipulate the Print dialog.
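One common way to sidestep that kind of concurrency problem is to let only one conversion touch the driver at a time. Below is a minimal sketch of that idea, not any vendor's API: the `SerializedConverter` class is hypothetical, and the delegate passed to it stands in for whatever call actually drives your printer driver or converter.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper: lets only one conversion run at a time in this process.
public sealed class SerializedConverter
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private readonly Action<string, string> _convert;

    // 'convert' is a placeholder for the real (non-thread-safe) conversion call.
    public SerializedConverter(Action<string, string> convert)
    {
        if (convert == null) throw new ArgumentNullException("convert");
        _convert = convert;
    }

    // Returns false if the gate could not be acquired within maxWait.
    public bool TryConvert(string inputPath, string outputPath, TimeSpan maxWait)
    {
        if (!_gate.Wait(maxWait))
        {
            return false;
        }

        try
        {
            _convert(inputPath, outputPath);
            return true;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Note that a `SemaphoreSlim` only serializes callers inside a single process. If the site runs in several worker processes, a named `Mutex` would be needed instead, since that is visible across processes.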

Robert Harvey
+3  A: 

Two years ago, I needed to find a solution for this as well. At that point, every component I tried produced inaccurate output for some documents that were critical to the business. The Word 2007 "Save as PDF" plugin, however, did the job properly.

So I ended up writing a service that wrapped a multi-threaded automation of Word. A couple of instances of Word were automated, converting documents and saving the results. The wrapper code would pick up each converted document and pass it along as the result.

There was some effort involved in this, but it was the only solution I could find that met all the business needs at that point.
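For anyone wanting to try the Word automation route, the core conversion call might look roughly like the sketch below. It assumes Word 2007 or later is installed (with the Save as PDF add-in on 2007) and that the project references `Microsoft.Office.Interop.Word`. The `WordPdfConverter` class name is made up for illustration. This is not the service described above, which also had to handle threading, multiple Word instances and failed conversions.

```csharp
using System.IO;
using System.Runtime.InteropServices;
using Word = Microsoft.Office.Interop.Word;

// Illustrative helper: converts one document with a throwaway Word instance.
public static class WordPdfConverter
{
    public static void Convert(string docPath, string pdfPath)
    {
        Word.Application app = null;
        Word.Document doc = null;
        try
        {
            app = new Word.Application();
            app.Visible = false;
            // Suppress dialogs, which would block an unattended process.
            app.DisplayAlerts = Word.WdAlertLevel.wdAlertsNone;

            // Word resolves relative paths against its own working directory,
            // so pass absolute paths.
            doc = app.Documents.Open(
                FileName: Path.GetFullPath(docPath),
                ReadOnly: true,
                AddToRecentFiles: false,
                Visible: false);

            doc.ExportAsFixedFormat(
                Path.GetFullPath(pdfPath),
                Word.WdExportFormat.wdExportFormatPDF);
        }
        finally
        {
            // Always close Word, otherwise WINWORD.EXE processes pile up.
            if (doc != null)
            {
                ((Word._Document)doc).Close(Word.WdSaveOptions.wdDoNotSaveChanges);
                Marshal.ReleaseComObject(doc);
            }
            if (app != null)
            {
                ((Word._Application)app).Quit(Word.WdSaveOptions.wdDoNotSaveChanges);
                Marshal.ReleaseComObject(app);
            }
        }
    }
}
```

Starting a fresh Word instance per document is slow but keeps the sketch simple. A real service would more likely keep a small pool of instances alive and recycle them when one misbehaves.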

John Fisher
This approach will definitely give the best PDF quality as currently Word is the application that can render .doc files best. However, it is tricky to get Word working in an unattended environment.
0xA3
We needed something similar: a DOC to TIFF converter with very strict requirements for rendering accuracy. After a long string of attempts to find a standalone library, I ended up with the same thing: a wrapper around a server-side (unattended) Word instance that prints the supplied docs to PDF. Yes, running Word (or any Office app) unattended is tricky, but doable. Ours has been running for several years now without any trouble.
mfeingold
I think I saw an article about Word 2007 PDF conversion on CodeProject. The problem is that we won't be able to install Word on the client's machine. We do something similar at the moment with a Windows service that uses OpenOffice; the issue is that every once in a while a rogue .DOC causes OpenOffice to hang, which means one of us has to remote in to restart it.
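A hang like this can usually be contained by running each conversion as a separate process with a time limit, and killing the process if it overruns. The sketch below shows the general idea using only `System.Diagnostics.Process`. The `ConversionWatchdog` name is made up, and the executable path and arguments are placeholders for whatever command line your converter accepts.

```csharp
using System;
using System.ComponentModel;
using System.Diagnostics;

// Hypothetical helper: runs an external converter with a hard time limit.
public static class ConversionWatchdog
{
    // Returns true only if the process exits on its own with exit code 0.
    public static bool TryRun(string exePath, string arguments, TimeSpan timeout)
    {
        var startInfo = new ProcessStartInfo(exePath, arguments)
        {
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            if (process.WaitForExit((int)timeout.TotalMilliseconds))
            {
                return process.ExitCode == 0;
            }

            // Presumed hung: terminate it so the next document can be processed.
            try
            {
                process.Kill();
            }
            catch (InvalidOperationException)
            {
                // The process exited between the timeout and the Kill call.
            }
            catch (Win32Exception)
            {
                // The process was already terminating or could not be killed.
            }

            process.WaitForExit();
            return false;
        }
    }
}
```

The caller can then log the offending document and move on instead of waiting for someone to restart the service by hand. Note that `Kill` only ends the process that was started, so any child processes it spawned may need separate cleanup.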
Chalkey
Be sure to check the run-time licence terms from Microsoft. Several years back, no Office application was allowed to run in a server environment. The licence terms may have changed since then...
Vijay Patel
+1  A: 

We've been through a number of different PDF generators and converters over the years and found that the Aspose components provide lots of flexibility and trouble-free use. If your application is built using ASP.NET, the Aspose components are also Medium Trust friendly.

For more info see:

Aspose.Pdf for .NET
Aspose.Pdf for .NET - File Conversion

Kev
Thanks - I'll check them out.
Chalkey
+1  A: 

In the past I've had some success using the PDF X Change Drivers offered here. It's not the best website in the world but the technology is sound.

This won't help, though, if you aren't planning to have a version of Office on the machine doing the converting, as the trick here is to print from the application to the PDF driver, which generates a file.

Richard
A: 

Nemo PDF is a good choice.

A: 

Hi Chalkey,

I'm guessing you may have found something already, but in case you haven't yet, see below. I should disclaim that I work for the company that makes the product I'm mentioning.

We offer a commercial SDK called EasyPDF SDK that converts Microsoft Word documents (as well as a number of other formats) to PDF. Check out pdfonline.com for a trial copy if you're interested. It's a printer-driver-based tool that requires Microsoft Word to be installed; it is thread-safe and widely used on servers. ASP.NET is supported. You shouldn't have issues with hanging conversions, as there are timeouts you can set and exceptions you can catch in case a given conversion stalls for some reason.

The API is pretty high level and looks something like this for a basic server side ASP.NET conversion:

Loader oLoader = new Loader();
Printer oPrinter = (Printer)oLoader.LoadObject("easyPDF.Printer.6");
WordPrintJob oPrintJob = oPrinter.WordPrintJob;
oPrintJob.PrintOut(input, output);

Cheers

yu-chen-pdfonline-com
Thanks for the reply - the only catch is the Microsoft Word install.
Chalkey
A: 

Unless you are after perfect conversion fidelity or need support for additional file formats, Aspose.Words is probably the best way forward. It has some rendering problems, such as ignoring footnotes and issues with text flow and floating objects, but no library other than those based on MS Word will get you closer.

Having said that, if you are after perfect fidelity or you need support for additional file formats then have a look at a product I have worked on, which allows conversion of all major Office based formats via a web services based interface.

As some of the replies above indicate, writing your own is doable, but when it needs to run from a server in an unattended way on 32/64bit/Win2K8/Win2K3 systems then you could do a lot worse than the product described in this posting.

Muhimbi