In my web project, I use DocX files as report templates. I need to convert these DocX files to PDF. Do you have any .NET managed code for doing that?

I know several ways to approach this, but none of the following is both managed code and free.

  • Word 12.0 Object Library: programmatically saves a Word 2007 document as either PDF or XPS. But it requires installing Office 2007 on the server.

  • Printing through a free PDF printer such as PDFCreator. But that still needs a program on the server, such as Office 2007, to open the DocX. It's a very bad idea.

  • Converting with a free converter program. But the result isn't perfect.

  • Using a framework such as XF Rendering Server. It's a very good idea, but it isn't free.

  • Creating the whole document through a PDF API such as iTextSharp. But producing a nice-looking document that way takes a lot of work.

  • Or creating the template document in another file format that can be exported to both DocX and PDF. End users must also be able to edit this file easily. If you know of one, please tell me.

Thanks,

A: 

Does OpenOffice have an API? That would at least be a free option.

Alternatively, create the PDF using a reporting tool such as ActiveReports or Crystal Reports (much easier than using iTextSharp).

Mark Redman
Yes. I can generate a DocX document through the OpenOffice API, but I can't convert it to PDF directly without a commercial tool or Microsoft Office Interop. As far as I know, ActiveReports and Crystal Reports don't support dynamic reports such as importing a report template from a DocX document. PS: I need to use DocX because end users can edit it easily.
Soul_Master
I mean, can you import the DocX into OpenOffice and export it as PDF from there?
Mark Redman
+2  A: 

Installing Office 2007 and using the Word 12 Object Library is definitely the option I'd go for (and have done so on some of my own projects).

If you don't want to install Word on a production web server, why not put it on a secondary server? This second server can communicate with the first (through a web service or something similar): it could request the next Word document that needs exporting, do the conversion, and then return the PDF data.

Let me know if you want a C# example of the Word automation that does this conversion (it's very trivial).
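
The secondary-server arrangement described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the ConversionJob type, the Drain method and the two delegates are hypothetical names invented here. The convert delegate stands in for the Word automation and the publish delegate stands in for whatever web-service call returns the PDF data to the web server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical job record; the field names are invented for this sketch.
class ConversionJob
{
    public string Id;
    public byte[] DocxBytes;
}

static class ConversionWorker
{
    // Takes each pending job, converts it, and hands the PDF bytes back.
    // 'convert' stands in for the Word automation on the secondary server;
    // 'publish' stands in for the call that returns the PDF to the web server.
    public static int Drain(Queue<ConversionJob> pending,
                            Func<byte[], byte[]> convert,
                            Action<string, byte[]> publish)
    {
        int done = 0;
        while (pending.Count > 0)
        {
            ConversionJob job = pending.Dequeue();
            byte[] pdf = convert(job.DocxBytes);
            publish(job.Id, pdf);
            done++;
        }
        return done;
    }

    static void Main()
    {
        var pending = new Queue<ConversionJob>();
        pending.Enqueue(new ConversionJob { Id = "report-1", DocxBytes = new byte[] { 1, 2, 3 } });

        // Fake converter for demonstration only: it simply echoes its input.
        int count = Drain(pending,
                          docx => docx,
                          (id, pdf) => Console.WriteLine(id + ": " + pdf.Length + " bytes"));
        Console.WriteLine(count + " job(s) converted");
    }
}
```

Keeping the conversion behind a delegate means the web server never references the Word interop assembly; only the secondary server needs Office installed.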

Adrian


Here's my code, posted for Jason. This works with Word 2007. You need to download and install the PDF exporter from the Office web site:

using Microsoft.Office.Interop.Word;

...

// The Word 2007 interop methods take every argument by ref, so each value is boxed in an object.
object _read_only = false;
object _visible = true;
object _false = false;
object _true = true;
object _dynamic = 2;
object _missing = System.Reflection.Missing.Value; // "use the default" for optional arguments

// WdSaveFormat values
object _htmlFormat = 8;   // wdFormatHTML
object _pdfFormat = 17;   // wdFormatPDF
object _xpsFormat = 18;   // wdFormatXPS

object fileName = "C:\\Test.docx";

ApplicationClass ac = new ApplicationClass();
//ac.Visible = true; // Uncomment to see Word as it opens and converts the document
//ac.Activate();

// Open the document read-only (3rd argument) without adding it to the recent-files list (4th argument)
Document d = ac.Documents.Open(ref fileName, ref _missing, ref _true, ref _read_only, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _visible, ref _missing, ref _missing, ref _missing, ref _missing);

// Same path as the source document, with the extension replaced by .pdf
object newFileName = ((string)fileName).Substring(0, ((string)fileName).LastIndexOf(".")) + ".pdf";

// Save a copy in PDF format
d.SaveAs(ref newFileName, ref _pdfFormat, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing, ref _missing);

// Close the document and quit Word without saving changes
d.Close(ref _false, ref _missing, ref _missing);
ac.Quit(ref _false, ref _missing, ref _missing);

ac = null;
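
One small note on the output path in the code above: the Substring/LastIndexOf expression throws if the file name contains no dot, and cuts at the wrong place if only a folder name contains one. A possible alternative, shown here as a sketch rather than as part of the original answer, is the framework's Path.ChangeExtension. The class and method names (PdfPathDemo, ToPdfPath) are made up for illustration.

```csharp
using System;
using System.IO;

static class PdfPathDemo
{
    // Replaces the extension of the given path with ".pdf".
    // If the path has no extension, ".pdf" is appended.
    public static string ToPdfPath(string docxPath)
    {
        return Path.ChangeExtension(docxPath, ".pdf");
    }

    static void Main()
    {
        Console.WriteLine(ToPdfPath("C:\\Test.docx")); // prints C:\Test.pdf
    }
}
```

The result can be boxed into an object and passed to SaveAs in the same way as newFileName.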


So, Soul_Master, what you are saying is that you don't want to use interop (though you don't say why, which I'd be interested to know), you don't want to pay for a commercial exporter, and you want perfect results?

I can't help you, I'm afraid. Interop will give you perfect results, every time, and you already have the software. If you won't use that, you are going to have to make a sacrifice: either cost or quality.

The example code would be nice to see. I have to agree with Joel Spolsky's suggestion to use Office to convert to/from Office formats (as flippant as ever: http://www.joelonsoftware.com/items/2008/02/19.html ).
Jason Harrison
That isn't my point. I don't care which server this component is installed on. What I need first is an effective component for converting DocX to PDF. Word 2007 Interop isn't my final answer.
Soul_Master
Moreover, I already mentioned this approach in my question.
Soul_Master
+3  A: 

I do not have code for converting DocX to PDF, but it appears your requirement for DocX is not firm. Your last bullet says:

Or creating the template document in another file format that can be exported to both DocX and PDF. End users must also be able to edit this file easily. If you know of one, please tell me.

I read this to mean you want to be able to create a template document, fill it with data and convert it to PDF, yet allow the template to be maintained, right?

Solution: XSL-FO

XSL-FO is a W3C standard like HTML and can be transformed by a number of open source and commercial products into PDF, WordML, XPS, PS, PCL, SVG, TIFF, etc. I have used this to deliver hundreds of thousands of documents per month, both online as PDFs and offline (things like bulk check printing).

To get you started, here is the W3C page for XSL-FO. There is a lot of good information there, including a list of software (both open source and commercial) down the left side. I have personally used two commercial products called IBEX PDF Creator and XEP by RenderX. Both are excellent products, and there is a 100% managed C# implementation to get to PDF called FO.NET up on CodePlex. I have not tried this, but it should satisfy your "free" criterion.

There are a number of ways you can edit the template for documents to be created in XSL-FO. Typically this template is XSLT that you apply to your XML data, but this is not a requirement. I have built these by hand, but it is a bit of a learning curve. You can start with a document in XSL-FO and fill in sections of it with code, just as you could HTML. The good news is that there are a number of XSL-FO editors out there. The bad news is that none I know of are free, but several of them are cheap and you may find something that meets the free criterion with a bit of Googling. However, one option is that you can convert from Word using a stylesheet (commercial & free).
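
The "XSLT template applied to XML data" step described above can be sketched in managed code with the .NET Framework's XslCompiledTransform. This is a minimal illustration, not a complete reporting solution. The report, title and body element names and the FoTemplateDemo class are invented for the example. The output is XSL-FO text, which still has to be passed to an FO renderer (such as the products mentioned above) to become a PDF. That rendering call is left out because it differs from product to product.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

static class FoTemplateDemo
{
    // Hypothetical report data; the element names are invented for illustration.
    public const string Data =
        "<report><title>Monthly Sales</title><body>Total: 1200 units</body></report>";

    // The template: an XSLT stylesheet that emits an XSL-FO document.
    public const string Template = @"<?xml version='1.0'?>
<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:fo='http://www.w3.org/1999/XSL/Format'>
  <xsl:output method='xml' indent='yes'/>
  <xsl:template match='/report'>
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name='A4' page-height='29.7cm' page-width='21cm' margin='2cm'>
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference='A4'>
        <fo:flow flow-name='xsl-region-body'>
          <fo:block font-size='18pt' font-weight='bold'><xsl:value-of select='title'/></fo:block>
          <fo:block><xsl:value-of select='body'/></fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the XSLT template to the XML data and returns the XSL-FO as text.
    public static string BuildFo(string dataXml, string templateXslt)
    {
        var xslt = new XslCompiledTransform();
        using (var templateReader = XmlReader.Create(new StringReader(templateXslt)))
            xslt.Load(templateReader);

        using (var dataReader = XmlReader.Create(new StringReader(dataXml)))
        using (var output = new StringWriter())
        {
            xslt.Transform(dataReader, null, output);
            return output.ToString();
        }
    }

    static void Main()
    {
        Console.WriteLine(BuildFo(Data, Template));
    }
}
```

Because the template is an ordinary text file, it can be stored and edited separately from the application, which is the maintainability point made above.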

Jerry Bullard
Does XSL-FO support embedding images in a document?
Soul_Master
Absolutely. You can embed bitmaps using the external-graphic element. They are typically referenced by URL as in an HTML document. Most (if not all) XSL-FO renderers also support scalable vector graphics using svg in a document. Here is a link to a nice XSL-FO reference: http://www.w3schools.com/xslfo/xslfo_reference.asp
Jerry Bullard
Here is another commercial product that claims to convert from DocX to XSL-FO: http://www.alt-soft.com/Products_Word2PDF.aspx And another free one that goes from RTF to XSL-FO: http://www.freedownloadscenter.com/Network_and_Internet/Web_Server_Components/RTF_TO_XML.html And another free one on SourceForge called html2fo that converts from HTML to XSL-FO: http://sourceforge.net/projects/html2fo/ However, please keep in mind that the quality of the conversion will vary depending on the source and the transformation method.
Jerry Bullard
This may be the only answer that meets all my requirements. Thanks.
Soul_Master