Is there an easy way to create Word documents (.docx) in a Ruby application? Actually, in my case it's a Rails application served from a Linux server.

A gem similar to Prawn but for DOCX instead of PDF would be great!

A: 

I know that if you serve an HTML document as a Word document with the .doc extension, it will open in Word just fine. Just don't do anything fancy.

Edit: Here is an example using classic ASP. http://www.aspdev.org/asp/asp-export-word/
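For a Ruby flavour of the same idea, here is a minimal sketch, assuming a Rack-style app. The helper name `word_html_response` is made up for illustration, and how well Word and the browser honour the headers should be checked against the clients you actually care about. The point is that the HTML is labelled as a Word document through the Content-Type and filename, not only through the extension.

```ruby
# Hypothetical helper: wraps an HTML string in a Rack-style response triple
# so the browser offers it as a Word document instead of rendering it.
# Assumes the filename contains no double quotes.
def word_html_response(html, filename)
  headers = {
    "content-type" => "application/msword",
    "content-disposition" => %(attachment; filename="#{filename}"),
    "content-length" => html.bytesize.to_s
  }
  [200, headers, [html]]
end

status, headers, body = word_html_response(
  "<html><body><p>Dear Javier:</p></body></html>",
  "letter.doc"
)
puts status
puts headers["content-disposition"]
```

In a Rails controller the same headers would typically be set through `send_data` with its `filename:` and `type:` options.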

Daniel A. White
Thanks, but that sounds a bit like a dirty hack, doesn't it? :-) Besides that: What are the security concerns when using RTF?
Javier
What are the concerns with RTF files?
Brian
+3  A: 

The MS Word format (.doc) is proprietary. You'd be better off using RTF, which Word can open and which is not proprietary.

Check out this gem.
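If a gem is more than you need, plain-text RTF is simple enough to emit by hand. The following is only a rough sketch: the helper names `rtf_escape` and `rtf_document` are made up for illustration, the font and size are arbitrary, and it handles nothing beyond one run of text with line breaks.

```ruby
# Escapes text for use inside an RTF group.
# Backslash and braces are RTF control characters; anything outside
# printable ASCII is written as \uN? where N is the signed 16-bit
# UTF-16 code unit and "?" is the fallback for readers without Unicode.
def rtf_escape(text)
  text.encode("UTF-16LE").unpack("v*").map do |unit|
    case unit
    when 0x5C, 0x7B, 0x7D then "\\" + unit.chr
    when 0x0A then "\\par\n"
    when 0x20..0x7E then unit.chr
    else
      signed = unit >= 0x8000 ? unit - 0x10000 : unit
      "\\u#{signed}?"
    end
  end.join
end

# Wraps the escaped text in a minimal RTF header (one font, 12pt).
def rtf_document(body_text)
  "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Arial;}}\\f0\\fs24 #{rtf_escape(body_text)}}"
end

puts rtf_document("Dear Javier:\nPrice is {50} \\ 2")
```

Anything richer (tables, images, styles) is where a dedicated library starts to pay off.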

Brian
FYI, RTF documents are really deprecated for security reasons, IMHO.
Daniel A. White
IMHO .doc doesn't seem very proprietary, as OpenOffice can build doc-files and .docx shouldn't be proprietary at all.
Javier
A: 

If you're running on Windows, of course, it's a matter of WIN32OLE and some pain with the Word COM objects.

Chances are that you're serving from a *nix environment, though. Word 2007 uses the "Microsoft Office Open XML" format (*.docx), which older versions of Word can open using the appropriate compatibility pack from Microsoft.

Some of the more recent Office apps (2002/XP and 2003 at least) had their own XML formats, which may also be usable.

I'm not aware of any Ruby tools to make the process easier, sadly.

If it can be made acceptable, I think I'd be inclined to go down the renamed-HTML file route. I just saved a document as HTML from Word XP, renamed it to .doc and opened it without a problem.

Mike Woodhouse
The renamed-HTML file route as you describe it wouldn't work for my case. I can't pre-build the HTML files in Word and rename them to .doc, and if I do this with plain HTML files on my server, IE doesn't recognize them as doc-files.
Javier
+8  A: 

As has been noted, there don't appear to be any libraries to manipulate Open XML documents in Ruby, but OpenXML Developer has complete documentation on the format of Open XML documents.

If what you want is to send a copy of a standard document (like a form letter) customized for each user, it should be fairly simple given that a DOCX is a ZIP file that contains various parts in a directory hierarchy. Have a DOCX "template" that contains all the parts and tree structure that you want to send to all users (with no real content), then simply create new (or modify existing) pieces that contain the user-specific content you want and inject it into the ZIP (DOCX file) before sending it to the user.

For example: You could have document-template.xml that contains Dear [USER-PLACEHOLDER]:. When a user requests the document, you replace [USER-PLACEHOLDER] with the user's name, then add the resulting document.xml to the your-template.docx ZIP file (which would contain all the images and other parts you want in the Word document) and send that resulting document to the user.
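The substitution step could look roughly like this in Ruby. This is a sketch under a few assumptions: `fill_template` and `xml_escape` are made-up helper names, the template part is already loaded as a string, and the placeholder sits inside a single text run (Word sometimes splits typed text across several runs, which would break a plain string replace). The user's name is XML-escaped so that a value like "Smith & Sons" doesn't produce a malformed part.

```ruby
PLACEHOLDER = "[USER-PLACEHOLDER]"

# Escapes the characters that would break XML element content.
def xml_escape(text)
  text.gsub(/[&<>]/, { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" })
end

# Returns a copy of the template part with the placeholder filled in.
# The block form of gsub keeps backslashes in the name from being
# interpreted as replacement escapes.
def fill_template(template_xml, user_name)
  safe_name = xml_escape(user_name)
  template_xml.gsub(PLACEHOLDER) { safe_name }
end

template = "<w:p><w:r><w:t>Dear [USER-PLACEHOLDER]:</w:t></w:r></w:p>"
puts fill_template(template, "Smith & Sons")
```

The result still has to be written back into the DOCX as word/document.xml. Ruby's standard library has no ZIP archive writer, so that step would need a ZIP gem (rubyzip is one option) or an external zip tool.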

Note that if you rename a .docx file to .zip it is trivial to explore the structure and format of the parts inside. You can remove or replace images or other parts very easily with any ZIP manipulation tools or programmatically with code.

Generating a brand new Word document with completely custom content from raw XML would be very difficult without access to an API to make the job easier. If you really need to do that, you might consider installing Mono, then use VB.NET, C# or IronRuby to create your Open XML documents using the Open XML Format SDK 1.0. Since you would just be using the Microsoft.Office.DocumentFormat.OpenXml.Packaging Namespace to manipulate Open XML documents, it should work okay in Mono, which seems to support everything the SDK requires.

Grant Wagner
I've written a small utility for slicing up somewhat complex docx templates and building a custom document using the slices: http://github.com/bagilevi/docx_builder
Leventix
+3  A: 

You can use Apache POI. It is written in Java, but integrates with Ruby as an extension.

ykaganovich
Thanks for your input! Do you know of any implementation where Apache POI was used to actually create a word document (not only parse it)?
Javier
Sorry, I don't know much about it other than it exists.
ykaganovich
+1  A: 

Further to Grant's answer, you can also send Word a "Flat OPC" file, which is essentially the docx unzipped and concatenated to create a single XML file. This way, you can replace [USER-PLACEHOLDER] in one file and be done with it (i.e. no zipping or unzipping).
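To give an idea of what that looks like, here is a minimal sketch in Ruby. The template below is a hand-written, stripped-down Flat OPC package with just the relationships part and the main document part; a real one saved from Word ("Word XML Document") will carry many more parts. The names `FLAT_OPC_TEMPLATE` and `render_flat_opc` are made up for illustration, and it is worth opening the output in your target Word version to confirm it is accepted.

```ruby
# Minimal Flat OPC package: every part of the docx lives in one XML file.
FLAT_OPC_TEMPLATE = <<~XML
  <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
  <?mso-application progid="Word.Document"?>
  <pkg:package xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage">
    <pkg:part pkg:name="/_rels/.rels" pkg:contentType="application/vnd.openxmlformats-package.relationships+xml">
      <pkg:xmlData>
        <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
          <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="word/document.xml"/>
        </Relationships>
      </pkg:xmlData>
    </pkg:part>
    <pkg:part pkg:name="/word/document.xml" pkg:contentType="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml">
      <pkg:xmlData>
        <w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
          <w:body>
            <w:p><w:r><w:t>Dear [USER-PLACEHOLDER]:</w:t></w:r></w:p>
          </w:body>
        </w:document>
      </pkg:xmlData>
    </pkg:part>
  </pkg:package>
XML

# Fills the placeholder, XML-escaping the name so the file stays well-formed.
def render_flat_opc(user_name)
  safe_name = user_name.gsub(/[&<>]/, { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" })
  FLAT_OPC_TEMPLATE.gsub("[USER-PLACEHOLDER]") { safe_name }
end

# Example: File.write("letter.xml", render_flat_opc("Javier"))
```

The `mso-application` processing instruction is what lets Windows hand the .xml file to Word when it is opened.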

plutext
+3  A: 

I've done something like this recently. Here's the blog post: http://tomasvarsavsky.com/2009/04/04/simple-word-document-templating-using-ruby-and-xml/

+1  A: 

SecurityFocus has a few examples of RTF vulns ...