I would like to create a Word document from a template, replace some variables (fields), and save the result as a new Word document.

I was thinking of using Apache POI (http://poi.apache.org/). Is it the best tool for this purpose? Can you share your impressions of it?

+1  A: 

I'm not sure of the exact status of Word document support in POI but, according to the POI website, work is still in progress (I can't say exactly what this means). So, at this time, I would not use POI but would rather try to generate an RTF document. For this, you could:

  • Use RTFTemplate, which is an RTF-to-RTF engine that generates an RTF document by merging an RTF model with data.
  • Use iText, which is primarily a PDF generator but can also generate RTF.
  • Build your own custom solution (but I wouldn't do that).

I'd go for iText.
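
For a sense of what the "custom solution" option involves: RTF is a text format, so a template merge can be sketched with plain string handling. The following is a minimal sketch only; it is not RTFTemplate's or iText's API. The class name `RtfTemplateFiller` and the `[[name]]` placeholder syntax are assumptions made for illustration (curly braces are avoided because they delimit groups in RTF). It also assumes each placeholder appears unbroken in the RTF source, which a word processor does not guarantee if formatting changes in the middle of the placeholder.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: fills [[name]] placeholders in an RTF template held as a string. */
final class RtfTemplateFiller {

    // Assumed placeholder syntax: [[key]], where key is letters, digits or underscores.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\[\\[(\\w+)\\]\\]");

    /** Escapes plain text so it can be embedded in RTF body content. */
    static String escapeRtf(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '{' || c == '}') {
                out.append('\\').append(c);          // characters with special meaning in RTF
            } else if (c < 128) {
                out.append(c);                       // plain ASCII passes through
            } else {
                // RTF Unicode control word: signed 16-bit decimal plus a fallback character
                out.append("\\u").append((int) (short) c).append('?');
            }
        }
        return out.toString();
    }

    /** Replaces every known placeholder in one pass; unknown placeholders are left as-is. */
    static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value == null) ? m.group() : escapeRtf(value);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Illustrative template and data, not taken from any real document.
        String template = "{\\rtf1\\ansi Dear [[name]],\\par Your order [[order]] has shipped.}";
        Map<String, String> values = new HashMap<>();
        values.put("name", "Ada");
        values.put("order", "A-17");
        System.out.println(fill(template, values));
    }
}
```

Escaping the inserted values matters more than the replacement itself: an unescaped backslash or brace in the data would corrupt the document structure.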

Pascal Thivent
+2  A: 

If you use a template and do not want to create the Word document from scratch then, as far as I know, POI is a pretty good solution. You open the template and select the zones you want to replace.

They say POI is still in development, but I've been using it in a production environment and it works pretty well at the moment.

Valentin Rocher
Good to know. Thanks.
Pascal Thivent
@Valentin Rocher The issue I have is that my Word template **has a header which needs to be edited**, and as far as I know, POI does not allow me to edit the header.
Arthur Ronald F D Garcia
+3  A: 

I've worked with POI before and it's certainly able to generate Word documents. But the devil is in the details.

Word has thousands of features: You can put numbered lists starting at #13 with negative indents into two joined cells of a table included in another table that is itself part of a bullet list... you get the idea. When the POI documentation says they are a work in progress, that reflects what will probably be an eternal state of trying to catch up to the (to us, undocumented) specification of Word.

Documents with a reasonably "normal" set of used features are well supported by POI, whose interfaces and methods are reasonable and consistent but sometimes require a bit of work. But as Pascal says, documents with a not too exorbitant set of features are also supported by RTF. I have almost no experience "doing" RTF but it's probably a bit simpler than working with POI.

If you're working in an environment or for a customer who insists that your produced documents be .DOC rather than .RTF, then POI is pretty much your only choice, unless you can introduce a step where you use a bit of Office automation to convert RTF into DOC.

Update: I've had a couple more ideas in the meantime.

Using POI or creating RTF documents is something that you could do on practically any platform. At my job, all servers doing processing like this happen to be running Linux, for example.

However, in the likely case that your programs will run under Windows, there is another alternative: Jacob http://www.land-of-kain.de/docs/jacob/

Jacob is a COM interface for Java; it essentially allows you to "remote control" Windows programs such as Word and Excel. The document I linked to above is not to Jacob's own site but to a single page with "cookie cutter" recipes for using Jacob. The project itself is on SourceForge: http://sourceforge.net/projects/jacob-project/ But people claim, and rightly so, that the documentation is a bit lacking.

Jacob has the advantage over all other solutions that you're dealing with the "real" Word and therefore all capabilities of Word are available to you. This would be an alternative if there are detail aspects of your document that just can't be handled with POI or via the RTF format.

Carl Smotricz
+1 Thanks for that very interesting feedback.
Pascal Thivent
A: 

You should look into the Aspose.Words components. They have recently begun providing a Java version of the component.

See the following link: Aspose.Words for Java

This supports Word automation, document creation, and advanced features such as mail merging without the need for an instance of Microsoft Word on the machine. The real benefit is that you work within the context of an actual Word document and do not have to compromise by creating RTFs etc.

The Java version is not currently as fully featured as the .NET version, but the main core functionality is there and they are pushing very hard to have a feature-equivalent version soon.

Also, if you purchase the Java version you get a year's free upgrades and support as new releases are created.

Brian Scott
+1  A: 

If you are working with docx documents, docx4j is an option. Like POI, it's open source.
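
As background on why docx is approachable from Java: a .docx file is a ZIP archive whose main body is stored as XML in the entry `word/document.xml`. The sketch below uses only the JDK, not docx4j's API, to copy a template archive while substituting placeholders in that one entry. The class name `DocxPlaceholderFiller` and the `[[name]]` placeholder syntax are assumptions for illustration. It also assumes the XML is UTF-8 and that each placeholder sits unbroken inside a single text run, which Word does not guarantee; a library such as docx4j is the safer route for real templates.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: copies a .docx template, filling [[name]] placeholders in the body. */
final class DocxPlaceholderFiller {

    /** Escapes the characters that would otherwise break XML text content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Streams every ZIP entry from template to result, rewriting only word/document.xml. */
    static void fill(InputStream template, OutputStream result, Map<String, String> values)
            throws IOException {
        try (ZipInputStream zin = new ZipInputStream(template);
             ZipOutputStream zout = new ZipOutputStream(result)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                byte[] data = readAll(zin);
                if (entry.getName().equals("word/document.xml")) {
                    String xml = new String(data, StandardCharsets.UTF_8);
                    for (Map.Entry<String, String> e : values.entrySet()) {
                        xml = xml.replace("[[" + e.getKey() + "]]", escapeXml(e.getValue()));
                    }
                    data = xml.getBytes(StandardCharsets.UTF_8);
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }

    /** Reads the current ZIP entry fully into memory. */
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: DocxPlaceholderFiller <template.docx> <output.docx>");
            return;
        }
        Map<String, String> values = new HashMap<>();
        values.put("name", "Ada");   // illustrative data only
        try (InputStream in = Files.newInputStream(Paths.get(args[0]));
             OutputStream out = Files.newOutputStream(Paths.get(args[1]))) {
            fill(in, out, values);
        }
    }
}
```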

plutext
+1  A: 

I created and use this: http://code.google.com/p/java2word

Leonardo
@Leonardo Congratulations. I will try it this week.
Arthur Ronald F D Garcia