We need a Java library to replace strings in MS Word files.
Can anyone suggest one?
Try this one: http://www.dancrintea.ro/doc-to-pdf/
Besides replacing strings in MS Word files, it can also:

- read/write Excel files using a simplified API like getCell(x,y) and setCell(x,y,string)
- hide Excel sheets (secondary calculations, for example)
- replace images in DOC, ODT and SXW files
- convert:
  - doc --> pdf, html, txt, rtf
  - xls --> pdf, html, csv
  - ppt --> pdf, swf
I would suggest the Apache POI library.
Looking at it more closely, though, it doesn't seem to have been kept up to date. It may still be complete enough to do what you need.
I would take a look at the Apache POI project. This is what I have used to interact with MS documents in the past.
While there is MS Word support in Apache POI, it is not very good. Loading and then saving any file with other than the most basic formatting will likely garble the layout. You should try it out though, maybe it works for you.
There are a number of commercial libraries as well, but I don't know if any of them are any better.
The crappy "solution" I had to settle for when working on a similar requirement recently was using the DOCX format, opening the ZIP container, reading the document XML, and then replacing my markers with the right texts. This does work for replacing simple bits of text without paragraphs etc.
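
    // Imports this fragment appears to rely on (Spring and Commons IO)
    import static java.nio.charset.StandardCharsets.UTF_8;
    import java.io.StringReader;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;
    import java.util.zip.ZipOutputStream;
    import org.apache.commons.io.IOUtils;
    import org.springframework.core.io.ClassPathResource;
    import org.springframework.core.io.Resource;

    private static final String WORD_TEMPLATE_PATH = "word/word_template.docx";
    private static final String DOCUMENT_XML = "word/document.xml";
    /*....*/
    final Resource templateFile = new ClassPathResource(WORD_TEMPLATE_PATH);
    final ZipOutputStream zipOut = new ZipOutputStream(output);
    // try-with-resources closes the template stream even if processing fails
    try (ZipInputStream zipIn = new ZipInputStream(templateFile.getInputStream())) {
        ZipEntry inEntry;
        while ((inEntry = zipIn.getNextEntry()) != null) {
            // Every entry is copied under its original name
            zipOut.putNextEntry(new ZipEntry(inEntry.getName()));
            if (inEntry.getName().equals(DOCUMENT_XML)) {
                // Only the main document part has its markers replaced
                final String contentIn = IOUtils.toString(zipIn, UTF_8);
                final String outContent = this.processContent(new StringReader(contentIn));
                IOUtils.write(outContent, zipOut, UTF_8);
            } else {
                IOUtils.copy(zipIn, zipOut);
            }
            zipOut.closeEntry();
        }
    }
    // Writes the ZIP central directory without closing the caller's output stream
    zipOut.finish();
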
I'm not proud of it, but it works.
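
For anyone who wants to try the same idea without Spring or Commons IO, here is a minimal sketch using only the JDK (Java 9 or later assumed). The class name `DocxMarkerReplacer`, the `${key}` marker syntax, and the methods `escapeXml`, `replaceMarkers` and `rewrite` are all made up for illustration. In particular, `replaceMarkers` is only a guess at what a step like `processContent` might do, not the actual implementation from the answer above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: copies a DOCX and replaces ${key} markers in word/document.xml. */
final class DocxMarkerReplacer {
    private static final String DOCUMENT_XML = "word/document.xml";
    // Assumed marker syntax: ${name}, with letters, digits and underscores in the name
    private static final Pattern MARKER = Pattern.compile("\\$\\{([A-Za-z0-9_]+)\\}");

    /** Escapes the characters that must not appear verbatim in XML text content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Replaces each known ${key} marker with its XML-escaped value in a single pass. */
    static String replaceMarkers(String xml, Map<String, String> values) {
        Matcher m = MARKER.matcher(xml);
        StringBuilder sb = new StringBuilder();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Markers without a value are left untouched
            String replacement = (value == null) ? m.group() : escapeXml(value);
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** Copies all ZIP entries from in to out, rewriting only the main document part. */
    static void rewrite(InputStream in, OutputStream out, Map<String, String> values)
            throws IOException {
        ZipOutputStream zipOut = new ZipOutputStream(out);
        // Closes the input when done; the output stream stays open for the caller
        try (ZipInputStream zipIn = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zipIn.getNextEntry()) != null) {
                zipOut.putNextEntry(new ZipEntry(entry.getName()));
                if (entry.getName().equals(DOCUMENT_XML)) {
                    // readAllBytes stops at the end of the current ZIP entry
                    String xml = new String(zipIn.readAllBytes(), StandardCharsets.UTF_8);
                    zipOut.write(replaceMarkers(xml, values).getBytes(StandardCharsets.UTF_8));
                } else {
                    zipIn.transferTo(zipOut);
                }
                zipOut.closeEntry();
            }
        }
        zipOut.finish();
    }
}
```

One caveat with this kind of text-level replacement: Word can split what looks like a single marker across several runs in the XML, in which case the marker never appears as one contiguous string and will not be matched. Typing each marker in one go in the template seems to help, but that is not guaranteed.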
Thanks all. I am going to try http://www.dancrintea.ro/doc-to-pdf/
because I need to convert classic binary DOC files, not DOCX (ZIP format).
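
If you ever need to tell the two formats apart in code, the leading bytes of the file are a reasonable first check: binary DOC files are stored in an OLE2 compound file, while DOCX files are ZIP archives. A rough sketch follows; the class `WordFormatSniffer` and its members are hypothetical names. The signatures only identify the container, so an XLS file or an arbitrary ZIP will match too.

```java
import java.util.Arrays;

/** Hypothetical helper: guesses the container format from a file's leading bytes. */
final class WordFormatSniffer {
    enum Format { OLE2_CONTAINER, ZIP_CONTAINER, UNKNOWN }

    // OLE2 compound file signature (used by binary .doc, but also .xls and .ppt)
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature "PK\3\4" (used by .docx, but also any other ZIP)
    private static final byte[] ZIP = {0x50, 0x4B, 0x03, 0x04};

    /** Classifies the first bytes of a file; pass at least the first 8 bytes. */
    static Format detect(byte[] header) {
        if (startsWith(header, OLE2)) {
            return Format.OLE2_CONTAINER;
        }
        if (startsWith(header, ZIP)) {
            return Format.ZIP_CONTAINER;
        }
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

A caller could read the first eight bytes of the uploaded file, pass them to `detect`, and route OLE2 files to whatever library handles binary DOC and ZIP files to the DOCX path.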
You can use Aspose.Words for Java.
The code examples for the replace methods are [here][2].
[2]: <http://www.aspose.com/documentation/.net-components/aspose.words-for-.net-and-java/com/aspose/words/range.html#replace(java.lang.String, java.lang.String, boolean, boolean)>