I've got a customer who managed to paste WordprocessingML content into our application. As far as I know, it was a direct copy and paste from Word 2000 into our Java application. I tried every combination of Word and Java versions, but I can't reproduce this behavior, especially since our application only accepts HTML and text/plain.

I'm pretty sure that the older Office versions had their own clipboard and exported only the formats that should be available to other programs. Every Office version I know of (except maybe 2007) exports HTML, RTF and plain text.

Is there any way to get WordprocessingML content onto the clipboard, and could Java then mix up the data flavors?
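To narrow this down, it may help to log what the clipboard actually offers on the customer's machine. The following is only a diagnostic sketch, not our application's real filter: the class name `ClipboardFlavors` and the helpers `describeFlavors` and `isAccepted` are made up for illustration, and the accepted types (text/html and text/plain) are taken from the description above. It uses only the standard `java.awt.datatransfer` API.

```java
import java.awt.HeadlessException;
import java.awt.Toolkit;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;
import java.awt.datatransfer.Transferable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, for illustration only.
class ClipboardFlavors {

    // Returns the full MIME type string of every flavor the Transferable offers.
    public static List<String> describeFlavors(Transferable t) {
        List<String> out = new ArrayList<>();
        for (DataFlavor f : t.getTransferDataFlavors()) {
            out.add(f.getMimeType());
        }
        return out;
    }

    // Assumed filter: accept a flavor only if its base MIME type is
    // text/html or text/plain. Note that DataFlavor.stringFlavor has the
    // base type application/x-java-serialized-object, so it does not match.
    public static boolean isAccepted(DataFlavor f) {
        return f.isMimeTypeEqual("text/html") || f.isMimeTypeEqual("text/plain");
    }

    public static void main(String[] args) {
        try {
            Clipboard cb = Toolkit.getDefaultToolkit().getSystemClipboard();
            Transferable t = cb.getContents(null);
            if (t == null) {
                System.out.println("Clipboard is empty.");
                return;
            }
            // Print each flavor together with the filter decision.
            for (DataFlavor f : t.getTransferDataFlavors()) {
                System.out.println((isAccepted(f) ? "ACCEPT " : "REJECT ") + f.getMimeType());
            }
        } catch (HeadlessException e) {
            System.out.println("No system clipboard available in a headless environment.");
        }
    }
}
```

Running `main` right after copying from Word should show whether anything XML-like is offered under a text/html or text/plain flavor, which would be one way for such content to slip past a filter that looks only at the MIME type.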

A: 

Apache POI is a Java API for accessing Microsoft file formats; HWPF is its component for reading and writing MS Word files. Apache Tika is a toolkit for detecting and extracting metadata and structured text content from various documents using existing parser libraries, and it also offers some support for MS Word documents. I suggest you check whether they fit your use case.

Yuval F