Hi, I'm using Apache POI 3.6 and have already written some code:
XWPFDocument doc = new XWPFDocument(new FileInputStream(file));
wordxExtractor = new XWPFWordExtractor(doc);
text = wordxExtractor.getText();
System.out.println("adding docx " + file);
d.add(new Field("content", text, Field.Store.NO, Field.Index.ANALYZED));
Unfortunately, it throws this error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/dom4j/DocumentException
    at org.apache.poi.openxml4j.opc.OPCPackage.init(OPCPackage.java:149)
    at org.apache.poi.openxml4j.opc.OPCPackage.<init>(OPCPackage.java:136)
    at org.apache.poi.openxml4j.opc.Package.<init>(Package.java:54)
    at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:98)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:199)
    at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:178)
    at org.apache.poi.util.PackageHelper.open(PackageHelper.java:53)
    at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:98)
    at org.apache.lucene.demo.Indexer.indexDocs(Indexer.java:153)
    at org.apache.lucene.demo.Indexer.main(Indexer.java:88)
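The error names org/dom4j/DocumentException, so the dom4j classes may simply not be on my runtime classpath. Here is a minimal sketch to check that, using only the JDK. The names ClasspathCheck and isOnClasspath are made up for illustration and are not part of POI or dom4j.

```java
// Hypothetical helper (not part of POI or dom4j): reports whether a class
// can be loaded from the classpath this program was started with.
class ClasspathCheck {

    static boolean isOnClasspath(String className) {
        try {
            // Look the class up without running its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not found, or found but not loadable (e.g. a missing dependency).
            return false;
        }
    }

    public static void main(String[] args) {
        // The class named in the NoClassDefFoundError above.
        String name = "org.dom4j.DocumentException";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

If this reports false when run with the same classpath as the indexer, the dom4j jar is probably missing at runtime, although that is an assumption on my part until confirmed.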
It looks as if it is using this constructor:
XWPFWordExtractor(OPCPackage container)
rather than this one:
XWPFWordExtractor(XWPFDocument document)
Any idea why? Or how I can extract the text from a .docx file and get it as a String? Thanks.