A customer has asked me to build a module for his live web app that loads .docx files and extracts data based on the headings found in the document. I know a .docx is just a zip archive and that most of what I need is in word/document.xml, but I'm not looking forward to parsing lists, styles, images, tables, and everything else that would have to be translated from OOXML to HTML.
Are there any PHP libraries for this format? I do need some flexibility, though: a plain OOXML-to-HTML converter is not going to cut it, because I need to break the document up into parts.
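For context, the hand-rolled route I would like to avoid looks roughly like the sketch below. It uses only PHP's bundled ZipArchive and DOM extensions. The function names `readDocumentXml` and `splitByHeadings` are made up for illustration, and the heading detection assumes the document uses the default style ids `Heading1`, `Heading2`, and so on, which is not guaranteed for localized or custom templates. It only collects plain paragraph text, so tables, images, and list numbering are ignored.

```php
<?php
// Sketch only: function names are hypothetical, and heading detection assumes
// the default style ids "Heading1", "Heading2", ... are used in the document.

const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

/** Read word/document.xml out of the .docx zip container. */
function readDocumentXml(string $docxPath): string
{
    $zip = new ZipArchive();
    if ($zip->open($docxPath) !== true) {
        throw new RuntimeException("Cannot open $docxPath");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found in archive');
    }
    return $xml;
}

/**
 * Group body-level paragraphs under the heading that precedes them.
 * Returns a list of ['level' => int, 'title' => string, 'paragraphs' => string[]].
 * Text before the first heading, and anything inside tables, is skipped.
 */
function splitByHeadings(string $documentXml): array
{
    $dom = new DOMDocument();
    if (!$dom->loadXML($documentXml)) {
        throw new RuntimeException('document.xml is not well-formed XML');
    }
    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('w', W_NS);

    $sections = [];
    foreach ($xpath->query('/w:document/w:body/w:p') as $p) {
        // A paragraph's text is spread over several runs (w:r/w:t); join them.
        $text = '';
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }

        // The paragraph style id lives in w:pPr/w:pStyle/@w:val (empty if absent).
        $style = $xpath->evaluate('string(w:pPr/w:pStyle/@w:val)', $p);

        if (preg_match('/^Heading(\d+)$/', $style, $m)) {
            // A heading starts a new section.
            $sections[] = ['level' => (int) $m[1], 'title' => $text, 'paragraphs' => []];
        } elseif ($sections) {
            // Any other paragraph belongs to the most recent heading.
            $sections[array_key_last($sections)]['paragraphs'][] = $text;
        }
    }
    return $sections;
}
```

Calling `splitByHeadings(readDocumentXml('report.docx'))` (the file name is just a placeholder) would give one array entry per heading. That covers the splitting part, but not the OOXML-to-HTML translation of each part, which is exactly where I'd prefer to lean on a library.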