docx

Determine if the document is DOC or DOCX in Java app without knowing its extension

A constraint in the content management system requires all Word documents to be stored with a specific extension (different from DOC or DOCX). However, when outputting the document to the user, we need to know whether it is a DOC or DOCX file in order to provide the right MIME type. So, is there a way to programmatically find out if docume...
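One common approach to this kind of question is to look at the first bytes of the file instead of its extension. The sketch below is in Python for brevity, although the question is about Java; the same byte comparison ports directly. It relies on two well-known signatures: legacy DOC files are OLE2 compound files, and DOCX files are ZIP archives. The function name `sniff_word_mime` is hypothetical, and the check identifies only the container format, so it assumes (as the question states) that everything stored is a Word document.

```python
import io

# Well-known file signatures (container level, not Word-specific).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file: .doc, .xls, ...
ZIP_MAGIC = b"PK\x03\x04"                        # ZIP archive: .docx, .xlsx, ...

DOC_MIME = "application/msword"
DOCX_MIME = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def sniff_word_mime(stream):
    """Guess the MIME type from the first bytes of a binary stream.

    Returns None when neither signature matches.
    """
    head = stream.read(8)
    if head.startswith(OLE2_MAGIC):
        return DOC_MIME
    if head.startswith(ZIP_MAGIC):
        return DOCX_MIME
    return None
```

If the store could also hold other Office formats, a stricter check would open the ZIP and look for the `word/document.xml` part before answering DOCX.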

How to search docx for a specific string and return the outline number

I am using this as a starting base. I can search the document but am unable to reference what section it is from. Is there something built into the OpenXML API to do this? For example, let's say the document is 500 pages and it is in a standard numbering schema 1. 1.1 1.1.1 1.1.1.1 searchWord 1.1.1.2 23.1.2.3.4. sea...

Is there a way to convert DocX, OpenXml, or RTF to TextFlow in AS3?

Basically we want to be able to open up a docx file in AS3 or Flex 4 and convert it to a text flow while preserving formatting, embedded images, tables, columns, etc. I know theoretically it's possible as the new Text Layout Framework is powerful enough to pull it off, but I haven't been able to find any case where someone has achieve...

PHP OOXML Libraries?

A customer is asking me to build a module for his running webapp that can load docx files and extract data based on the Headings found in the document. I know docx is just a zip file and most of what I need can be found in word/document.xml, though I'm not looking forward to parsing lists/styles/images/tables and whatever other things th...
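As the question notes, a docx is a zip file whose main text lives in `word/document.xml`. A minimal sketch of heading extraction using only the standard library follows; it is in Python, not PHP, and is an illustration, not a library recommendation. It assumes heading paragraphs carry a `w:pStyle` whose id starts with `Heading` (the default in English-language Word templates; localized or custom templates may use other ids). The function name `extract_headings` is hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def extract_headings(docx_file):
    """Return (style_id, text) for each paragraph styled as a heading.

    Assumption: heading style ids start with "Heading".
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    headings = []
    for para in root.iter(f"{{{W}}}p"):
        style = para.find("w:pPr/w:pStyle", NS)
        if style is None:
            continue
        style_id = style.get(f"{{{W}}}val", "")
        if style_id.startswith("Heading"):
            # A paragraph's text may be split across several runs.
            text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
            headings.append((style_id, text))
    return headings
```

The data between two headings could be collected the same way by walking the paragraphs in order and grouping them under the most recent heading.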

How can I convert an .rtf or .doc document to LaTeX?

Unfortunately, I can't use rtf2latex2e because it says that DropUNIX "no longer supports the classic environment". I barely know what I'm doing otherwise, besides dropping my .rtf file onto the DropUNIX program. What else can I use? I don't mind which type of file it is I'm converting to LaTeX (.doc would also be OK, as long as it keep...

Version control for DOCX and PDF?

Hi, I've been playing around with git and hg lately and then suddenly it occurred to me that this kind of thing will be great for documents. I've a document which I edit in DOCX and export as PDF. I tried using both git and hg to version control it and turns out with hg you end up tracking only binary and diff-ing isn't meaningful. Alt...

Manipulating Microsoft Word Office 2007 .docx document from PHP

I need an option from within PHP to manipulate .docx (Microsoft Office 2007) documents. I need to: read the internal text; convert to .html, to view them inside a browser; replace text. I know I can use Word Automation, creating a COM object of Microsoft Word, but it's too slow, unstable and I have to have it installed on the server...

converting docx files to WPF flow documents

Hello, I have a Word document with a few pictures and many lines of text (with headings) in it and want to convert this document to a flow document, so I can open it easily in my WPF application and use a rich text box to make it editable... I found this article useful. But I have some problems with adding images (from the docx file) t...

Inserting/embedding a file to a DOCX file - determining file-type

Let's say I have an API which lets clients insert a file into a DOCX template; they just pass a path to the file. If it's an image I just want to add an image to the file, if it's something like RTF/DOC then I can maybe embed it so the content is visible, if it's some other type like MP3 or PDF then it might simply embed with an icon. I...

CSharp to replace strings of text in a docx

Using C#, is there a good way to find and replace a text string in a docx file without having Word installed on that machine? ...

OpenOffice.org: Using UNO to convert docx to html

Testing on both Debian and Mac OS X. On Debian, the openoffice.org-writer package is installed. Using the latest OpenOffice.org version: 3.2.1. Tried both unoconv and JODConverter. I start by launching a headless OpenOffice instance: soffice -headless -nofirststartwizard -accept="socket,host=localhost,port=8100;urp;" I'm in the right directory ...

Is there a dev kit/lib (written in C or C++) to write docx files?

Is there a dev kit/lib (written in C or C++) to write docx files? Microsoft has a dev kit, but it's written in C#. ...

docx "File is corrupt" error in Microsoft Word

I wrote a program that opens a docx package and changes some <w:t> text in "word/document.xml". When I open the newly generated docx in Microsoft Word, it gives me an error: "file is corrupted". But if I look at the diffs between the template docx and the result docx in the "Open XML SDK Tool", only two lines are changed in "word/document.xml". Look at ...

Substituting one node for another

I'm using docx4j, which uses JAXB. I want to change a Text node to be a Paragraph node (or delete/add in the same place). This may be more of a JAXB question; I'm not familiar with it. I can find the relevant Text node using MainDocumentPart.getJAXBNodesViaXpath(). Looking at samples, I can see how to create a new paragraph node....

Problem saving edited zip archive (docx)

So here is my code: <?php $zip = new ZipArchive; if ($zip->open('test.docx') === TRUE) { $xmlString = $zip->getFromName('word/document.xml'); $xmlString = str_replace('$FIRST_AND_LAST_NAME', 'John Doe', $xmlString); $zip->addFromString('word/document.xml', $xmlString); echo 'ok'; $zip->close(); } else { echo 'failed';...
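The excerpt is cut off before it says what goes wrong, so the following is not a diagnosis. It is only a sketch of the same placeholder substitution done by writing a fresh archive instead of editing one in place, which sidesteps questions about how a given zip library handles overwriting an existing entry. It is written in Python, not PHP, and the function name `replace_in_docx` is hypothetical. It assumes the placeholder sits inside a single run of text (Word sometimes splits typed text across several runs) and that the replacement value is already XML-safe.

```python
import io
import zipfile


def replace_in_docx(src, dst, placeholder, value, part="word/document.xml"):
    """Copy the archive at src to dst, substituting placeholder in one part.

    All other entries are copied byte for byte.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == part:
                text = data.decode("utf-8")
                data = text.replace(placeholder, value).encode("utf-8")
            # Passing the ZipInfo keeps the entry name and metadata.
            zout.writestr(item, data)
```

Both `src` and `dst` may be file paths or binary file-like objects, for example `replace_in_docx("test.docx", "out.docx", "$FIRST_AND_LAST_NAME", "John Doe")`.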

Alternative to Phplivedocx which can change images dynamically in template

Hello. For my web application I use phplivedocx to generate docx/pdf reports. But now I need to insert images that users will upload to my website into the docx template. On the phplivedocx forums I found that this software doesn't support this feature. Are there any other tools that can produce docx/pdf pages from a template, and can a...

How to dynamically insert images to docx template?

Hi. In my web application I'm using phplivedocx for changing text. But I also need to dynamically change images in my docx template. What tool do you recommend? Thanks in advance. ...

How to extract docx (Word 2007 and above) using Apache POI

Hi, I'm using Apache POI 3.6. I've already created some code: XWPFDocument doc = new XWPFDocument(new FileInputStream(file)); wordxExtractor = new XWPFWordExtractor(doc); text = wordxExtractor.getText(); System.out.println("adding docx " + file); d.add(new Field("content", text, Field.Store.NO, Field...

display the content of MS Office 2007/2010 files on iPad

Hi all, how can we display the content of MS Office 2007/2010 files (especially .docx) in iOS 3.2 (iPad)? Thanks in advance. ...

How do I get an Ampersand symbol into Word 2007 .docx XML file?

I am generating a Word doc in xml based on customer input, and of course it blows up whenever an & is used. I tried replacing all instances of & with &amp;, but then &amp; literally shows up in my Word Doc. Here's my code: static String replace(String in) { String ampersand = "&(?![a-zA-Z][a-zA-Z][a-zA-Z]?[a-zA-Z]?;)...
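The excerpt is truncated, so the actual cause is not visible here. One situation that produces a literal `&amp;` in the rendered document is escaping the same text twice, for example once by hand and once more by an XML writer that already escapes text nodes. A minimal sketch in Python (not the question's Java) shows escaping raw input exactly once with the standard library, and what a second pass does; the helper name `to_xml_text` is hypothetical.

```python
from xml.sax.saxutils import escape


def to_xml_text(raw):
    """Escape &, < and > in raw user input for use as XML text content.

    Call this once, on unescaped input only.
    """
    return escape(raw)


once = to_xml_text("Research & Development")
twice = to_xml_text(once)  # escaping already-escaped text

print(once)   # Research &amp; Development
print(twice)  # Research &amp;amp; Development
```

An XML parser turns the first string back into a plain `&`, while the second one decodes to the visible text `&amp;`.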