Hi all,

I would like to merge multiple DOC or RTF files into a single file in the same format as the inputs. For example, if a user selects several RTF template files from a list box on a web page and clicks a button, the output should be a single RTF file that combines all of the selected templates. I need to do this in PHP.

I haven't decided on the format of the template files yet, but it will be either RTF or DOC, and I assume the templates will contain some images as well.

I have spent many hours looking for a library that does this, but I still can't find one.

Please help me out here!! :(

Thanks in advance.

A: 

I've been working on a similar project and haven't managed to find any PHP (or other open-source) libraries for manipulating MS Word files. The way I approach it is kind of complicated, but it works. Here's how I would do it (assuming you have a Linux server):

Setup:

  1. Install JODConverter and OpenOffice
  2. Start OpenOffice as a server (see http://www.artofsolving.com/node/10)

Approach (ie. what to do in your PHP code):

  1. Convert your MSWord or RTF files into ODT format by calling JODConverter via backticks or exec()
  2. Unzip each file into a temporary directory of its own
  3. Read the content.xml file from each unzipped document using a DOM parser
  4. Extract the <office:text> contents from each, and concatenate
  5. Put this concatenated xml back into the right spot in one of the content.xml files
  6. Re-zip the contents of that temporary directory and give it an .odt extension
  7. Use JODConverter to convert this file back to MSWord again
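
For what it's worth, here is a rough PHP sketch of the steps above. Treat it as a starting point under a few assumptions, not a finished implementation: the function names (convertWithJod, mergeOfficeText, mergeOdtFiles) are made up for illustration, the jar path is whatever your JODConverter 2.x command-line jar happens to be called, and PHP needs the dom and zip extensions. It also takes a small shortcut: instead of unzipping into temporary directories, it reads content.xml straight out of each .odt with ZipArchive and writes the merged XML into a copy of the first file. It only concatenates the body text, so automatic styles and embedded images from the later documents are not carried over; if your templates have images you would also have to copy their Pictures/ entries and update the manifest.

```php
<?php
// Sketch only. All function names here are hypothetical helpers.
const OFFICE_NS = 'urn:oasis:names:tc:opendocument:xmlns:office:1.0';

// Steps 1 and 7: convert between formats. Assumes the JODConverter 2.x
// command-line jar, which takes an input path and an output path.
function convertWithJod(string $jarPath, string $in, string $out): void
{
    $cmd = sprintf(
        'java -jar %s %s %s',
        escapeshellarg($jarPath),
        escapeshellarg($in),
        escapeshellarg($out)
    );
    exec($cmd, $lines, $status);
    if ($status !== 0) {
        throw new RuntimeException("Conversion failed for $in");
    }
}

// Steps 3 to 5: append the <office:text> children of each later
// content.xml to the <office:text> element of the first one.
function mergeOfficeText(array $contentXmls): string
{
    if ($contentXmls === []) {
        throw new InvalidArgumentException('No documents given');
    }
    $base = new DOMDocument();
    $base->loadXML(array_shift($contentXmls));
    $target = $base->getElementsByTagNameNS(OFFICE_NS, 'text')->item(0);
    if ($target === null) {
        throw new RuntimeException('Base document has no <office:text>');
    }
    foreach ($contentXmls as $xml) {
        $doc = new DOMDocument();
        $doc->loadXML($xml);
        $body = $doc->getElementsByTagNameNS(OFFICE_NS, 'text')->item(0);
        if ($body === null) {
            continue; // nothing to copy from this document
        }
        foreach ($body->childNodes as $child) {
            // The base document already has its own sequence declarations.
            if ($child->localName === 'sequence-decls') {
                continue;
            }
            $target->appendChild($base->importNode($child, true));
        }
    }
    return $base->saveXML();
}

// Steps 2 and 6, without temp directories: read content.xml from each
// ODT, then write the merged XML into a copy of the first file.
function mergeOdtFiles(array $odtPaths, string $outPath): void
{
    $xmls = [];
    foreach ($odtPaths as $path) {
        $zip = new ZipArchive();
        if ($zip->open($path) !== true) {
            throw new RuntimeException("Cannot open $path");
        }
        $xml = $zip->getFromName('content.xml');
        $zip->close();
        if ($xml === false) {
            throw new RuntimeException("No content.xml in $path");
        }
        $xmls[] = $xml;
    }
    $merged = mergeOfficeText($xmls);

    // Copying the first file keeps its styles, manifest and mimetype entry.
    if (!copy($odtPaths[0], $outPath)) {
        throw new RuntimeException("Cannot write $outPath");
    }
    $out = new ZipArchive();
    if ($out->open($outPath) !== true) {
        throw new RuntimeException("Cannot open $outPath");
    }
    $out->addFromString('content.xml', $merged);
    $out->close();
}
```

You would call convertWithJod() once per uploaded file to get the .odt versions, then mergeOdtFiles() on those, then convertWithJod() again to turn the merged .odt back into .doc or .rtf.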

As I said, it's not pretty, but it does the job.

If you're looking to go down the RTF route, this question may also help: http://stackoverflow.com/questions/153801/concatenate-rtf-files-in-php-regex

Horatio Alderaan