I've been taken on board to work on a PHP-based web application. One part of the application generates thumbnail images for MS Office documents on demand, and it uses MS Office plus the VeryPDF docprint utility to do this. Because of this one requirement, the system is running on Windows Server 2003 + IIS.

I would prefer to have the system running on a Linux server, rather than MS, as I have far more experience in administering Linux systems than Windows and we have no other in-house technical staff.

Does anyone know a way to handle the document conversion using native Linux software? I would love something PHP native, but am willing to look outside that if necessary.

Thanks for your suggestions.

+4  A: 

I have never done anything like this, so I'm just throwing an idea off the top of my head.

Have you thought about using OpenOffice's ability to create thumbnail images? I know OO saves a thumbnail image inside each document it creates, so all you need to do is extract that image to display it. (This is demonstrated on the Ubuntu forums.) You could always do something a bit "hackish": run the file through OpenOffice, then extract the image and display it as a small thumbnail.

Again, I have no idea how well this will work, but it may be worth a shot.
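For what it's worth, here is a rough sketch of the extraction step in Python. It relies on the fact that OpenDocument files (.odt, .ods, .odp) are ZIP packages that normally carry a preview at `Thumbnails/thumbnail.png`. The function name `extract_odf_thumbnail` is made up for illustration, and this only covers files already saved in OpenDocument format, so an MS Office file would first have to be opened and re-saved by OpenOffice.

```python
import io
import zipfile

# Path of the preview image inside an OpenDocument ZIP package.
THUMBNAIL_ENTRY = "Thumbnails/thumbnail.png"


def extract_odf_thumbnail(source):
    """Return the embedded thumbnail as PNG bytes, or None if there is none.

    `source` may be a file path or a binary file-like object.
    (Hypothetical helper name; not part of any OpenOffice API.)
    """
    with zipfile.ZipFile(source) as package:
        try:
            return package.read(THUMBNAIL_ENTRY)
        except KeyError:
            # The package has no thumbnail entry.
            return None
```

Something like `png = extract_odf_thumbnail("report.odt")` would then give you bytes you can write to disk or resize. The embedded preview is small and only shows the first page, so it may or may not be good enough for your purposes.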

JasCav
+1 seconded. OpenOffice has a very good DOC parser. If all else fails, have OpenOffice generate a PDF out of the DOC, and process that to a thumbnail afterwards.
Pekka
OpenOffice is pretty good, but it is very slow. We already generate the thumbnails from PDFs, so if I can convert them to PDF it's all good from there. As far as I can see, there is an unofficial PHP extension which would allow interaction with OpenOffice, which would be left running in "headless" mode, so that would get rid of the startup overhead. I'm not sure about the extension, but the other option is to access it from the command line via Python (for which there are better bindings). I'll have a play with it and see how it goes, will post results back here.
El Yobo
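The "convert to PDF first" route discussed in these comments could look roughly like the sketch below. It assumes a `soffice` binary on the PATH that accepts the LibreOffice-style `--headless --convert-to pdf --outdir` options; older OpenOffice.org builds may not support them. The helper names are hypothetical. Note that this starts a fresh office process for every document, so it does not remove the startup overhead mentioned above; a long-running headless instance would be needed for that.

```python
import subprocess
from pathlib import Path


def build_convert_command(document, out_dir, soffice="soffice"):
    """Build the command line for a one-shot headless PDF conversion.

    Assumes LibreOffice-style options; `soffice` is whatever the office
    binary is called on your system.
    """
    return [
        soffice,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(document),
    ]


def convert_to_pdf(document, out_dir, soffice="soffice", timeout=120):
    """Run the conversion and return the expected path of the PDF.

    Raises CalledProcessError on a non-zero exit status and
    TimeoutExpired if the office process hangs.
    """
    subprocess.run(
        build_convert_command(document, out_dir, soffice),
        check=True,
        timeout=timeout,
    )
    # The output keeps the input's base name with a .pdf extension.
    return Path(out_dir) / (Path(document).stem + ".pdf")
```

From PHP the same command could be run with `exec()`, after which the existing PDF-to-thumbnail step takes over.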
A: 

To anyone else who comes across this: I ended up going with the newer version of JODConverter. Its sample code includes a basic web page that you can POST documents to using something like PEAR's HTTP_Request2. A sample class (by yours truly) which uses this is mentioned in the comments in JODConverter's group on Google Code.

El Yobo
Also worth noting - newer versions of OpenOffice have had significant improvements in performance and compatibility, which make this more workable than it was in the past.
El Yobo