I want to dynamically load (AJAX) the text from some Microsoft Word files into a webpage. So I might have a link to essays I've written and upon mouseover have it load the first few sentences in a tooltip.

+5  A: 

Only if you have a parser. As far as I know, the new format (.docx) is a zip archive of XML files, while the old one (.doc) is a binary format.
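To illustrate the "zip archive with XML" point, here is a minimal sketch that pulls plain text out of a .docx file using only the Python standard library. It assumes the main body lives in `word/document.xml` inside the archive and that text sits in `w:t` elements within `w:p` paragraphs; the function name `docx_paragraphs` is made up for this example, and headers, footnotes, and formatting are ignored. It does not handle the old binary .doc format at all.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx archive.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_paragraphs(path):
    """Return the plain text of each non-empty paragraph in a .docx file.

    `path` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # A paragraph is made of runs; the visible text sits in <w:t> elements.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text.strip():
            paragraphs.append(text)
    return paragraphs
```

Calling `docx_paragraphs("essay.docx")` (a hypothetical file name) would give you a list of paragraph strings to build a preview from.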

There are some parsers out there.

I know of wvWare (http://wvware.sourceforge.net/), but it seems to be outdated.

This might be worth looking at: http://poi.apache.org/hwpf/index.html

And yeah, I forgot to mention how to do this. :-) First, the JavaScript has to request the data through AJAX. The server side has to take care of the parsing and return the text to the JavaScript. This will be a pain in the ass. I haven't done this myself and have never tried the parsers I linked, so I'm not sure whether they suit you. I'm also not sure whether images, stylesheets, etc. will be usable.

In any case, good luck.
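The server half of that round trip could look roughly like the sketch below, written with Python's built-in `http.server`. Everything specific here is an assumption for illustration: the `/preview` path, the `doc` query parameter, the `PREVIEWS` dictionary standing in for text a parser has already extracted, and the helper names `preview_response` and `make_server`.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical store: document id -> text already extracted on the server.
PREVIEWS = {"essay-1": "First few sentences of the essay."}


def preview_response(request_path):
    """Map a path such as /preview?doc=essay-1 to (status, JSON bytes)."""
    url = urlparse(request_path)
    doc_id = parse_qs(url.query).get("doc", [""])[0]
    if url.path != "/preview" or doc_id not in PREVIEWS:
        return 404, json.dumps({"error": "unknown document"}).encode("utf-8")
    payload = {"doc": doc_id, "preview": PREVIEWS[doc_id]}
    return 200, json.dumps(payload).encode("utf-8")


class PreviewHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer the page's AJAX request with a small JSON document.
        status, body = preview_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8000):
    """Create (but do not start) a local server for the preview endpoint."""
    return ThreadingHTTPServer(("127.0.0.1", port), PreviewHandler)
```

Running `make_server().serve_forever()` would start it locally; the mouseover handler on the page would then request `/preview?doc=essay-1` and put the returned `preview` string into the tooltip.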

Kristinn Örn Sigurðsson
A: 

For security reasons, it is not possible to load a local file (such as a Word document) directly into the page using JavaScript alone. The user will need to upload the file to the server, where you parse it; you can then load whatever result you like into the page using Ajax.

Allen Pike
Well, you can do that through Java applets or maybe Flash (not sure about Flash, though). You basically need access to the client computer's local files.
Kristinn Örn Sigurðsson
A: 

It sounds like you mean to upload your files (e.g. essays) to your server to allow users to download them, and want to create a server-side page that will parse the files and print the first few lines (so it can be called by an AJAX method that displays a preview on hover).
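Once the text is out of the file, the "first few lines" part is simple. Here is a rough sketch, assuming the document text is already available as a plain string; the function name `first_sentences` and the defaults (3 sentences, 300 characters) are arbitrary choices for this example, and the sentence splitting is deliberately naive (it will trip over abbreviations like "e.g.").

```python
import re

# Naive sentence boundary: ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def first_sentences(text, count=3, max_chars=300):
    """Return roughly the first `count` sentences, capped at `max_chars`."""
    normalized = " ".join(text.split())  # collapse newlines and repeated spaces
    sentences = _SENTENCE_END.split(normalized)
    preview = " ".join(sentences[:count])
    if len(preview) > max_chars:
        # Cut at the last word boundary that fits and mark the truncation.
        preview = preview[:max_chars].rsplit(" ", 1)[0] + "..."
    return preview
```

The character cap is there so that one very long opening sentence does not overflow the tooltip.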

To suggest a tool for this, we'll need to know whether these are "old" Word format (Office 2003 and earlier, extension .doc) or "new" Word format (Office 2007 and later, extension .docx).
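If the extension can't be trusted, the two formats can usually be told apart by their first bytes. This is a small sketch with a made-up function name, `word_format`. Note that the signatures identify the container, not Word specifically: any zip archive starts with `PK\x03\x04`, and other legacy Office files (e.g. .xls) share the same OLE compound-file header, so treat the result as a hint.

```python
# Leading bytes that identify each container format.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # legacy binary .doc (OLE compound file)
ZIP_MAGIC = b"PK\x03\x04"  # .docx is a zip archive


def word_format(data):
    """Guess 'doc', 'docx' or None from the first bytes of a file."""
    if data.startswith(OLE2_MAGIC):
        return "doc"
    if data.startswith(ZIP_MAGIC):
        return "docx"
    return None
```

In practice you would read the first 8 bytes of the uploaded file in binary mode and pass them in, then pick the parser based on the answer.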

It would also help to know what you're using to create your pages server-side, since different document-reading tools support different programming languages. If you're using Java to read .doc files, you can use the tool we use at my place of work, which is POI (http://poi.apache.org/). If you're using something else, try searching Google for {read <format> in <language>}, e.g. {read .docx in ruby}.

If all of this is Greek to you and you have no prior experience with developing custom server-side web code, this is probably going to be unnecessarily painful and you should consider an alternative (like manually creating a 3-line text "preview" page for each regular page, and then just showing that).

Arkaaito