I have some HTML files that I need to distribute in MS Word doc format (don't ask!). I can manually open each in OpenOffice and then save as a doc file. But I have quite a few files so I want to automate this. Do you know a way?

+2  A: 

I haven't tested it, but the pyuno package gives access to the OpenOffice API, and the following program shows some ways to do this kind of conversion.

ooextract.py

A command-line tool that extracts the text, HTML, or PDF content from a StarWriter document and writes it to a different file or (optionally) prints it to stdout (grep your office documents).
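For a rough idea of what the pyuno route could look like, here is an untested sketch. It assumes an office instance is already listening on port 2002 (for example started with something like `soffice -headless -accept="socket,host=localhost,port=2002;urp;"`; newer versions use double dashes) and that the script runs under the Python that ships with the office suite. The function names `doc_path_for` and `convert_html_to_doc` are made up for this example, and the export filter name "MS Word 97" is my assumption for the DOC format.

```python
from pathlib import Path


def doc_path_for(html_path, out_dir):
    """Return the .doc path that corresponds to an input HTML file."""
    return Path(out_dir) / (Path(html_path).stem + ".doc")


def convert_html_to_doc(html_paths, out_dir, port=2002):
    """Convert each HTML file to DOC through a running office instance."""
    # Imported lazily: these modules come with the office suite's Python.
    import uno
    from com.sun.star.beans import PropertyValue

    def prop(name, value):
        p = PropertyValue()
        p.Name = name
        p.Value = value
        return p

    # Connect to the office process that is listening on the socket.
    local_ctx = uno.getComponentContext()
    resolver = local_ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local_ctx)
    ctx = resolver.resolve(
        "uno:socket,host=localhost,port=%d;urp;StarOffice.ComponentContext" % port)
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)

    for html_path in html_paths:
        src = Path(html_path).resolve().as_uri()
        dst = doc_path_for(html_path, out_dir).resolve().as_uri()
        # Load without showing a window, then export with the Word filter.
        doc = desktop.loadComponentFromURL(src, "_blank", 0, (prop("Hidden", True),))
        try:
            # "MS Word 97" is the assumed export filter name for .doc.
            doc.storeToURL(dst, (prop("FilterName", "MS Word 97"),))
        finally:
            doc.close(True)
```

Calling `convert_html_to_doc(Path("pages").glob("*.html"), "out")` would then process a whole directory (the directory names are placeholders). HTML files may open in the Writer/Web module, in which case the Word filter might be refused; passing an import filter such as "HTML (StarWriter)" when loading is one thing to try.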

S.Mark
looks neat - I will try it out
Plumo
+1  A: 

Abiword can convert files from the command line.

I haven't personally tried it to convert HTML to DOC, but since it supports both those formats, it seems like it's worth a try.

Also, would RTF be good enough? There are lots of converters for HTML->RTF.
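If RTF turns out to be acceptable, a batch run could be scripted along these lines. This is only a sketch: it assumes the `abiword` binary is on the PATH and accepts a `--to=FORMAT` option, and the helper names `abiword_command` and `convert_all` are invented for the example.

```python
import subprocess
from pathlib import Path


def abiword_command(html_path, fmt="rtf", abiword="abiword"):
    """Build the argument list for converting one file with AbiWord."""
    return [abiword, "--to=" + fmt, str(html_path)]


def convert_all(src_dir, fmt="rtf"):
    """Run AbiWord once for every .html file in src_dir."""
    for html_path in sorted(Path(src_dir).glob("*.html")):
        # check=True raises CalledProcessError if AbiWord exits with an error.
        subprocess.run(abiword_command(html_path, fmt), check=True)
```

As far as I know the converted file is written next to the input with the new extension, but I have not verified that.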

Eric Seppanen
It seems AbiWord can render to RTF, but not to DOC, from the command line.
Plumo
A: 

It's so annoying... We are trying to build a DOCX converter for http://www.cbcjobs.com, yet we can't find a command-line converter that handles DOCX.

serge