Hi

In our project we work a lot with both HTML and MS Word. The users create "documents" in their browsers, and when they are finished they export these documents to MS Word using the DocX library (http://docx.codeplex.com/). This works fine as long as we only handle plain text.

What we want to do now is let the user format the text that is entered in the browser. This is easy to implement using any of the WYSIWYG browser editors. The problem is that we want to take the styled HTML and export it to Word as well.

I have seen commercial components that claim to convert HTML to RTF, so I thought that might solve it, but I am still waiting for a response on whether DocX supports RTF text. The best solution would be to convert HTML directly to the DocX format, but the only product I have seen with this functionality is ASPOSE, and ASPOSE is really expensive.

Does anyone have any idea of how to solve this? How can I get my HTML to a docx file?

Thanks!

A: 

This is somewhat ugly (considering the resources) but it's an option: Batch CommandLine FileConversion with OpenOffice. It should be able to convert from HTML -> Doc (which then DocX might be able to process).

soffice.exe -headless -nologo -norestore -accept="socket,host=localhost,port=8100;urp;StarOffice.ServiceManager"
python DocumentConverter.py test.html test.doc
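
If the export is triggered from code rather than by hand, the second command above could be wrapped roughly as follows. This is only a sketch under assumptions: it presumes the soffice listener from the first command is already running, that DocumentConverter.py sits in the working directory and takes an input and an output path exactly as shown above, and that the editor delivers an HTML fragment that should be wrapped in a full document first. The function names (wrap_fragment, build_command, convert) are made up for illustration.

```python
import html
import subprocess
from pathlib import Path

# Assumption: the converter script from the command above is in the
# current working directory.
CONVERTER_SCRIPT = "DocumentConverter.py"


def wrap_fragment(fragment, title="Export"):
    """Wrap an editor's HTML fragment in a complete HTML document."""
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{html.escape(title)}</title></head>\n"
        f"<body>{fragment}</body></html>\n"
    )


def build_command(src, dst, python="python"):
    """Build the argument list for: python DocumentConverter.py src dst"""
    return [python, CONVERTER_SCRIPT, str(src), str(dst)]


def convert(fragment, workdir, name="export"):
    """Write the fragment to <name>.html and convert it to <name>.doc.

    Requires the soffice listener to be running already; raises
    subprocess.CalledProcessError if the converter exits with an error.
    """
    workdir = Path(workdir)
    src = workdir / f"{name}.html"
    dst = workdir / f"{name}.doc"
    src.write_text(wrap_fragment(fragment), encoding="utf-8")
    subprocess.run(build_command(src, dst), check=True)
    return dst
```

Calling convert("<p><b>Hello</b> world</p>", ".") would then write export.html and, if the conversion succeeds, leave export.doc next to it. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.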
Marcel J.
Thanks for the suggestion but it is not really what I was looking for.
A: 

Would Aspose really be that expensive compared with the time you would spend finding a framework that does what you want, then testing and deploying the solution? Years back we used XSLT to create RTF documents, but had Aspose been around then, I would have chosen it purely for the time it would have saved me.

David Robbins
Aspose is definitely more expensive than I had planned for this project, so I am hoping I can find a cheaper solution somehow. But of course it might be required in the end.