Is there a way to export a simple HTML page to Word (.doc format, not .docx) without having Microsoft Word installed?
Well, there are many third-party tools for this. I don't know if it gets any simpler than that.
Examples:
- http://htmltortf.com/
- http://www.brothersoft.com/windows-html-to-word-2008-56150.html
- http://www.eprintdriver.com/to_word/HTML_to_Word_Doc.html
I also found a VBScript, but I'm guessing that requires you to have Word installed.
If you only have simple HTML pages, as you said, they can be opened directly in Word.
Otherwise there are some libraries which can do this, but I don't have experience with them.
My last idea: if you are using ASP.NET, try setting the response's Content-Type header to application/msword,
and the page can then be saved as a Word document (it won't be a real Word doc, only HTML with a .doc name so that Word can open it).
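A minimal sketch of the header idea, assuming classic ASP.NET (System.Web). The handler name WordExportHandler, the file name export.doc, and the sample markup are made up for illustration; treat it as a starting point rather than a tested solution.

```csharp
using System.Web;

// Hypothetical handler: sends ordinary HTML but labels the response as a Word document.
public class WordExportHandler : IHttpHandler
{
    public const string WordContentType = "application/msword";

    public bool IsReusable { get { return true; } }

    public void ProcessRequest(HttpContext context)
    {
        HttpResponse response = context.Response;
        response.Clear();
        // Tell the browser this is a Word document and suggest a .doc file name.
        response.ContentType = WordContentType;
        response.AddHeader("Content-Disposition", "attachment; filename=export.doc");
        // The body is still plain HTML; Word opens it because of the type and extension.
        response.Write("<html><body><h1>Report</h1><p>Hello from HTML.</p></body></html>");
    }
}
```

The downloaded file is still HTML inside, so anything that inspects the real file format will not treat it as a binary .doc.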
I presume from the "C#" tag you wish to achieve this programmatically.
While it is possible to make a ".doc" Microsoft Word file, it would probably be easier and more portable to make a ".rtf" file.
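To give a feel for why RTF is the easier target: it is a plain-text format, so a very small document can be written by hand. The sketch below only handles plain text (not HTML), and the class name MinimalRtf and the Arial font choice are assumptions made for illustration.

```csharp
using System.Text;

static class MinimalRtf
{
    // Wraps plain text in a minimal RTF document that Word and WordPad can open.
    public static string FromPlainText(string text)
    {
        var sb = new StringBuilder(@"{\rtf1\ansi\deff0{\fonttbl{\f0 Arial;}}\f0\fs24 ");
        foreach (char c in text)
        {
            switch (c)
            {
                case '\\': sb.Append(@"\\"); break;   // escape RTF control characters
                case '{': sb.Append(@"\{"); break;
                case '}': sb.Append(@"\}"); break;
                case '\n': sb.Append(@"\par "); break; // newline becomes a paragraph break
                case '\r': break;                      // drop carriage returns
                default:
                    if (c < 128) sb.Append(c);
                    // Non-ASCII: \uN with a signed 16-bit code and a '?' fallback character.
                    else sb.Append(@"\u").Append((short)c).Append('?');
                    break;
            }
        }
        return sb.Append('}').ToString();
    }
}
```

Saving the result with System.IO.File.WriteAllText under a .rtf name gives a file that opens without Word being involved in producing it. Real HTML conversion (tables, images, styles) needs far more than this, which is where a library comes in.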
If it's just HTML, all you need to do is change the extension to .doc and Word will open it as if it's a Word document. However, if there are images to include or JavaScript to run, it can get a little more complicated.
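The rename trick above could be sketched in C# roughly like this. The class and method names (HtmlAsDoc, CopyAsDoc) are hypothetical, and no conversion happens: the bytes in the .doc are the same HTML.

```csharp
using System.IO;

static class HtmlAsDoc
{
    // Copies an HTML file next to itself with a .doc extension and returns the new path.
    public static string CopyAsDoc(string htmlPath)
    {
        string docPath = Path.ChangeExtension(htmlPath, ".doc");
        File.Copy(htmlPath, docPath, true); // true = overwrite an existing .doc
        return docPath;
    }
}
```

Usage would be something like HtmlAsDoc.CopyAsDoc(@"d:\Web.htm"), which leaves the original page in place and produces d:\Web.doc beside it.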
There's a tool called JODConverter which hooks into OpenOffice to expose its file format converters. It's available both as a web app (it sits in Tomcat) that you post to and as a command-line tool. I've been firing HTML at it and converting to .doc and PDF successfully. It's part of a fairly big project that hasn't gone live yet, but I think I'm going to be using it. http://sourceforge.net/projects/jodconverter/
We are using SautinSoft's HTML -> Word library to convert some HTML data to Word for compatibility with our application. Their component converted 6,136,940 database rows in total in 3 hrs 45 mins, good show.
This is a .NET component for converting HTML to RTF in C#; it is called HTML-to-RTF Pro DLL .Net.
It works without having MS Office.
Code sample:
SautinSoft.HtmlToRtf.Converter obj = new SautinSoft.HtmlToRtf.Converter();
obj.PreseveImages = true; //images will be embedded in RTF document
obj.ConvertFile(@"d:\Web.htm", @"d:\Web.rtf");