What is the best way to convert Word HTML to Word XML? I cannot buy a tool, so I need something free, preferably XSLT, that works reasonably well with basic formatting such as paragraphs, lists, bold, and italic.
XSLT on its own won't do you any good if you want to retain formatting defined outside the XHTML file (for example, in external style sheets). Besides, Word has been able to open (X)HTML files for a while. The result might not look as good as the original, but it works.
In fact, if you have Word and some skill with VBScript, I believe it is possible to write a script that opens an (X)HTML file and then saves it as WordML or plain old Word format if you're using Word 2003 or older, or as .docx if you have Word 2007.
Stephane Bouillon wrote a blog post about this over on MSDN. She supplies a pretty good XSLT transform that will do the job. It is designed for use with InfoPath and only supports the XHTML tags InfoPath produces, so you may need to modify it for your specific application. But it seems to work pretty well and should give you a starting point to work from.
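
To give a rough idea of the kind of mapping such a transform performs, here is a minimal sketch in Python rather than XSLT. It is not the transform from the blog post. It assumes well-formed XHTML input and Word 2003 WordprocessingML output, and it only handles paragraphs, bold, and italic. Lists are left out because they need extra list definitions on the Word side. The function name `xhtml_to_wordml` and the helper names are made up for this illustration, and I have not checked how Word renders the result.

```python
import xml.etree.ElementTree as ET

# Word 2003 WordprocessingML namespace (assumed target format).
W_NS = "http://schemas.microsoft.com/office/word/2003/wordml"
ET.register_namespace("w", W_NS)

BOLD_TAGS = {"b", "strong"}
ITALIC_TAGS = {"i", "em"}


def w(tag):
    """Return the namespace-qualified name of a WordML element."""
    return f"{{{W_NS}}}{tag}"


def local_name(tag):
    """Strip any XML namespace so 'p' and '{xhtml-ns}p' compare equal."""
    return tag.rsplit("}", 1)[-1].lower()


def add_run(paragraph, text, bold, italic):
    """Append one w:r (run) holding `text`, with optional bold/italic."""
    if not text:
        return
    run = ET.SubElement(paragraph, w("r"))
    if bold or italic:
        props = ET.SubElement(run, w("rPr"))
        if bold:
            ET.SubElement(props, w("b"))
        if italic:
            ET.SubElement(props, w("i"))
    ET.SubElement(run, w("t")).text = text


def add_inline(paragraph, node, bold=False, italic=False):
    """Walk inline content, carrying bold/italic state down the tree."""
    name = local_name(node.tag)
    bold = bold or name in BOLD_TAGS
    italic = italic or name in ITALIC_TAGS
    add_run(paragraph, node.text, bold, italic)
    for child in node:
        add_inline(paragraph, child, bold, italic)
        # Text after a child element belongs to this node, so it gets
        # this node's formatting, not the child's.
        add_run(paragraph, child.tail, bold, italic)


def xhtml_to_wordml(xhtml):
    """Hypothetical helper: convert the <p> elements of an XHTML string."""
    source = ET.fromstring(xhtml)
    doc = ET.Element(w("wordDocument"))
    body = ET.SubElement(doc, w("body"))
    for node in source.iter():
        if local_name(node.tag) == "p":
            add_inline(ET.SubElement(body, w("p")), node)
    return ET.tostring(doc, encoding="unicode")


if __name__ == "__main__":
    sample = "<html><body><p>Hello <b>bold <i>both</i></b> end</p></body></html>"
    print(xhtml_to_wordml(sample))
```

An XSLT version would follow the same shape: one template for `p` that emits `w:p`, and templates for `b` and `i` that emit runs with the matching `w:rPr` children.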