I have MS Word files that I am converting to HTML using Word's "Save As" feature. I tried changing the "Save As" options to save as UTF-8, but Word does a poor job of it. So I save as 'windows-1252' instead and then convert, either with Java or by opening the file in TextPad and saving it as UTF-8. The result is usable, but stray garbled characters still show up in many places. Is there a tool that can cleanly convert an HTML document from 'windows-1252' to UTF-8?
Or, more generally, do you know of a better process for turning Word files into HTML?
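
For context, here is a minimal sketch of the kind of Java conversion step described above. It is not the exact code in use. The class name `Cp1252ToUtf8`, the method names, and the command-line arguments are hypothetical, and it assumes the saved file really is 'windows-1252' and declares its encoding as `charset=windows-1252` in a meta tag. One possible cause of leftover garbled characters is that the bytes get re-encoded while that declaration is left unchanged, so the sketch rewrites it as well.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: re-encodes a windows-1252 HTML file as UTF-8.
public class Cp1252ToUtf8 {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Decode the raw bytes explicitly as windows-1252 (never rely on the
    // platform default), then update the charset declaration so it matches
    // the encoding the file will be written in.
    static String decodeAndFixDeclaration(byte[] cp1252Bytes) {
        String html = new String(cp1252Bytes, CP1252);
        // Case-insensitive match; keeps an optional opening quote after "charset=".
        return html.replaceAll("(?i)charset=([\"']?)windows-1252", "charset=$1utf-8");
    }

    // Read the input file as bytes, convert, and write the result as UTF-8.
    static void convertFile(Path in, Path out) throws IOException {
        String html = decodeAndFixDeclaration(Files.readAllBytes(in));
        Files.write(out, html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Cp1252ToUtf8 <input.html> <output.html>");
            return;
        }
        convertFile(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Run as `java Cp1252ToUtf8 input.html output.html`, where both file names are placeholders. If the output still shows odd characters, the input may not have been 'windows-1252' to begin with, which this sketch does not detect.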