How do I get an MS-Word .doc into .html format?
Posted on 2011-05-02
I want to convert MS-Word .doc files into HTML documents that render exactly what you would see if you did 'print preview' in Word on the original .doc version.
If I open the doc in Word, use the 'save as webpage' option and then view the output in Firefox, I see the main body of each page, which is fine, but no page headers or page footers.
I've looked around for a utility that will do this conversion, but every one I tried failed to show the page headers and page footers. If I use a utility, it needs to be something I can run from a command prompt that accepts the input file name and output file name as parameters. Even the ones with a GUI failed to recognise the page headers, etc.
The only solution I have come up with is a virtual printer called Print2eDoc from Gnostice, which is installed as a printer under WinXP and 'prints' the Word print output to JPEG files, one for each page of the Word doc. I then have to wrap these JPEGs in HTML before I can render them in Firefox. This is messy because (a) any graphics in the Word doc degrade when put through the conversion and (b) the text in the Word doc degrades as well. Very bad. There is also the hassle of building the HTML document myself.
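For reference, the 'wrap the JPEGs in HTML' step can be sketched roughly as below. This is only a minimal illustration of the workaround, not a fix for the quality loss. The function names, the folder layout and the file naming (page_001.jpg, page_002.jpg, ...) are assumptions made for the example, not anything Print2eDoc is known to produce.

```python
import html
from pathlib import Path


def build_html(image_names, title):
    """Return an HTML page showing one image per printed page, in the order given."""
    parts = [
        "<!DOCTYPE html>",
        "<html>",
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{html.escape(title)}</title>",
        "</head>",
        "<body>",
    ]
    for number, name in enumerate(image_names, start=1):
        # Escape the file name so odd characters cannot break the attribute.
        src = html.escape(name, quote=True)
        parts.append(f'<p><img src="{src}" alt="Page {number}"></p>')
    parts += ["</body>", "</html>"]
    return "\n".join(parts) + "\n"


def wrap_folder(image_dir, output_name="index.html"):
    """Write the HTML page into image_dir, next to the JPEGs it refers to.

    Assumption: the JPEG file names sort into page order
    (for example zero-padded page numbers).
    """
    image_dir = Path(image_dir)
    names = sorted(p.name for p in image_dir.glob("*.jpg"))
    output = image_dir / output_name
    output.write_text(build_html(names, image_dir.name), encoding="utf-8")
    return output
```

Calling wrap_folder("C:/temp/mydoc_pages") (a hypothetical folder) would write index.html into that folder, and opening that file in Firefox shows the pages in sequence. The image links are relative, so the HTML file has to stay in the same folder as the JPEGs.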
How can I get the whole doc to appear in HTML?
Am I missing a setting in 'save as webpage' in Word?
Is there a much better utility which will capture Word print output to HTML?
Is there a much better conversion utility that does doc to HTML?
Is there a way of tweaking the HTML that 'save as webpage' produces so that the page headers/footers are rendered (the header/footer information is stored in a subfolder with the same name as the output file)?
Thanks for all your help.
Big points because I need a good answer quickly.