How do I get rid of Microsoft's HTML code in a doc so it can be used easily on a website?

I am using various versions of Microsoft Word. The HTML code it creates is non-standard. I can fix it by using Dreamweaver's Commands > Clean Up Word HTML and then reverting back to text; copying and pasting that directly into a web page works a treat. But not everyone has Dreamweaver. How can I accomplish the same thing with some free software, please?
hotweb99 Asked:
John-Charles-Herzberg Commented:
Have you tried saving the Word document as RTF and then copying the output into your HTML tool?
John-Charles-Herzberg Commented:
If you would like, this might be an option:

http://word2cleanhtml.com

Word2cleanhtml cleans up HTML pasted from Word documents. It applies filters to fix various things that Microsoft Office puts in its HTML and gives you a well formatted result that you can paste directly into a web page or content editing system.

or this: http://www.textfixer.com/html/convert-word-to-html.php

Convert Word DOC to HTML
This free online word converter tool will take the contents of a doc or docx file and convert the word text into HTML code. It produces a much cleaner html code than the Microsoft Word software normally produces. This doc converter strips as many unnecessary styles and extra mark-up code as it can. It does not preserve images but it does preserve html links and other basic html formatting tags like bolding in the conversion process.

This page uses what is referred to as a client-side script, which means that all the converting is done on your computer. The contents of the Word document are not sent to my server, so if confidentiality is a concern then this tool is an appropriate solution.

Word to HTML application

http://word-to-html.com

Converting Word documents to HTML never was this easy! Word-to-HTML is a peerless tool that will immediately boost Your productivity:
Generate clean HTML from any Word file
Convert .doc, .docx and .rtf files
Supports all existing versions of Office
Convert multiple .doc files at once
Preserves all data in a document including images, equations and diagrams
Works from command-line and scripts
Perfect support for documents with international characters
Produce clean, standard-compliant HTML output fit for further editing
Make your articles, essays, documentation and all kinds of paperwork web-ready with no effort
Cleanest output possible.
Scott Fell (Developer & EE Moderator) Commented:
You can use either http://www.tinymce.com/ or http://ckeditor.com/. Both are free. You can upload either one to your site as a static HTML editor. They have a "Paste from Word" feature, but don't use it. Instead, use "Paste as text". That will create simple P or div tags around paragraphs and strip out things like colored text, bold, etc.

There is no good way to take out the bad code that MS generates other than starting from scratch.
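
For anyone who wants to script the "start from scratch" route, the "Paste as text" result described above (plain paragraphs wrapped in P tags, no colors or bold) can be approximated in a few lines. This is only a rough sketch in Python, assuming the Word content has already been copied out as plain text with blank lines between paragraphs. The function name text_to_paragraphs is made up for this example and has nothing to do with TinyMCE or CKEditor internals.

```python
import html
import re


def text_to_paragraphs(text: str) -> str:
    """Wrap each blank-line-separated block of plain text in a <p> tag."""
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Paragraphs are assumed to be separated by one or more blank lines.
    blocks = re.split(r"\n\s*\n", text.strip())
    paragraphs = []
    for block in blocks:
        # Collapse runs of whitespace, then escape &, <, > and quotes.
        clean = html.escape(" ".join(block.split()))
        if clean:
            paragraphs.append(f"<p>{clean}</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "First paragraph from Word.\r\n\r\nTom & Jerry <draft>"
    print(text_to_paragraphs(sample))
```

All formatting is lost on purpose, so headings, lists and links have to be re-applied by hand afterwards.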

Dave Baldwin (Fixer of Problems) Commented:
Note that Microsoft Word (and Office) HTML is still oriented towards printing; it is not meant for web pages.
hotweb99 (Author) Commented:
http://ckeditor.com/ works like a dream, fantastic. I will also try it on the website after I see how secure it is. Thanks for your help.
BillDL Commented:
Exactly as stated by Dave Baldwin.

You can, however, strip out some of the Microsoft-centric crap if you use File menu > Save As > "Web Page, Filtered (*.htm, *.html)".
Depending on the version of Word, this may only be available as a Word add-in.

Although I haven't ever done so, another option in Word that should go part of the way towards removing Word junk is to force it to use CSS for formatting the fonts.
Tools > Options > General tab > Web Options > Browser tab > "Rely on CSS for font formatting"
Tools menu > Templates and Add-Ins > Linked CSS > Add > browse for your *.CSS file(s)
The styles from the cascading style sheet will then appear in the Styles and Formatting task pane (Format menu > Styles and Formatting) and you may be able to quickly apply these to the file before saving as a web page.

Personally I would just open the Word document > Select All > Copy > paste into a free but pretty well featured HTML editor like Kompozer using the Edit > "Paste without formatting" option, then reformat and save out as a compliant HTML file.

I have found that if I open a Word 97-2003 *.doc file in the open source LibreOffice Writer application and then use File > Preview in Web Browser, it strips out the Microsoft code and leaves more standard HTML code. My installation is buggy when I use the File menu > Wizards > Web Page option, but the OpenOffice updates page is playing up so I can't update and test it. This is the coding that will be output if I do a File > Save As > Web page.

Try out the utilities suggested by  John-Charles-Herzberg, Padas and hotweb99 though, because the experts have clearly researched and picked them out specifically for your needs.
Scott Fell (Developer & EE Moderator) Commented:
>after I see how secure

You will ALWAYS want to scrub your input data server-side before you do any DB inserts or updates. Even if you rely on JS validation on the client, you still need to do it on the server.

If I were copying from MS Word as a one-time thing, I would also do as BillDL suggests and copy from MS Word into a plain text editor (not WordPad) and then format from there.

It sounded like you need your users to do this, and that is where these WYSIWYG editors come in, if they are used properly. Users can still paste directly from Word without using the "Paste as text" feature, and then all the bad stuff goes with it.
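
On the server-side scrubbing point above, here is a minimal sketch in Python of an allow-list scrubber for whatever the editor submits. It uses only html.parser from the Python standard library. The names ALLOWED_TAGS, DROP_CONTENT, WordHtmlScrubber and scrub_html are made up for this illustration, and the tag list is an assumption about what a simple page needs. It keeps a handful of plain tags, drops every attribute except http(s) links, and throws away script and style blocks entirely. Treat it as a way to see the idea, not as a replacement for a maintained sanitizer library or for parameterised DB queries.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allow-list: adjust to whatever your pages actually need.
ALLOWED_TAGS = {"p", "br", "b", "strong", "i", "em", "ul", "ol", "li", "a"}
VOID_TAGS = {"br"}                 # tags that never get a closing tag
DROP_CONTENT = {"script", "style"}  # tags whose contents are discarded too


class WordHtmlScrubber(HTMLParser):
    """Rebuilds the input using only allow-listed tags and escaped text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags (span, font, o:p, ...) vanish, text stays
        if tag == "a":
            href = dict(attrs).get("href") or ""
            # Keep only plain web links; javascript: and the like are dropped.
            if href.lower().startswith(("http://", "https://")):
                self.out.append(f'<a href="{escape(href)}">')
            else:
                self.out.append("<a>")
        else:
            # class, style, on* handlers and all other attributes are dropped.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def scrub_html(dirty: str) -> str:
    parser = WordHtmlScrubber()
    parser.feed(dirty)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.out)


if __name__ == "__main__":
    pasted = '<p class="MsoNormal" style="margin:0"><span style="color:red">Hi</span> <b>there</b><o:p></o:p></p>'
    print(scrub_html(pasted))
```

HTML comments, which is where Word hides much of its conditional markup, are ignored by the parser and so never reach the output.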