How to Cut & Paste Word file into interface and keep formatting?


I have a guy who is trying to save time by having me create an interface that will basically write webpages for him.

I have most of it figured out, using mostly solutions from ASP and SQL, except this one problem.

He gets articles through his email, in the form of Word docs.  He currently pastes each Word file into GoLive and has to go through and change the formatting back to the way it was.  He wants a way to cut and paste the contents of the Word file and keep all the formatting, such as bold, italic, alignment, etc.

What do I write in my interface so that when he pastes his Word content into the textarea of the form, the formatting is kept?  If that can't be done, are there any other ways to accomplish this?

I currently use primarily ASP, HTML, Javascript, CSS, and SQL for most of my work, but I am familiar with XML, PHP, and ASP.NET so I don't mind using one of these languages to do the job if I have to.

Also, I'm trying to keep this as compatible as possible, as the guy uses a PowerBook with Safari as his browser.  If it HAS TO BE Windows and IE, then that's fine.

300 points up for grabs!
No, COBAL is speaking of having Word create the webpage for you.
You can 'Save As' an HTML page.
Microsoft puts its own code in the 'background' of the HTML.  This extra code is not compatible with most browsers.

One way of cleaning this code is to have Word save the doc as an HTML page.
Open the page in Dreamweaver.
Dreamweaver has a built-in command that cleans up Word code; you just have to tell it to clean it.
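
For a rough idea of what this kind of cleanup involves, here is a minimal JavaScript sketch. It is not Dreamweaver's actual algorithm: the function name cleanWordHtml and the list of patterns are assumptions chosen for illustration, and regex-based cleanup will miss cases that a real HTML parser would catch.

```javascript
// Hypothetical helper: strips common Word-specific markup from an HTML string.
// The patterns below are illustrative assumptions, not an exhaustive list.
function cleanWordHtml(html) {
  var out = html;
  // Drop HTML comments, including <!--[if gte mso 9]> ... <![endif]--> blocks
  out = out.replace(/<!--[\s\S]*?-->/g, "");
  // Drop leftover conditional markers such as <![if !supportLists]> and <![endif]>
  out = out.replace(/<!\[[^>]*>/g, "");
  // Drop Office namespace tags such as <o:p> and </o:p>
  out = out.replace(/<\/?[a-z][a-z0-9]*:[^>]*>/gi, "");
  // Drop Word class attributes such as class=MsoNormal or class="MsoNormal"
  out = out.replace(/\s+class=("?)Mso[^\s">]*\1/gi, "");
  // Drop mso-* declarations but keep ordinary CSS such as text-align
  out = out.replace(/mso-[^;"']+;?\s*/gi, "");
  // Drop style attributes that are now empty
  out = out.replace(/\s+style=(["'])\s*\1/gi, "");
  return out;
}
```

Because ordinary declarations such as text-align are left alone, bold, italic and alignment should survive, while most of the Word-only markup is discarded.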

Hope that helps.

A textarea can only hold plain text, so any pasted formatting is lost.  There is nothing you can do with cut-and-paste .doc output; it won't even handle RTF correctly.  If you want to retain the Word formatting, the document has to be saved as HTML, in which case Word will generate just about the worst code imaginable.  Not only will it not be cross-browser, it won't even render very well in IE.

DrinkGreenAuthor Commented:
Well I notice that when you cut and paste it directly into Frontpage for example, it keeps the format.  But is this the horrible code you speak of?  It does look out of the ordinary.

Is there any other way this can be done?  PHP interface of some sorts?

Hi,
Try cutting and pasting your Word doc content directly into Dreamweaver (DWMX2004). Dreamweaver handles imports from Word documents better, and its 'Clean Word HTML' option tidies up the result further.

You can use a contentEditable div, but the generated HTML is dreadful (as pointed out by &CD). You can read the HTML through the div's innerHTML property and submit it to the server as part of a form.

<input type="button" value="show html" onclick="alert(document.getElementById('t1').innerHTML)">
<p>Copy, paste, edit or type in here</p>
<div contentEditable="true" id="t1" style="border:solid 2px;height:200px;width:50%"></div>
The above is IE only.  (And it's Cd&, not &CD, sorry.)
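
As a rough sketch of the "submit it as part of a form" step: the pasted markup can be copied into a hidden form field just before the form is sent. The helper name copyEditableHtml and the field name body_html are hypothetical; only the id t1 comes from the snippet above.

```javascript
// Hypothetical helper: copies the markup of an editable element into a form
// field so that it is posted to the server along with the rest of the form.
// `editable` is anything with an innerHTML string property (e.g. the div t1);
// `field` is anything with a writable value property (e.g. a hidden input).
function copyEditableHtml(editable, field) {
  field.value = editable.innerHTML;
  return field.value;
}
```

Assuming a hidden input named body_html inside the form, it could be wired up as <form onsubmit="copyEditableHtml(document.getElementById('t1'), this.elements['body_html'])">, and the server-side ASP page would then read the field like any other posted value.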
Save the Word document as an RTF (rich text) file, then use an RTF converter to change it to HTML. Arachnophilia, an HTML editor, does a good job of this.
If you want to use a cut-and-paste option you will always get the worst code.  There is a way of achieving this if you don't care about compatibility and bad, bloated code.

You can use the DHTMLEdit box.

You first need to get the dhtmled.js file and include it in your code.

Then you need some script directly in your page.
function stripHTML(string) {
  // Remove BODY tags from DHTMLEdit box
  var ind1, ind2;
  ind1 = string.indexOf("<BODY");
  ind1 = string.indexOf(">", ind1);
  ind1++;
  ind2 = string.indexOf("</BODY");
  if ((ind1 > 0) && (ind2 > 0)) {
    string = string.substring(ind1, ind2);
  }
  return tidyUp(string);
}
function tidyUp(string) {
  // Change STRONG formatting to B (strong to bold)
  while (string.indexOf("<STRONG>") >= 0) {
    string = string.replace("<STRONG>","<B>");
  }
  while (string.indexOf("</STRONG>") >= 0) {
    string = string.replace("</STRONG>","</B>");
  }
  return string;
}
<SCRIPT ID=clientEventHandlersJS LANGUAGE=javascript>
function window_onload() {
  // Load any saved content from the hidden textareas into the edit boxes
  if (document.eccedit.h1_PageCaption.value != null && document.eccedit.h1_PageCaption.value != "") {
    document.all.DHTMLEdit1.DocumentHTML = document.eccedit.h1_PageCaption.value;
  }
  if (document.eccedit.h1_PageColumn1.value != null && document.eccedit.h1_PageColumn1.value != "") {
    document.all.DHTMLEdit2.DocumentHTML = document.eccedit.h1_PageColumn1.value;
  }
}
function button3_onclick() {
  // (body not included in the original post)
}
onerror=displayError;   // Binds error event to "displayError" routine
function displayError(msg, url, line) {
  // Error handling routine
  alert("The following error occurred:\n\n" + msg);
  return true;  // Suppresses Internet Explorer error
}
</SCRIPT>

Then you need the edit box and a hidden textarea to get the code into a form.
<object classid="clsid:2D360201-FFF5-11D1-8D03-00A0C959BC0A" id="DHTMLEdit1" height="135" width="350" VIEWASTEXT><embed height="135" width="350"></embed></object>
<div style="display:none;"><textarea name="h1_PageCaption" title="Quote" cols="38" rows="62"><%=PageCaption%></textarea></div>
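
To get the edited content back into the hidden textarea before the form is submitted, something along these lines could be used. This is only a sketch: the names extractBody and copyEditorToField are hypothetical, and the only thing assumed about the edit box is the DocumentHTML property already used in the script above.

```javascript
// Hypothetical helper: returns the markup between <BODY ...> and </BODY>,
// or the input unchanged when no BODY element is found.
function extractBody(html) {
  var match = /<body[^>]*>([\s\S]*?)<\/body>/i.exec(html);
  return match ? match[1] : html;
}

// Hypothetical helper: copies an edit box's document into a form field.
// `editor` is anything exposing a DocumentHTML string property;
// `field` is anything with a writable value property (e.g. the textarea).
function copyEditorToField(editor, field) {
  field.value = extractBody(editor.DocumentHTML);
  return field.value;
}
```

A save button handler might then call copyEditorToField(document.all.DHTMLEdit1, document.eccedit.h1_PageCaption) and follow it with document.eccedit.submit().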

This is an IE only solution.
DrinkGreenAuthor Commented:
I'm still trying to test this idea out.  I have not abandoned it; I'm trying to find the best solution.

Please answer to the experts above. Thanks

DrinkGreenAuthor Commented:
Thanks, all, for the patience and the help.  I had to work with this for quite a while.  The best solution I found was simply telling the guy to put it into Dreamweaver and run the Clean Word HTML command.

As the first one to suggest this, I awarded rockmansattic the points.

Thanks again!
Question has a verified solution.
