How to Cut & Paste Word file into interface and keep formatting?

Posted on 2004-10-21
Last Modified: 2013-12-16

I have a guy who is trying to save time by having me create an interface that will basically write webpages for him.

I have most of it figured out, using mostly solutions from ASP and SQL, except this one problem.

He gets articles through his email, in the form of Word documents.  He currently pastes each Word file into GoLive and has to go through and change the formatting back to the way it was.  He wants a way to cut and paste the contents of the Word file and keep all the formatting, such as bold, italic, alignment etc.

What do I write in my interface so that, when he pastes his Word content into the textarea of the form, it keeps the formatting?  If that isn't possible, are there any other ways to accomplish this?

I currently use primarily ASP, HTML, Javascript, CSS, and SQL for most of my work, but I am familiar with XML, PHP, and ASP.NET so I don't mind using one of these languages to do the job if I have to.

Also, I'm trying to keep this as compatible as possible, as the guy uses a Powerbook with Safari as his browser.  If it HAS TO BE Windows and IE, then that's fine.

300 points up for grabs!
Question by: DrinkGreen
    LVL 53

    Expert Comment

    A textarea can only hold plain text.  There is nothing you can do with cut-and-paste .doc output; it won't even handle RTF correctly.  If you want to retain the Word formatting then it has to be saved as HTML, in which case Word will generate just about the worst code imaginable.  Not only will it not be cross-browser, it won't even render very well in IE.


    Author Comment

    Well I notice that when you cut and paste it directly into Frontpage for example, it keeps the format.  But is this the horrible code you speak of?  It does look out of the ordinary.

    Is there any other way this can be done?  A PHP interface of some sort?
    LVL 10

    Accepted Solution

    No, COBAL is speaking of having Word create the web page for you.
    You can 'save as' an HTML page.
    Microsoft puts its own code in the 'background' of the HTML.  This extra code is not compatible with most browsers.

    One way of cleaning this code is to have Word save the doc as an HTML page.
    Open the page in Dreamweaver.
    Dreamweaver has built-in commands that clean up Word code; you just have to tell it to clean it.

    Hope that helps.
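
    For a sense of what such a cleanup pass does, here is a rough sketch in JavaScript. It is not Dreamweaver's actual algorithm; `cleanWordHtml` is a hypothetical helper, and it assumes the Word-specific markup shows up as conditional comments, namespaced tags such as `<o:p>`, `Mso*` class names, and `mso-*` style declarations.

```javascript
// Hypothetical helper: a rough approximation of a "clean up Word HTML" pass.
// Illustration only; real Word output has many more quirks than this handles.
function cleanWordHtml(html) {
  return html
    // Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    .replace(/<!--\[if[\s\S]*?<!\[endif\]-->/gi, "")
    // Drop namespaced Office tags such as <o:p> and </o:p>, keeping inner text
    .replace(/<\/?[a-z]+:[^>]*>/gi, "")
    // Drop Word class names (class=MsoNormal, class="MsoTitle")
    .replace(/\s+class=("Mso[^"]*"|'Mso[^']*'|Mso[^\s>]*)/gi, "")
    // In double-quoted style attributes, keep only non-mso declarations
    .replace(/\s+style="([^"]*)"/gi, function (match, css) {
      var kept = css.split(";")
        .map(function (decl) { return decl.trim(); })
        .filter(function (decl) { return decl && !/^mso-/i.test(decl); });
      return kept.length ? ' style="' + kept.join(";") + '"' : "";
    });
}
```

    Ordinary declarations such as `text-align:center` survive, so bold, italic and alignment are kept while the Word-only markup is dropped. Single-quoted style attributes are left untouched in this sketch.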

    LVL 2

    Expert Comment

    Hi ,
    Try cutting and pasting your Word doc content directly into Dreamweaver (DWMX2004). Dreamweaver handles imports from Word documents better, and the 'Clean Word HTML' option improves the result further.

    LVL 31

    Expert Comment

    You can use a contentEditable div, but the generated HTML is dreadful (as pointed out by &CD). You can access the HTML through the div's innerHTML property and submit it to the server as part of a form.

    <input type="button" value="show html" onclick="alert(document.getElementById('t1').innerHTML)">
    <p>Copy, paste, edit or type in here</p>
    <div contentEditable="true" id="t1" style="border:solid 2px;height:200px;width:50%"></div>
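
    To submit the pasted markup with a form, one option is to copy innerHTML into a hidden field just before the form is sent. A minimal sketch follows; `copyEditorHtml` and the field name `body_html` are made-up names for illustration, not part of the snippet above.

```javascript
// Hypothetical helper: copy the editable region's markup into a form field
// so it is posted to the server along with the rest of the form.
function copyEditorHtml(editor, field) {
  field.value = editor.innerHTML;
  return field.value;
}

// Possible wiring in the page (assumes a hidden input named "body_html"):
// <form method="post"
//       onsubmit="copyEditorHtml(document.getElementById('t1'), this.elements['body_html'])">
//   <input type="hidden" name="body_html">
// </form>
```

    The server-side script then reads `body_html` like any other posted field.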
    LVL 31

    Expert Comment

    the above is IE only  (and Cd& not &CD, sorry)
    LVL 2

    Expert Comment

    Save the Word document as an RTF (rich text) file, then use an RTF converter to change it to HTML. Arachnophilia, an HTML editor you can obtain here, does a good job.
    LVL 10

    Expert Comment

    If you want to use a cut and paste option you will always get the worst code.  There is a way of achieving this if you don't care about compatibility and bad, bloated code.

    You can use the DHTMLEdit box.

    You first need to get the dhtmled.js file and include that in your code.

    Then you need some script directly in your page.
    function stripHTML(string) {
      // Remove BODY tags from DHTMLEdit box
      var ind1, ind2;
      ind1 = string.indexOf("<BODY");
      ind1 = string.indexOf(">", ind1);
      ind1++;
      ind2 = string.indexOf("</BODY");
      if ((ind1 > 0) && (ind2 > 0)) {
        string = string.substring(ind1, ind2);
      }
      return tidyUp(string);
    }
    function tidyUp(string) {
      // Change STRONG formatting to B (strong to bold)
      while (string.indexOf("<STRONG>") >= 0)
        string = string.replace("<STRONG>","<B>");
      while (string.indexOf("</STRONG>") >= 0)
        string = string.replace("</STRONG>","</B>");
      return string;
    }
    <SCRIPT ID=clientEventHandlersJS LANGUAGE=javascript>
    function window_onload() {
      // Load any previously saved HTML into the edit boxes
      if (document.eccedit.h1_PageCaption.value != null && document.eccedit.h1_PageCaption.value != "") {
        document.all.DHTMLEdit1.DocumentHTML = document.eccedit.h1_PageCaption.value;
      }
      if (document.eccedit.h1_PageColumn1.value != null && document.eccedit.h1_PageColumn1.value != "") {
        document.all.DHTMLEdit2.DocumentHTML = document.eccedit.h1_PageColumn1.value;
      }
    }
    function button3_onclick() {
      // (body missing from the post as it appears here)
    }
    onerror=displayError;   // Binds error event to "displayError" routine
    function displayError(msg, url, line) {
      // Error handling routine
      alert("The following error occurred:\n\n" + msg);
      return true;  // Suppresses Internet Explorer error
    }
    </SCRIPT>

    Then you need the edit box and a hidden textarea to get the code into a form.
    <object classid="clsid:2D360201-FFF5-11D1-8D03-00A0C959BC0A" id="DHTMLEdit1" height="135" width="350" VIEWASTEXT><embed height="135" width="350"></embed></object>
    <div style="display:none;"><textarea name="h1_PageCaption" title="Quote" cols="38" rows="62"><%=PageCaption%></textarea></div>

    This is an IE only solution.
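
    The same BODY extraction and STRONG-to-B conversion can also be written with regular expressions, which handles lowercase tags as well. This is only a sketch of an alternative, not the code from dhtmled.js; `extractBody` is a hypothetical name.

```javascript
// Hypothetical variant of stripHTML/tidyUp using case-insensitive regexes.
function extractBody(html) {
  // Capture everything between <body ...> and </body>, in any letter case
  var match = /<body[^>]*>([\s\S]*?)<\/body\s*>/i.exec(html);
  // Fall back to the whole string when there is no BODY element
  var inner = match ? match[1] : html;
  // Rewrite <strong> and </strong> to <B> and </B> in one pass
  return inner.replace(/<(\/?)strong>/gi, "<$1B>");
}
```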

    Author Comment

    I'm still trying to test this idea out.  I have not abandoned it.  I'm trying to find the best solution.

    LVL 20

    Expert Comment

    Please answer to the experts above. Thanks


    Author Comment

    Thanks all for the patience and the help.  I had to work with this for quite a while.  The best solution I found was simply telling the guy to paste it into Dreamweaver and run the Clean Word HTML command.

    As the first one to suggest this, I awarded rockmansattic the points.

    Thanks again!
