SAbboushi asked:

copy text from webpage to text editor in usable format

I'm getting frustrated -- when I want to copy code from a webpage to my text editor, I often have to spend a lot of time reformatting it because the page author did not embed line terminators

e.g. copying from one of the sections (e.g. html or css or script) at http://www.queness.com/post/77/simple-jquery-modal-window-tutorial#login

What is the easiest way to copy such text to a text editor so that the html is converted and formatted the way it appears on the web page?
Scott Fell

Do you mean the minified JS code? That is done purposely to keep the file size down. Typically, there is no need for the user to view it. You can expand it with something like http://jsbeautifier.org/. If you are trying to save the actual page in some form you can view later offline, try Save As in your browser and select HTML or full web page. You can then open that file in Word and it may look similar to the page. But just know the code used to show on screen is not typically meant to be saved offline in another editor.
SAbboushi (ASKER)

Thanks for your input - no, I'm not referring to minified js code.

>> just know the code used to show on screen is not typically meant to be saved offline in another editor.

Agreed - but my question remains:

What is the easiest way to copy such text to a text editor so that the html is converted and formatted the way it appears on the web page?  

I am talking about what I select, not the whole page -- see the link to get an idea of what I am after
I'm not sure what to look at on the link. Many sites today use multiple style sheets and JavaScript, and account for different browsers with different code. So it is difficult to copy just a portion and have it look like what you see.

In that link there is a heading on the page for INTRODUCTION that contains bullets. If you copy and paste to Word, for example, you will get round bullets, not the images. This is because you are just copying the html code, not all the css, images and js. Word will render the html <ul><li></li></ul> as an unordered list that will have a default bullet.

You can try Save As in your browser and choose "Web page complete". If you have Adobe Acrobat, you can create a PDF from a URL, then just take what you need. You can try selecting and copying the page, then pasting into Word (this gets close, but not always 100%).

You can try to take a screen print and use OCR software / Acrobat to convert the text.
SAbboushi--Have you tried copying and then using Paste Special to paste into Word (File|Paste Special)?

http://www.personal-computer-tutor.com/pastespecial.htm

http://projects.gnome.org/gnumeric/doc/sect-movecopy-pastespecial.shtml
My apologies... I was not clear enough...

On that page there are 3 sections:
1. HTML code and A tag attributes
2. CSS code
3. Javascript

Each one of those sections has a list of code that I want to copy and paste into my text editor.

To clarify, my question is a general one with this page being an example.
jcimarron: thanks but that does not seem to get the job done:
pasting in html or rtf format addresses the lack of line terminators, but it does not retain the formatting (i.e. the indents)
pasting in unformatted text results in no line terminators
Are you only interested in line terminators or making it look like the page?  Are you talking about the script boxes?   What text editor are you using? browser?   I tried copying and pasting what is in 2.CSS and 3.Javascript and all the line breaks are there.
padas - thanks for your assistance.

>> Are you only interested in line terminators or making it look like the page?
I want any plain text that I select (i.e. anything where style is irrelevant such as the code examples on that page) to be pasted as it appears into a text editor (i.e. html is stripped out).  

>> What text editor are you using?
UltraEdit

>> browser?
IE9

>> I tried copying and pasting what is in 2.CSS and 3.Javascript and all the line breaks are there.
Did you paste it into a text editor (e.g. notepad) or into something like word which retains the html?
On the Mac I tried both Chrome and Firefox and pasted to Dreamweaver, Sublime, TextEdit, TextWrangler and Eclipse. On the PC I tried IE9 with Notepad++. I am also pasting here.

<script>
 
$(document).ready(function() {  
 
    //select all the a tag with name equal to modal
    $('a[name=modal]').click(function(e) {
        //Cancel the link behavior
        e.preventDefault();
        //Get the A tag
        var id = $(this).attr('href');
     
        //Get the screen height and width
        var maskHeight = $(document).height();
        var maskWidth = $(window).width();
     
        //Set height and width to mask to fill up the whole screen
        $('#mask').css({'width':maskWidth,'height':maskHeight});
         
        //transition effect     
        $('#mask').fadeIn(1000);    
        $('#mask').fadeTo("slow",0.8);  
     
        //Get the window height and width
        var winH = $(window).height();
        var winW = $(window).width();
               
        //Set the popup window to center
        $(id).css('top',  winH/2-$(id).height()/2);
        $(id).css('left', winW/2-$(id).width()/2);
     
        //transition effect
        $(id).fadeIn(2000); 
     
    });
     
    //if close button is clicked
    $('.window .close').click(function (e) {
        //Cancel the link behavior
        e.preventDefault();
        $('#mask, .window').hide();
    });     
     
    //if mask is clicked
    $('#mask').click(function () {
        $(this).hide();
        $('.window').hide();
    });         
     
});
 
</script>


Hmmm... I installed Notepad++ to recreate your results.  But I got the same results as UltraEdit, and Notepad.  Let me copy and paste below (I selected the text between and including <script> ... </script> from the web page and pasted below):

<script> $(document).ready(function() {       //select all the a tag with name equal to modal    $('a[name=modal]').click(function(e) {        //Cancel the link behavior        e.preventDefault();        //Get the A tag        var id = $(this).attr('href');             //Get the screen height and width        var maskHeight = $(document).height();        var maskWidth = $(window).width();             //Set height and width to mask to fill up the whole screen        $('#mask').css({'width':maskWidth,'height':maskHeight});                 //transition effect             $('#mask').fadeIn(1000);            $('#mask').fadeTo("slow",0.8);               //Get the window height and width        var winH = $(window).height();        var winW = $(window).width();                       //Set the popup window to center        $(id).css('top',  winH/2-$(id).height()/2);        $(id).css('left', winW/2-$(id).width()/2);             //transition effect        $(id).fadeIn(2000);          });         //if close button is clicked    $('.window .close').click(function (e) {        //Cancel the link behavior        e.preventDefault();        $('#mask, .window').hide();    });              //if mask is clicked    $('#mask').click(function () {        $(this).hide();        $('.window').hide();    });              }); </script>



In my case, I find that the text is pasted as 1 line in UltraEdit, Notepad++ and in my comment here at ee.  In Notepad, it pastes as 2 lines (seems Notepad doesn't support lines > 1024 characters and breaks the line into 2).

I wonder why my experience is different than yours with IE9 and Notepad++ / ee forum?

Can anyone else try using IE9 and Notepad or Notepad++?
Are you "viewing source" or copying from the browser?  I am copying from the browser.  Since we are pasting to a text editor we are not pasting any html but the browser is creating the line breaks.
I am copying from the browser.  No line breaks for me when I paste to editor.
I also installed Firefox but am getting the same results.

Installing a clipboard viewer shows that CF_TEXT has no line breaks - I would love to know how you are getting them!
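
For anyone without a clipboard viewer, the same check can be run on the pasted text itself. The following is only a minimal sketch: `countLineTerminators` is a hypothetical helper name, not part of any browser or editor API, and it assumes the pasted text is already available as a string. It counts each kind of line terminator and reports the longest line, which makes a "pasted as 1 line" result easy to confirm.

```javascript
// Hypothetical helper (not a library function): summarise the line
// terminators found in a piece of pasted text.
function countLineTerminators(text) {
  const count = (re) => (text.match(re) || []).length;

  // Windows-style CR LF pairs.
  const crlf = count(/\r\n/g);
  // Bare LF and bare CR are whatever is left once the pairs are subtracted.
  const lf = count(/\n/g) - crlf;
  const cr = count(/\r/g) - crlf;

  // Length of the longest line after splitting on any terminator.
  const longestLine = text
    .split(/\r\n|\r|\n/)
    .reduce((max, line) => Math.max(max, line.length), 0);

  return { crlf, lf, cr, longestLine };
}
```

If all three counts are zero and `longestLine` equals the length of the whole text, the line breaks were lost before the text reached the editor, which points at the browser's copy step rather than the editor.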
ASKER CERTIFIED SOLUTION
Scott Fell
Same results with Firefox (see my last post)
I'm sorry, I am out of suggestions....
Thanks for all the time you spent helping me.  Much appreciated.  

It seems the easiest solution was to install Chrome - it seems Chrome converts <br> to line terminators during the copy operation, whereas IE9 and Firefox (Windows) did not for me.

With Regards-
Sam
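
For readers who cannot switch browsers, the <br>-to-line-terminator conversion described above can be approximated by hand on a copied HTML fragment. This is a rough sketch under stated assumptions, not a description of what Chrome actually does internally: `htmlToPlainText` is a hypothetical helper, it uses regular expressions rather than a real HTML parser, it only handles a handful of common entities, and it assumes the fragment marks each line with <br> or a block-level closing tag.

```javascript
// Hypothetical helper (illustration only): turn an HTML fragment into
// plain text, converting <br> and common block closings to line terminators.
function htmlToPlainText(html, eol = "\r\n") {
  // Small, deliberately incomplete entity table.
  const entities = { lt: "<", gt: ">", quot: '"', "#39": "'", nbsp: " ", amp: "&" };

  return html
    // <br>, <br/>, <BR /> each become one line break.
    .replace(/<br\s*\/?>/gi, "\n")
    // The end of a block-level element also ends a line.
    .replace(/<\/(p|div|li|tr|h[1-6])>/gi, "\n")
    // Drop every remaining tag. Code shown on a page is entity-encoded,
    // so its own "<" and ">" are still safe at this point.
    .replace(/<[^>]+>/g, "")
    // Decode entities in a single pass so "&amp;lt;" becomes "&lt;", not "<".
    .replace(/&(lt|gt|quot|#39|nbsp|amp);/g, (match, name) => entities[name])
    // Emit the requested line terminator (CR LF by default, for Windows editors).
    .replace(/\n/g, eol);
}
```

Calling `htmlToPlainText(fragment, "\n")` produces Unix-style line endings instead. Indentation survives only if the page encoded it as `&nbsp;` or literal spaces within each line.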