I Am Looking For 'Complete' Webpage/Website Capturing Software For Windows 7 64-Bit That Is True To Form To The Original, Including Embedded Images And Active Hyperlinks Too?

Bazingeroo
Hello. I have a question regarding Windows 7 64-bit operating system.

I am looking for webpage/website capturing software with a 'complete' set of features and functions: something that captures a page true to form to the original, including the embedded images, and that preserves active hyperlinks too.

Yes, we can copy all webpage content with our mouse or other input device and paste it into a word-processing program like Microsoft Office Word 2003/2007/2010. However, images and other content can be missing. We can also use screen-capture software, or simply the “PrtScn” (Print Screen) button on the keyboard; but that creates an often identical image of the webpage while losing the active hyperlinks – obviously it is 'an image' of the original webpage.

So what is out there on the market for a software program that achieves full webpage/website capture, producing a capture that looks, acts, and behaves identically in form, features, and functions to the original webpage/website?

I am sure such a program will have an associated monetary cost for such a comprehensive set of functions and features. That I understand. However, please be reasonable and sensible on the price offered. Yes, if you have 'any' suggestions or recommendations 'regardless' of price, please let me know too.  

Please reply.

Thank you!
Dave Baldwin, Fixer of Problems
Most Valuable Expert 2014
Commented:
No, actually it's free.  http://www.httrack.com/page/2/en/index.html  The catch is... just like a web browser, all you can 'capture' is the publicly visible part of a web site.  You can't get any of the server side code without being able to login to that hosting account.
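
For what it's worth, HTTrack can also be driven from the command line once installed, which is handy if you want to script or schedule a capture. A rough sketch of a mirror run is below; the URL, output folder, and filter are placeholders, so adjust them to your own site and paths:

   httrack "http://www.example.com/" -O "C:\My Web Sites\example" "+*.example.com/*" -v

The -O option sets where the local copy is written, the "+" pattern limits the crawl to that domain, and -v gives verbose progress output.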
basicinstinct commented:
Actually, it's even open source as well as free (it's command-line too, which is a big plus for scripting/scheduling etc.):

http://www.gnu.org/software/wget/

As Dave says, it is impossible to capture anything but the publicly visible part of the site. All dynamic server side behavior is inaccessible to you no matter what you use.
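
To make the suggestion concrete: a typical wget invocation for mirroring a site for offline viewing looks roughly like the sketch below (the URL is a placeholder; see the wget manual for the full option list):

   wget --mirror --convert-links --page-requisites --no-parent http://www.example.com/

Here --mirror turns on recursive downloading, --page-requisites pulls in the embedded images and stylesheets each page needs, --convert-links rewrites the hyperlinks so they keep working in the local copy, and --no-parent stops the crawl from wandering above the starting directory.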
jcimarron, Top Expert 2013
Commented:
Bazingeroo--Try File|Save As when on the webpage and choose "Web Archive, Single File (*.mht)" from the drop down "Save as type" menu.

Dave Baldwin, Fixer of Problems
Most Valuable Expert 2014

Commented:
"Web Archive, Single File (*.mht)" is only available in IE.
BillDL commented:
..... and also the Opera browser:
Save As > "Web Archive Single File"
You have to add the *.mht extension to the file name offered, though, or else it saves as a file with no extension.

Author

Commented:
@ DaveBaldwin, basicinstinct, jcimarron, & BillDL:

Hello. Nice to see you DaveBaldwin, basicinstinct, and BillDL. Nice to meet you jcimarron.

Thank you all for your comments.
 
A few questions to ask of you all before I conclude this question/thread. I will keep it simple:

>basicinstinct:

I see from the weblink in your recommendation that, from what I have briefly reviewed and read on the website, the Wget downloads are for GNU/Linux operating systems and not for Windows-based operating systems. I assume you have probably been so attuned to my asking questions pertaining to GNU/Linux that this one took you off guard? If it did, that is okay. No problem. I can use it on my Ubuntu 11.10 64-bit Linux operating system. Thanks for that one. If you know of a Windows platform download weblink as well, please reply with it. Thank you!

> jcimarron, DaveBaldwin, & BillDL (I will accept a response from any one of you to conclude this part of the question.):

Hello. I see you are using the simplest, quickest, and most capable capture method, with embedded images and weblinks, built right into the Microsoft web browser (Internet Explorer) and Opera via the *.mht file format ("Web archive, single file" or "Web archive (single file)"). It is accurate, precise, and complete in what it stores while capturing. I cannot recall the other popular web browsers having a webpage/website capture capability this accurate and precise. Maybe they do and I just do not know about it? A plus for Internet Explorer and Opera.

However, I have to ask: is there a way or means to capture one of the most important components of webpage/website capturing for reference purposes – the URL? The URL shown for an *.mht file is the full file path of where the *.mht file is saved. Is there some means of capturing the specific webpage/website URL as part of the capturing process of the *.mht file too? Please explain if there is a way or means to accomplish this... ...or is it in there and maybe I just do not see it?

Please reply.

Thank you!
You should be able to find wget for Windows too; have a look here: http://gnuwin32.sourceforge.net/packages/wget.htm
jcimarron, Top Expert 2013

Commented:
Bazingeroo--You can find the original URL of the .mht file when you right-click on a blank spot and choose View Source. However, it might be best to copy the URL when you create the .mht file and add it to the file name of the .mht file.
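
To illustrate why View Source works: an .mht file is just a MIME (MHTML) container, and the part that holds the page normally carries a Content-Location header with the original address. Opened in a text editor, the relevant section looks roughly like the sketch below (the values shown are illustrative only and vary by browser and version):

   MIME-Version: 1.0
   Content-Type: multipart/related; type="text/html"; boundary="----=_NextPart_000_0000"

   ------=_NextPart_000_0000
   Content-Type: text/html; charset="utf-8"
   Content-Location: http://www.example.com/page.html

That Content-Location line holds the URL the page was captured from.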

Author

Commented:
@ DaveBaldwin, basicinstinct, jcimarron, & BillDL:

Hello again.

Thank you for your latest comments since my last posted comment.

Okay, where do I begin? I have been seriously investigating all your answers.

Each one of your expert comments is an answer and solution to my issue. Therefore each one of you will be awarded points. However, there are significant differences among the answers/solutions provided. Some are easier and faster to use than others to achieve the 'varied' desired results. Other answers/solutions capture the images and components of the webpage either in pieces or as complete pages, like "mirroring", and can even update your captured downloads since your last website/webpage capture. One even tests your weblinks while in the process of capturing. There are two factors I deem significant enough to turn into measurable criteria for evaluating this question/thread:

1. Ease of use?

2. Level of completeness?

From the premise of my OP, or initial post, I was obviously looking for number 2, as you can tell. However, I never considered number 1 in my assessment of my initial question, but I will now add it. Since I never addressed 'ease of use' as a desired factor in my initial post, the points awarded on that measure will be lessened. Basically, weighting both factors equally would not be fair to all of you. So, in evaluating points with my internal points system for this question/thread, there is a 10-point spread for 'level of completeness' (the greater weight, based on what I just stated) and a 5-point spread for 'ease of use' (the lesser weight).

So, evaluating 'level of completeness' (1-10, 10 best) in contrast with 'ease of use' (1-5, 5 best), let's go through each expert's comments, looking only at those where the answers/solutions were posted (your further explanations that follow my second comment are factored in, but not directly evaluated):

Number 1: DaveBaldwin's HTTrack Website Copier – a mega-loaded program; what can it not do! I will spare you the long laundry list of details here, but I did address a few above already. However, you have to set it up in order to run it, and if you have a large website with many components, it can take a while depending on the settings you preselect. Level of completeness: 10 Ease of use: 3 Total: 13/15

Number 2: basicinstinct's GNU Wget program – very impressive, as long as you know the vast array of command-line options to use in cmd.exe. Yes, I found the PDF document that does a great job explaining the commands one can use, but putting them together as a whole is a bit of a steep learning curve. It can do much of what HTTrack Website Copier can do, but as simple functions, and it does not do as great a job with grouped functions. It runs in around the same completion time as HTTrack Website Copier, maybe a little faster at times. Level of completeness: 8 Ease of use: 1 Total: 9/15

Number 3: jcimarron's Internet Explorer "Save as" to a *.mht file (with clarification from DaveBaldwin's comment, which I believe jcimarron knew but just never stated). Now, this one and the expert answer I will address next hit 'ease of use' with a perfect 5, undoubtedly. The 'level of completeness' is very good too. I like how it keeps the webpage as a whole with the weblinks still active. Unlike the other two programs I have addressed, which can save and store the images separately if desired depending on the settings you select, here they are embedded with the capture of the webpage/website as a whole. Again, not a major concern. Also, during my testing, I found some instances where the image captured with this method delivered a black framed area where the image was on the actual webpage/website. I will say 'level of completeness' is a 7. Total: 12/15

Number 4: BillDL's answer, similar to jcimarron's, using the Opera web browser and "Save as" to an *.mht file, tested out with the same results in every measure and the same issues as jcimarron's answer/solution. Yes, I have to type four extra characters for the file name extension (*.mht), unlike in Internet Explorer, but who cares! That is trivial to do. 'Ease of use' is definitely: 5 'Level of completeness' is again: 7 Total: 12/15

So equating the point totals here into EE awarded points is:

Number 1: DaveBaldwin's HTTrack Website Copier: 142 total points/500 points (yes, I had 1 extra point left over as a remainder from the necessary rounding, and I thought it would be easiest to just add that 1 point to the actual total of 141 points, since DaveBaldwin IS the Accepted Solution anyway.)

Number 2: basicinstinct's GNU WGet program: 98 total points/500 points  

Number 3: jcimarron's Internet Explorer using "Save as" a *.mht file: 130 total points/500 points

Number 4: BillDL's Opera using "Save as" a *.mht file: 130 total points/500 points

...and (as I previously indicated above) the one that wins the MOST points is the Accepted Solution – that is, DaveBaldwin's answer/solution is the best. This is the fairest method I could develop for a question/thread of this type.

Thank you for your help!!!
Dave Baldwin, Fixer of Problems
Most Valuable Expert 2014

Commented:
HTTrack will record the URL that a file is copied from in that file as a comment.
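
In other words, if you open a page that HTTrack saved in a text editor, you should find a comment near the top of the HTML roughly of this form (the exact wording varies by HTTrack version; the URL and date here are placeholders):

   <!-- Mirrored from www.example.com/page.html by HTTrack Website Copier/3.x [XR&CO'2010], Sun, 01 Jan 2012 00:00:00 GMT -->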
jcimarron, Top Expert 2013

Commented:
Bazingeroo--Thanks for the comments.  Glad you found what you want.
