Bazingeroo

asked on

I Am Looking For A 'Complete' Webpage/Website Capturing Software For Windows 7 64-Bit As True To Form To The Original Including Embedded Images Represented And Provide Active Hyperlinks Too?

Hello. I have a question regarding Windows 7 64-bit operating system.

I am looking for webpage/website capturing software with a 'complete' set of features and functions: a capture as true to form to the original as possible, including embedded images and active hyperlinks.

Yes, we can copy all webpage content with our mouse or other input device and paste it into a word processing program like Microsoft Office Word 2003/2007/2010. However, images and other content can be missing. We can use screen capture software or simply the "PrtScn" (Print Screen) button on our keyboard too; that creates an often identical screen image of the webpage, but we lack the active hyperlinks, since it is obviously only 'an image' of the original webpage.

So what is out there on the market for a software program that achieves full webpage/website capture, producing a capture that looks, acts, and behaves like the original webpage/website in form, features, and functions?

I am sure such a program will have an associated monetary cost for such a comprehensive set of functions and features. That I understand. However, please be reasonable and sensible on the price offered. Yes, if you have 'any' suggestions or recommendations 'regardless' of price, please let me know too.  

Please reply.

Thank you!
ASKER CERTIFIED SOLUTION
Dave Baldwin

This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
SOLUTION
"Web Archive, Single File (*.mht)" is only available in IE.
SOLUTION

ASKER

@ DaveBaldwin, basicinstinct, jcimarron, & BillDL:

Hello. Nice to see you DaveBaldwin, basicinstinct, and BillDL. Nice to meet you jcimarron.

Thank you all for your comments.
 
I have a few questions to ask you all before I conclude this question/thread. I will keep it simple:

>basicinstinct:

From what I have briefly read on the website you linked, the Wget downloads are for Linux/GNU operating systems and not Windows-based operating systems. I assume you have probably been so attuned to my asking questions pertaining to Linux/GNU that this one took you off guard? If it did, that is okay. No problem. I can use it on my Linux Ubuntu v.11.10 64-bit operating system. Thanks for that one. If you know of a download weblink for the Windows platform as well, please reply with it. Thank you!

> jcimarron, DaveBaldwin, & BillDL (I will accept a response from any one of you to conclude this part of the question.):

Hello. I see you are using the simplest, quickest, and most capable capture method, with embedded images and weblinks, built right into the Microsoft web browser and Opera: the *.mht file format ("Web archive, single file" or "Web archive (single file)"). It is accurate, precise, and complete in what it stores while capturing. I cannot recall the other popular web browsers having a webpage/website capture capability this accurate and precise. Maybe they do and I just do not know it? A plus for Internet Explorer and Opera.

However, I have to ask, is there a way or means to capture one of the most important components of webpage/website capturing for reference resourcing -- the URL??? The URL shown for an *.mht file is the full file path of the directory where the *.mht file is saved. Is there some means of capturing the specific URL of the webpage/website as part of the capturing process of the *.mht file too? Please explain if there is a way or means to accomplish this too... ...or is it in there and maybe I just do not see it?

Please reply.

Thank you!
You should be able to find wget for windows too, have a look here: http://gnuwin32.sourceforge.net/packages/wget.htm
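For readers who find the command-line options daunting, here is a minimal sketch of how a typical Wget mirroring command could be assembled. The Python wrapper, the function name build_wget_mirror_command, the output folder name, and the example URL are all assumptions for illustration; the Wget options themselves (--mirror, --convert-links, --page-requisites, --no-parent, --directory-prefix) are standard ones, but check them against the documentation for your installed version.

```python
import shlex


def build_wget_mirror_command(url, dest_dir="site-copy"):
    """Return a wget argument list that copies `url` for offline viewing.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so the local copy is browsable
        "--page-requisites",   # also fetch images, CSS, and similar assets
        "--no-parent",         # do not climb above the starting directory
        "--directory-prefix", dest_dir,  # folder that receives the copy
        url,
    ]


if __name__ == "__main__":
    # Placeholder URL; substitute the site you want to capture.
    command = build_wget_mirror_command("http://www.example.com/")
    print(shlex.join(command))
```

The printed line can be pasted into a terminal (or cmd.exe on Windows, if wget is on the PATH), or the list can be passed to subprocess.run to launch the download from Python.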
Bazingeroo--You can find the original URL of the .mht file when you right-click on a blank spot and choose View Source. However, it might be best to copy the URL when you create the .mht file and add it to the file name of the .mht file.
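The URL that shows up in View Source can also be pulled out with a short script. The sketch below assumes the .mht file follows the usual MHTML layout, in which each stored part carries a Content-Location header holding its original address; the function name original_url_from_mht and the file name in the usage note are hypothetical.

```python
import email
from email import policy


def original_url_from_mht(raw_bytes):
    """Return the first Content-Location header found in an MHTML archive.

    Assumption: the main page is the first part that carries this header,
    which is the common layout but is not guaranteed for every browser.
    Returns None if no part has a Content-Location header.
    """
    message = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in message.walk():
        location = part.get("Content-Location")
        if location:
            return str(location).strip()
    return None
```

A typical use would be to read the saved file in binary mode, for example open("saved-page.mht", "rb").read(), and pass the bytes to the function.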
@ DaveBaldwin, basicinstinct, jcimarron, & BillDL:

Hello again.

Thank you for your latest comments since my last posted comment.

Okay, where do I begin? I have been seriously investigating all your answers.

Each one of your expert comments is an answer and solution to my issues. Therefore each one of you will be awarded points. However, there are significant differences between the answers/solutions provided. Some are easier and faster to use than others to achieve the 'varied' desired results. Others capture images and components of the webpage in pieces or as complete pages, like "mirroring", and can even update your captured downloads since your last website/webpage capture. One even tests your weblinks while in the process of capturing. I have boiled your answers down to two measurable criteria that I deem significant for evaluating this question/thread:

1. Ease of use?

2. Level of completeness?

From the premise of my OP (initial post), I was obviously looking for number 2. I never considered number 1 when I asked my initial question, but I will add it now. However, since I never named 'ease of use' as a desired factor in my initial post, it will carry less weight in awarding points; weighting both factors equally would not be fair to all of you. So in my internal points system for this question/thread, 'level of completeness' gets a 10-point spread (the greater weight) and 'ease of use' gets a 5-point spread (the lesser weight).

So evaluating 'level of completeness' (1-10, 10 best) alongside 'ease of use' (1-5, 5 best), let's go through each expert's comments where the answers/solutions were posted only (your further explanations that follow my second comment are factored in, but not directly evaluated):

Number 1: DaveBaldwin's HTTrack Website Copier - mega loaded program -- what can't it do! I will spare you the long laundry list of details here, but I did address a few above already. However, you have to set it up in order to run it, and if you have a large website with many components, it can take a while depending on the settings you preselect. Level of completeness: 10 Ease of use: 3 Total: 13/15

Number 2: basicinstinct's GNU WGet program -- very impressive as long as you know the vast array of command-line options in cmd.exe. Yes, I found the PDF document that does a great job explaining the commands one can use, but putting them together as a whole is a bit of a steep learning curve. It can do much of what HTTrack Website Copier can do, but as simple functions, and it does not do as great a job with grouped functions. It runs in around the same completion time as HTTrack Website Copier, maybe a little faster at times. Level of completeness: 8 Ease of use: 1 Total: 9/15

Number 3: jcimarron's (with clarification from DaveBaldwin's comment, which I believe jcimarron knew but never stated) Internet Explorer using "Save as" a *.mht file. Now, this one and the expert answer I will address next hit 'ease of use' with a perfect 5, undoubtedly. The 'level of completeness' is very good too. I like how it keeps the webpage as a whole and the weblinks active. Unlike the other two programs I have addressed, which can save and store the images separately if desired, depending on the settings you select, here the images are embedded in the capture of the webpage/website as a whole. Again, not a major concern. Also, during my testing, I found some instances where this method delivered a black framed area where the image was on the actual webpage/website. I will say, 'level of completeness' is a 7. Total: 12/15

Number 4: BillDL's answer, similar to jcimarron's but using the Opera web browser and "Save as" an *.mht file, tested out with the same results in every measure as jcimarron's answer/solution, with the same issues. Yes, I have to type four characters for the file name extension (.mht), unlike Internet Explorer, but who cares! This is trivial to do. 'Ease of use' is definitely: 5 Level of completeness is again: 7 Total: 12/15

So converting the point totals here into EE awarded points gives:

Number 1: DaveBaldwin's HTTrack Website Copier: 142 total points/500 points (yes, I had 1 extra point left as a remainder from the necessary rounding, and I thought it would be easier to just add the 1 point to the actual total of 141 points since DaveBaldwin IS the Accepted Solution anyway.)

Number 2: basicinstinct's GNU WGet program: 98 total points/500 points  

Number 3: jcimarron's Internet Explorer using "Save as" a *.mht file: 130 total points/500 points

Number 4: BillDL's Opera using "Save as" a *.mht file: 130 total points/500 points

...and (as I previously indicated above) the one that wins the MOST points is the Accepted Solution -- that is, DaveBaldwin's answer/solution is the best. This is the fairest method I could develop for a question/thread of this type.
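The conversion above appears to be a proportional split of the 500-point pool by each expert's 15-point total, with the 1-point rounding remainder added to the Accepted Solution. A minimal sketch of that arithmetic follows; the function name allocate_points is hypothetical, and the rounding rule is an assumption inferred from the stated totals.

```python
def allocate_points(scores, pool, accepted):
    """Split `pool` in proportion to `scores`, rounding each share.

    Any remainder left over from rounding is added to the `accepted` entry.
    """
    total = sum(scores.values())
    shares = {name: round(score * pool / total) for name, score in scores.items()}
    shares[accepted] += pool - sum(shares.values())
    return shares


# Totals out of 15 taken from the evaluation in the post.
scores = {"DaveBaldwin": 13, "basicinstinct": 9, "jcimarron": 12, "BillDL": 12}
points = allocate_points(scores, pool=500, accepted="DaveBaldwin")
print(points)
```

With the four totals from the post (13, 9, 12, 12), this reproduces the 142, 98, 130, and 130 point awards listed above.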

Thank you for your help!!!
HTTrack will record the URL that a file is copied from in that file as a comment.
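As a rough illustration, that comment could be read back out of a saved page with a short script. This sketch assumes the comment has the form "<!-- Mirrored from ... by HTTrack Website Copier ... -->"; the exact wording may differ between HTTrack versions, and the function name httrack_source_url is hypothetical.

```python
import re

# Assumed comment layout; adjust the pattern if your HTTrack version differs.
_MIRRORED_FROM = re.compile(
    r"<!--\s*Mirrored from\s+(\S+)\s+by HTTrack Website Copier",
    re.IGNORECASE,
)


def httrack_source_url(html_text):
    """Return the address recorded in an HTTrack comment, or None if absent."""
    match = _MIRRORED_FROM.search(html_text)
    return match.group(1) if match else None
```

Note that the recorded address may not include the "http://" prefix, so it may need to be added back before the URL is used as a reference.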
Bazingeroo--Thanks for the comments.  Glad you found what you want.