Saving a webpage to your HD then viewing it directly (ie without the internet)

Posted on 2016-09-22
Last Modified: 2016-10-01

I generally use Google Chrome or Mozilla Firefox for accessing the internet. With each of these browsers, you can right-click anywhere on the page and bring up the option to "Save as..." or "Save Page as...". Then after selecting a folder and pressing {Enter}, the browser shows that a download has occurred from that site.

1st Question
When the above steps are taken, what file(s) actually get saved to your HD? Is it:
    a) simply a link to the page's URL or
    b) some portion of the page or
    c) the entire page with all of its content?

2nd Question
Once the file has been saved to your HD folder, what exactly happens when you double-click to open it? Does it
    a) go to the internet and bring up the site in the same way it did in the first place (ie before the "Save as..." process) or
    b) open the page directly from your HD without any internet involvement or
    c) utilize a combination of the file you saved locally and content from the actual website?

3rd Question
Suppose a particular webpage has a very slow load time due to either slowness in the site itself or the amount of traffic accessing the site. Is there a way to save the webpage to your HD so that you can quickly open and view the content without going to the site — even though your system is connected to the internet?

Question by:WeThotUWasAToad

Assisted Solution

by:Dave Baldwin
Dave Baldwin earned 300 total points
It depends.  While Firefox will save the complete current page, you may still have content that accesses the internet.  Double clicking on a simple page will load it directly from the file you saved.  Pages that include things like AJAX routines that load content after the page is loaded will still try to access that content over the internet.

Saving a slow page will not necessarily speed it up if the slowdown is caused by database access, which still requires the internet.  In fact, the page may not load at all from a local file if it requires script files that exist only on the server and were not saved locally.

So... it depends.  Try saving it to your computer and then see if it works when you try to open it locally.
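
One way to "see if it works" before opening the file is to list what the saved HTML still points at on the network. The following is only a rough sketch in Python, not part of the original answer: the function name, the tag list, and the sample markup are illustrative assumptions. It reports `img`, `script`, `link`, and `iframe` references that carry an `http`/`https` scheme or a protocol-relative `//host` prefix.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tags and attributes that usually name resources the page needs to render.
# This list is an assumption for the sketch, not an exhaustive inventory.
RESOURCE_TAGS = {"img", "script", "link", "iframe"}
RESOURCE_ATTRS = {"src", "href"}


class RemoteReferenceFinder(HTMLParser):
    """Collects (tag, url) pairs whose URL points at a remote host."""

    def __init__(self):
        super().__init__()
        self.remote = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names in lowercase.
        if tag not in RESOURCE_TAGS:
            return
        for name, value in attrs:
            if name in RESOURCE_ATTRS and value:
                parsed = urlparse(value)
                has_web_scheme = parsed.scheme in ("http", "https")
                protocol_relative = not parsed.scheme and bool(parsed.netloc)
                if has_web_scheme or protocol_relative:
                    self.remote.append((tag, value))


def find_remote_references(html_text):
    """Return the resource references that would still need the internet."""
    finder = RemoteReferenceFinder()
    finder.feed(html_text)
    finder.close()
    return finder.remote


# Made-up stand-in for a page saved with "Save Page as...".
sample = """
<html><head>
<script src="https://cdn.example.com/app.js"></script>
<link rel="stylesheet" href="page_files/style.css">
</head><body>
<img src="page_files/logo.png">
<img src="//img.example.com/banner.png">
<a href="https://example.com/other">an ordinary link</a>
</body></html>
"""

for tag, url in find_remote_references(sample):
    print(tag, url)
```

This only catches references written into the markup. Content that scripts fetch after the page loads (the AJAX case above) will not show up, so an empty result does not guarantee the page works offline.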

Author Comment

Thanks for the reply Dave.

"Try saving it to your computer and then see if it works when you try to open it locally."
I tried that prior to my OP and you are right: it was actually slower than going to the site itself.

More details:

I'm playing World of Warcraft and Blizzard just launched their newest expansion (Legion) less than a month ago. I have not been subscribed/playing during the launch of a new expansion before but apparently it generates a lot of new subscriptions and re-subscriptions.

The most popular (independent) WoW fansite is which is where I've traditionally gone to research or get answers to questions. Some of their pages are unbearably slow right now and I assume that it's due to all the extra traffic. Am I correct that an overabundance of traffic will slow down a website?

As a result, I was hoping that I could download and save some of the main resource pages I use and then be able to simply access them directly from my HD after that.

Is there another method or workaround that could accomplish that?

Of course their pages have all sorts of ads and links but I'm mainly interested in the content. Having the links found in the main article would be helpful also if that's possible.

In the past, I've tried selecting the main article and doing a simple copy/paste but that almost always trashes the formatting and using Paste Value or Paste Text Only is not much better.

Any thoughts?

Thanks again.

Assisted Solution

by:Dr. Klahn
Dr. Klahn earned 100 total points
Dave is 100% correct.  Whether a page can be saved, and still be readable, is unpredictable without actually trying it.

One way to make a saved page useless offline is to take the FQDN out of URLs in the page source.  Consider:

Now insert the four left-handed lag screws according to the image below:<BR>
<IMG SRC="/images/assembly1.gif" ALT="Your downloaded page is futile">


The IMG tag lacks the FQDN preceding the path to the image.  As such, the only place it will work is on the original web site.  This is not necessarily done with the intent to make the page useless offline; from the web site manager's point of view, internal URLs without an FQDN are much easier to re-root at another folder, or even on another server.

The same applies for script, Perl and other "active" URLs.  Except more so.
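
To make the FQDN point concrete, here is a small Python sketch of how a path like the one in the IMG tag above gets resolved. The page addresses are hypothetical (the example names no host), and the file path stands in for wherever the page was saved. `urllib.parse.urljoin` applies the standard URL resolution rules, which is roughly what a browser does with the reference.

```python
from urllib.parse import urljoin

# The reference exactly as it appears in the IMG tag.
reference = "/images/assembly1.gif"

# Hypothetical addresses: where the page lived, and where it was saved.
online_page = "http://www.example.com/manuals/assembly.html"
saved_page = "file:///home/me/saved/assembly.html"

# On the web site, the missing host is filled in from the page's own URL.
print(urljoin(online_page, reference))
# http://www.example.com/images/assembly1.gif

# Opened from disk, the same reference resolves against the local file
# system root, where no such image exists.
print(urljoin(saved_page, reference))
# file:///images/assembly1.gif
```

Rewriting each such reference to the first form before saving is one way to keep images loading, although the page then depends on the internet again.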

There is so much script loading from other sites going on now on most web pages that I'm generally surprised when a saved page does anything useful at all.

Assisted Solution

rindi earned 100 total points
One thing that can help speed up access to web content you have visited before is to use a server at home which has a proxy server installed. Proxy servers can be configured as caching proxies, so if you visit a website, its contents get cached on your server. If you visit that site again later, or from another PC on your LAN, it loads faster because the content is loaded from the proxy's cache first.


Author Comment

Thanks for the responses.
"...use a server at home which has a proxy server installed."
Is that a hard thing to do? I mean, doesn't it require additional hardware, software, and know-how?

Accepted Solution

WeThotUWasAToad earned 0 total points
FYI I just revisited the poor man's technique I mentioned earlier which is to simply select just the article itself and paste it into a Word document.

I haven't tried that for several years — certainly before Word 2010 which is my current version of choice — but the results are surprisingly good (ie definitely good enough for just quickly seeing a particular page). And the links are preserved.

For my own reference (and in case anyone else is interested), the page I used is here:

And following are the steps:

1) selected from the first word "Legion" in the title down to the paragraph ending with "Hidden Potential."

2) did a simple Ctrl+c followed by Ctrl+v in a new Word doc.

3) determined the webpage color (using a handy freeware utility called Meazure) and found it to be |036 036 036|

4) in Word, went to Page Layout > Page Color and specified the same RGB settings.

The only remaining issue was that, although the majority of links retained their yellow/gold color, the links adjacent to small image icons were completely blue |000 000 255| and difficult to read against the dark background. I tried modifying the Hyperlink Style but that did nothing.

But this did the trick:

5) using the Find and Replace box, I formatted the "Find what" field to |000 000 255| and the "Replace with" field to |000 112 221| (the color of those links on the website). Then clicking "Replace All" fixed the issue.

Oh, I also:

6) entered a double line feed wherever text butted up against an image.

And Voila! I had a perfect replica of the webpage with all needed functionality included.

Again, that's quite a rudimentary approach but hey, it's great for what I'm after.

Expert Comment

Yes. For a proxy you need another PC: install a Linux distro on it, then install the Squid proxy and configure it to act as a cache. After that, add the proxy server's address to the proxy settings of your PCs' web browsers.
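
The last step is normally done in each browser's settings dialog. Purely as an illustration of what "pointing a client at the proxy" means, here is a Python sketch that sends requests through such a proxy. The LAN address is hypothetical and the helper name is made up for this example. Port 3128 is the one Squid listens on by default.

```python
import urllib.request


def build_proxy_opener(proxy_url):
    """Return an opener that routes http and https requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical address of the home server running Squid.
opener = build_proxy_opener("http://192.168.1.10:3128")

# With the proxy running, a fetch would go through its cache, e.g.:
# html = opener.open("http://www.example.com/").read()
```

Whether a repeat visit is actually served from the cache depends on how the proxy is configured and on the caching headers the site sends.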

Author Closing Comment

Thanks for the feedback.

I flagged my own solution so it can be quickly identified in the future.
