Solved

Download Website from Google Cache

Posted on 2008-06-11
Last Modified: 2013-12-09
I recently suffered a SQL injection attack on one of my websites, which meant a large proportion of my MS-SQL tables were overwritten with malware scripts.
I searched Google and found that about 480 pages are cached.
Is there a quick way of downloading all these cached pages in one go, using software?
I don't fancy having to click on each cached link and go File > Save in my browser.
Question by:fgict
9 Comments
 
LVL 10

Expert Comment

by:bluefezteam
ID: 21760084
Bluesquirrel WebWhacker can be unleashed on a website to save it and its assets. Have you tried pointing it at the cache?

Alternatively, use it on the Wayback Machine here:
http://www.archive.org/index.php

Your site may be logged in there. My site's in there, so I can actually see what it looked like after 3 redesigns; cool stuff.
 
LVL 29

Expert Comment

by:fibo
ID: 21766663
Have you tried contacting Google?
 
LVL 1

Author Comment

by:fgict
ID: 21767222
Hi fibo

No I have not tried contacting Google.
How would I go about that? What do you think they could provide me with?
 
LVL 10

Accepted Solution

by:
bluefezteam earned 500 total points
ID: 21767243
Although Google are a very friendly company, I doubt they are going to act on this any time soon for you, if at all. They will say to just view and save from their cache (which you already know about).

If you know where the pages are in their cache, you can use the automated approach that I mentioned in the first comment. The longer you wait (especially for a reply and action from Google), the greater the chance your site will drop out of their cache the next time they visit and be gone forever...

Alternatively, the Wayback Machine stores copies of people's websites. You may be in luck and they have a copy; you are more likely to get a reply and action from them than from the Googleplex.
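
To check quickly whether the Wayback Machine holds a copy of a given page, something like the sketch below might help. It assumes the Internet Archive's availability endpoint and the JSON shape it returns as I understand them; the page address `www.example.com/news.asp` and the function names are made up for illustration, and the network call itself is left out so the code only builds the query and reads a response.

```python
import json
from urllib.parse import urlencode

# Assumption: the Internet Archive availability endpoint lives at this address.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the lookup URL for one page; timestamp is an optional YYYYMMDD hint."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the snapshot URL from a JSON response, or None if nothing is archived."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

Fetch the address returned by `availability_query(...)` with whatever HTTP client you prefer, pass the body to `closest_snapshot(...)`, and save the page it points to if the result is not `None`.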

On something like this however, time is of the essence so at least try the suggestions I made while waiting for feedback from Google, because as the old saying goes... once it's gone, it's gone!

Good luck
 
LVL 1

Author Comment

by:fgict
ID: 21767513
Hi bluefezteam

I have reviewed your answer re: downloading from Google using website downloader software, but the big problem I have is:
When I point it to a cached page e.g. http://64.233.183.104/search?q=cache:xxxxx
The links within this page are not cached links but the website links so it will spider all the dynamic pages I have on the site e.g. news.asp and this will return the corrupted content.

I don't know how else to download the cached pages without doing it manually one by one.
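
One possible way around the link problem described above, offered as a rough sketch rather than a tested tool: instead of letting a downloader follow the links, pull the site's own links out of each cached page and turn each one into a cache request. The assumption here is that the cache accepts the page address after `cache:` in the same URL form quoted above; the host `www.example.com` and the helper names are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urljoin, urlparse

# Assumption: the cache accepts "cache:<page address>" as the query,
# in the same URL form quoted in the post above.
CACHE_PREFIX = "http://64.233.183.104/search?q=cache:"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def cache_urls_for_links(html, page_url, site_host):
    """Return one cache URL per distinct link on the page that points at site_host."""
    collector = LinkCollector()
    collector.feed(html)
    result = []
    for href in collector.links:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlparse(absolute).netloc != site_host:
            continue  # skip links to other sites
        cache_url = CACHE_PREFIX + quote(absolute, safe="")
        if cache_url not in result:
            result.append(cache_url)
    return result
```

Each returned address asks for the cached copy of a page rather than the live (corrupted) one, so a downloader can be fed this list directly instead of being left to spider news.asp and friends.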
 
LVL 10

Expert Comment

by:bluefezteam
ID: 21767803
Hmm, try the Wayback Machine on archive.org; you may have some luck there.
Is there a way of listing all the pages that you need to save?

For example, do you still have the sitemap structure as a Google sitemap (XML doc)? If you do, then maybe it's possible to apply that structure to Google's cache and create a routine to strip all content from the page links defined in the sitemap.

You should be able to strip out the text from the pages automatically using some form of PHP/.NET routine applied to a recursive loop controlled by the sitemap.
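
As a minimal sketch of that sitemap-driven loop, written in Python rather than PHP/.NET purely for illustration: it assumes the sitemap uses the sitemaps.org 0.9 namespace (an older Google sitemap may declare a different one) and that the cache accepts the page address after `cache:`. The `fetch` and `save` callables are hypothetical placeholders you would supply yourself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Assumption: the sitemap declares the sitemaps.org 0.9 namespace.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Assumption: the cache accepts "cache:<page address>" as the query.
CACHE_PREFIX = "http://64.233.183.104/search?q=cache:"


def sitemap_page_urls(sitemap_xml):
    """Return every <loc> address listed in the sitemap, in document order."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]


def save_cached_pages(sitemap_xml, fetch, save):
    """Request the cached copy of each sitemap page and pass it to save().

    fetch(cache_url) should return the page content, or None on failure;
    save(page_url, content) stores it. Returns the number of pages saved.
    """
    saved = 0
    for page_url in sitemap_page_urls(sitemap_xml):
        cache_url = CACHE_PREFIX + quote(page_url, safe="")
        content = fetch(cache_url)
        if content is not None:
            save(page_url, content)
            saved += 1
    return saved
```

In practice `fetch` could wrap `urllib.request.urlopen` with a pause between requests, and `save` could write each page to a file named after its original address.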

 
LVL 29

Expert Comment

by:fibo
ID: 21893296
fgict,
what is the situation now?