Solved

Download Website from Google Cache

Posted on 2008-06-11
Last Modified: 2013-12-09
I recently suffered a SQL injection attack on one of my websites, which meant a large proportion of my MS-SQL tables were overwritten with malware scripts.
I searched Google and found that c. 480 pages are cached.
Is there a quick way of downloading all these cached pages in one go, using software?
I don't fancy having to click on each cached link and go File > Save in my browser.
Question by:fgict
 
LVL 10

Expert Comment

by:bluefezteam
ID: 21760084
Bluesquirrel WebWhacker can be unleashed on a website to save it and its assets. Try pointing it at the cache?

Alternatively, use it on the Wayback Machine here:
http://www.archive.org/index.php

Your site may be logged in there - my site's in there, so I can actually see what it looked like across 3 redesigns; cool stuff.
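
A quick way to check whether archive.org holds a copy of a page, without browsing for it, might look like the sketch below. It assumes the Wayback Machine's availability endpoint and JSON response shape as I understand them from the Internet Archive's documentation; the function names are made up for illustration, and the sketch only builds the query and reads a response, it does not fetch anything.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the lookup URL for one page.

    timestamp is optional, in YYYYMMDD form, and asks for the snapshot
    closest to that date (e.g. a date before the attack).
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived snapshot URL from a JSON response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

Fetch the URL returned by availability_query() with any HTTP client and pass the body to closest_snapshot(); a None result means no copy was reported for that page.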
 
LVL 29

Expert Comment

by:fibo
ID: 21766663
Have you tried contacting Google?
 
LVL 1

Author Comment

by:fgict
ID: 21767222
Hi fibo

No I have not tried contacting Google.
How would I go about that? What do you think they could provide me with?
 
LVL 10

Accepted Solution

by:
bluefezteam earned 500 total points
ID: 21767243
Although Google are a very friendly company, I doubt they are going to act on this any time soon for you, if at all. They will probably say to just view and save from their cache (which you already know about).

If you know where the pages are in their cache, you can use the automated approach that I mentioned in the first comment. The longer you wait (especially for a reply and action from Google), the greater the chance that your site will go from their cache next time they visit and be gone forever...

Alternatively, the Wayback Machine stores copies of people's websites. You may be in luck and they have a copy - you are more likely to get a reply and action from them than from the Googleplex.

On something like this, however, time is of the essence, so at least try the suggestions I made while waiting for feedback from Google, because as the old saying goes... once it's gone, it's gone!

Good luck
 
LVL 1

Author Comment

by:fgict
ID: 21767513
Hi bluefezteam

I have reviewed your answer re: downloading from Google using website downloader software, but the big problem I have is this:
When I point it to a cached page, e.g. http://64.233.183.104/search?q=cache:xxxxx
the links within this page are not cached links but links to the live website. So it will spider all the dynamic pages I have on the site, e.g. news.asp, and this will return the corrupted content.

I don't know how else to download the cached pages without doing it manually one by one.
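
One way around the live-link problem described above might be to not follow the links at all, but to read them out of each cached page and turn each one into a cache URL of its own. The sketch below is only an illustration: it assumes the cache URL pattern quoted in this post (the CACHE_PREFIX value), and the class and function names are made up.

```python
from html.parser import HTMLParser
from urllib.parse import quote, urljoin, urlparse

# Assumption: the cache URL pattern quoted in this thread.
CACHE_PREFIX = "http://64.233.183.104/search?q=cache:"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def cached_links(cached_html, site_root):
    """Map each same-site link in a cached page to its cache URL."""
    parser = LinkCollector()
    parser.feed(cached_html)
    site_host = urlparse(site_root).netloc
    result = []
    for href in parser.hrefs:
        absolute = urljoin(site_root, href)  # resolve relative links
        if urlparse(absolute).netloc != site_host:
            continue  # skip links that leave the site
        url = CACHE_PREFIX + quote(absolute, safe="")
        if url not in result:  # keep first occurrence only
            result.append(url)
    return result
```

Feeding the resulting list to a downloader, rather than letting the downloader spider the page, keeps it away from the live news.asp-style pages that now return corrupted content.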
 
LVL 10

Expert Comment

by:bluefezteam
ID: 21767803
Hmm, try the Wayback Machine on archive.org; you may have some luck there.
Is there a way of listing all the pages that you need to save?

For example, do you still have the site structure as a Google sitemap (XML doc)? If you do, then maybe it's possible to apply that structure to Google's cache and create a routine to strip all the content from the page links defined in the sitemap.

You should be able to strip out the text from the pages automatically using some form of PHP/.NET routine applied to a recursive loop controlled by the sitemap.
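
The sitemap-driven routine suggested above could be sketched roughly as follows, in Python rather than PHP/.NET. Treat it as a starting point under assumptions: the cache URL pattern is the one quoted earlier in this thread, the function names and output file naming are invented for illustration, and the download step is untested against Google, which may throttle or block automated requests.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import quote

# Assumption: the cache URL pattern quoted earlier in this thread.
CACHE_PREFIX = "http://64.233.183.104/search?q=cache:"


def page_urls_from_sitemap(sitemap_xml):
    """Return every <loc> value in a sitemap, whatever its XML namespace."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}loc"; drop the namespace part.
        if element.tag.split("}")[-1] == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


def cache_url(page_url, prefix=CACHE_PREFIX):
    """Build the cache lookup URL for one page of the site."""
    return prefix + quote(page_url, safe="")


def download_all(page_urls, out_dir, delay_seconds=5):
    """Save the cached copy of each page as page_0001.html, page_0002.html, ..."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for index, page_url in enumerate(page_urls, start=1):
        with urllib.request.urlopen(cache_url(page_url), timeout=30) as resp:
            (out / f"page_{index:04d}.html").write_bytes(resp.read())
        time.sleep(delay_seconds)  # pause between requests
```

Because the list of pages comes from the sitemap and not from links inside the cached pages, the routine never touches the live site.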

 
LVL 29

Expert Comment

by:fibo
ID: 21893296
fgict,
what is the situation now?