Matteo P asked:

WGET Command to Download Website

I am trying to restore a website from the Web Archive for a client (the WP site was hacked). I am using wget to try to download the files, which is a bit difficult because it's a WordPress site. It's only three pages, and I just want to download all the HTML, CSS, JS and images for those three pages, but it's proving to be a bit more difficult than that. I was using --accept to specify file types, but a lot of the CSS and JS has a cache-busting version string after the file extension, so wget is skipping them, e.g. style.css_ver=1.23.


wget --recursive --no-clobber --no-check-certificate --no-directories -P /var/www/site https://web.archive.org/web/20220327200154/http://mysite.com/


Is there a better way I can do this?
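The version-suffix problem described above can be checked offline before running wget. The sketch below is only an illustration: the pattern `ASSET_REGEX` and the helper `matches_asset` are hypothetical names, and the extension list is an assumption about which asset types the site uses. A pattern of this kind could then be tried with wget's `--accept-regex` option, which matches against the complete URL rather than a plain file suffix.

```shell
#!/bin/sh
# Hypothetical pattern (an assumption, not from the thread): a known asset
# extension, optionally followed by a version suffix such as "?ver=1.23"
# or "_ver=1.23".
ASSET_REGEX='\.(html?|css|js|png|jpe?g|gif|svg)([?_].*)?$'

# Return 0 if the given name or URL matches the asset pattern, 1 otherwise.
matches_asset() {
  printf '%s\n' "$1" | grep -Eq "$ASSET_REGEX"
}

# Try the pattern on a few sample names.
for name in 'style.css' 'style.css_ver=1.23' 'app.js?ver=5.9' 'archive.zip'; do
  if matches_asset "$name"; then
    echo "accept: $name"
  else
    echo "skip:   $name"
  fi
done
```

Testing the pattern with grep first is cheaper than repeating a full download each time the pattern changes.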





ASKER CERTIFIED SOLUTION
ste5an
Matteo P (ASKER)

I can save the pages, but there are a lot of CSS and JS files I need to download recursively, because the saved HTML will still link to the Web Archive for the images, CSS and JS.

Will try with your regex. Thanks.

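For only three pages, one alternative to recursion plus file-type filters is to fetch each page together with the assets it references and let wget rewrite the links to the local copies. The following is a minimal sketch, not a tested recipe for this site: `snapshot_url` and `fetch_page` are hypothetical helper names, and it assumes the archived pages load their CSS, JS and images from web.archive.org itself. It uses wget's `--page-requisites`, `--convert-links` and `--adjust-extension` options and leaves out `--no-check-certificate`, which should not be needed for web.archive.org.

```shell
#!/bin/sh
# Build a Wayback Machine snapshot URL from a timestamp and the original URL.
snapshot_url() {
  printf 'https://web.archive.org/web/%s/%s\n' "$1" "$2"
}

# Fetch one archived page plus the CSS, JS and images it references.
# $1 = snapshot timestamp, $2 = original page URL, $3 = output directory.
# No --accept filter is used, so versioned asset names are not filtered out.
fetch_page() {
  wget --page-requisites --convert-links --adjust-extension \
       --no-directories --directory-prefix="$3" \
       "$(snapshot_url "$1" "$2")"
}
```

A call for the home page from the question would look like `fetch_page 20220327200154 http://mysite.com/ /var/www/site`, repeated once for each of the other two page URLs. Because `--no-directories` flattens everything into one folder, files with the same name from different pages can collide, so a separate output directory per page may be simpler.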
Hi,

I would check whether you have a backup. Even if it is not the most recent one, it could help you get the design back. Ask your web hosting provider, as they may have a backup.

I would not recommend using WP or any other CMS for a three-page website.

I would start from scratch, as just recovering the HTML pages will not help you get the WP site back.

Next time, make sure to back up the site, or ask your web hosting provider to set that up for you.

I agree with the part about just starting from scratch. It will be easier to grab the photos by right-clicking and downloading them if you don't already have them. Then recreate the three pages, even if it is with a different theme. Make it easy on yourself. You could spend 10 times as long trying to save the thing as you would just recreating it.