Solved

Convert dynamic URL's with wget to .html files

Posted on 2011-03-22
4
652 Views
Last Modified: 2012-05-11
Hello,

I'm have a problem when I use wget to download/archive a webpage.

If I download "http://example.com" it works fine also if I download "http://example.com/page.html" that's OK too.

My problem is when I have a URL something like this:

"http://example.com/page.php?id=99"
OR
"http://example.com/index.html?hpt=T1"

These download fine but when I browse to them the page that shows is the HTML code not the browser rendered version.

So the question is how can I force all pages to become .htm or .html files

Here is my code:

<?php

$site = 'http://example.com/index.php?id=680';

$rnd1 = rand(100, 9999);
$rnd2 = rand(100, 9999);

mkdir("/home/USER/public_html/results/". $rnd1 . "/", 0777);
mkdir("/home/USER/public_html/results/". $rnd1 . "/". $rnd2 ."/", 0777);

exec("wget -e robots=off --limit-rate=250k -F -P /home/USESR/public_html/results/". $rnd1 ."/". $rnd2 ."/"." -p -k ". $site ."");

?> 

Open in new window



Thanks for the help!
0
Comment
Question by:jambla
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
4 Comments
 
LVL 5

Expert Comment

by:tsmgeek
ID: 35194479
im guessing the problem you are having is the files do not actualy have .html on the end but instead its got the query params concatinated on the end, you need to change this or append .html to the end of every file

personaly i would use curl to get the page then save it into a file that i name myself
0
 

Author Comment

by:jambla
ID: 35196152
Hello tsmgeek,

Thanks for your response.

im guessing the problem you are having is the files do not actualy have .html on the end but instead its got the query params concatinated on the end, you need to change this or append .html to the end of every file

Yeah, I'm pretty sure that's the problem.  Which is the main point of my questions; how do I do this?

personaly i would use curl to get the page then save it into a file that i name myself

Yeah, I prefer cURL also, my big problem with curl is I was only able to save the html but I was not able to save the css, images, js etc...  I am not partial to using wget so if you know how to do what I need using curl or any other web language (except .asp/.net) than I'm ok with that.

0
 

Accepted Solution

by:
jambla earned 0 total points
ID: 35197493
I managed to find the answer.  Using a -E in my wget statement will force a non-html extension to be one.
0
 

Author Closing Comment

by:jambla
ID: 35230010
I found my own solution.
0

Featured Post

Creating Instructional Tutorials  

For Any Use & On Any Platform

Contextual Guidance at the moment of need helps your employees/users adopt software o& achieve even the most complex tasks instantly. Boost knowledge retention, software adoption & employee engagement with easy solution.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Generate Unique ID in VB.NET 21 105
Developing a front end to SPLUNK 1 65
Getting Variable not defined error in Python 1 48
Do not understand error message 3 29
Nothing in an HTTP request can be trusted, including HTTP headers and form data.  A form token is a tool that can be used to guard against request forgeries (CSRF).  This article shows an improved approach to form tokens, making it more difficult to…
Since pre-biblical times, humans have sought ways to keep secrets, and share the secrets selectively.  This article explores the ways PHP can be used to hide and encrypt information.
An introduction to basic programming syntax in Java by creating a simple program. Viewers can follow the tutorial as they create their first class in Java. Definitions and explanations about each element are given to help prepare viewers for future …
The viewer will learn how to dynamically set the form action using jQuery.

730 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question