Solved

Convert dynamic URLs to .html files with wget

Posted on 2011-03-22
Last Modified: 2012-05-11
Hello,

I have a problem when I use wget to download/archive a webpage.

If I download "http://example.com" it works fine, and if I download "http://example.com/page.html" that's OK too.

My problem is when I have a URL like one of these:

"http://example.com/page.php?id=99"
OR
"http://example.com/index.html?hpt=T1"

These download fine, but when I browse to the saved files, the browser shows the raw HTML source instead of the rendered page.

So the question is: how can I force all downloaded pages to be saved as .htm or .html files?

Here is my code:

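<?php

// Page to archive (note the query string in the URL).
$site = 'http://example.com/index.php?id=680';

// Random two-level directory so each run gets its own output folder.
$rnd1 = rand(100, 9999);
$rnd2 = rand(100, 9999);
$dir  = "/home/USER/public_html/results/" . $rnd1 . "/" . $rnd2 . "/";

// The third argument (recursive) creates the parent directory as well.
mkdir($dir, 0777, true);

// -p fetches page requisites (CSS, images, JS); -k converts links for local viewing.
// escapeshellarg() keeps the "?" and "&" in the URL from being interpreted by the shell.
exec("wget -e robots=off --limit-rate=250k -F -P " . escapeshellarg($dir) . " -p -k " . escapeshellarg($site));

?>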


Thanks for the help!
Question by:jambla
4 Comments
 
LVL 5

Expert Comment

by:tsmgeek
ID: 35194479
I'm guessing the problem is that the saved files do not actually end in .html; instead, the query parameters are concatenated onto the end of the file name. You need to change this, or append .html to the end of every file.

Personally, I would use curl to get the page and then save it into a file that I name myself.
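
A rough sketch of that curl approach in PHP, in case it helps. The function names urlToHtmlFilename() and savePage() are made up for illustration, and the naming scheme (replace unsafe characters with underscores, then add .html) is just one assumption about how you might name the files. It saves only the HTML document itself, not the CSS, images, or scripts it references.

```php
<?php

// Hypothetical helper: turns a URL into a safe local file name ending in .html.
function urlToHtmlFilename(string $url): string
{
    // Drop the scheme, then replace every run of characters that is not
    // a letter, digit, dot or dash with a single underscore.
    $name = preg_replace('#^[a-z]+://#i', '', $url);
    $name = preg_replace('/[^A-Za-z0-9.\-]+/', '_', $name);
    $name = trim($name, '_');

    // Append .html unless the name already ends in .htm or .html.
    if (!preg_match('/\.html?$/i', $name)) {
        $name .= '.html';
    }
    return $name;
}

// Hypothetical helper: fetches $url with cURL and writes the body into $dir
// under a name we choose. Returns the saved path, or null on failure.
function savePage(string $url, string $dir): ?string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects
    $html = curl_exec($ch);

    if ($html === false) {
        return null;
    }

    $path = rtrim($dir, '/') . '/' . urlToHtmlFilename($url);
    return file_put_contents($path, $html) !== false ? $path : null;
}
```

Calling savePage('http://example.com/page.php?id=99', '/some/existing/dir') would then write the page to a file whose name ends in .html, so the browser treats it as HTML.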
 

Author Comment

by:jambla
ID: 35196152
Hello tsmgeek,

Thanks for your response.

"I'm guessing the problem is that the saved files do not actually end in .html; instead, the query parameters are concatenated onto the end of the file name. You need to change this, or append .html to the end of every file."

Yeah, I'm pretty sure that's the problem. That is the main point of my question: how do I do this?

"Personally, I would use curl to get the page and then save it into a file that I name myself."

Yeah, I prefer cURL too. My big problem with curl is that I was only able to save the HTML; I was not able to save the CSS, images, JS, etc. I am not tied to wget, so if you know how to do what I need using curl or any other web language (except .asp/.net), then I'm OK with that.

 

Accepted Solution

by:
jambla earned 0 total points
ID: 35197493
I managed to find the answer. Adding -E to my wget command forces downloaded HTML pages whose names do not already end in .html to be saved with a .html extension.
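
For anyone landing here later, a minimal sketch of what the adjusted command might look like when built from PHP. The helper name buildWgetCommand() and its parameters are hypothetical, and the target directory is only a placeholder; the -F option from the original command is left out here because it only affects how an input file given with -i is read. According to the wget manual, with -E a URL such as article.cgi?25 is saved as article.cgi?25.html.

```php
<?php

// Hypothetical helper: builds the wget command line for archiving one page.
// $url and $targetDir are illustrative parameters, not names from the thread.
function buildWgetCommand(string $url, string $targetDir): string
{
    $options = [
        '-e robots=off',      // ignore robots.txt, as in the original command
        '--limit-rate=250k',  // throttle the download
        '-E',                 // append .html to HTML pages whose names lack it
        '-p',                 // also fetch the CSS, images and scripts the page needs
        '-k',                 // rewrite links so the local copy works when browsed
        '-P ' . escapeshellarg($targetDir), // directory to save into
    ];

    // escapeshellarg() stops the shell from interpreting "?" and "&" in the URL.
    return 'wget ' . implode(' ', $options) . ' ' . escapeshellarg($url);
}

$cmd = buildWgetCommand(
    'http://example.com/page.php?id=99',
    '/home/USER/public_html/results/1234/5678/' // placeholder directory
);
echo $cmd, PHP_EOL;

// To actually run it: exec($cmd, $output, $exitCode);
```

Building the command in one place keeps the quoting consistent and makes it easy to print the command for debugging before handing it to exec().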
 

Author Closing Comment

by:jambla
ID: 35230010
I found my own solution.

Question has a verified solution.
