hadrons


Feeding URLs from a file to a UserAgent script/Checking if a URL is live

I need a script that will take URLs from a file and check whether each URL is live or not. I came up with this:


#!/usr/local/bin/perl
use LWP::UserAgent;
my $ua = LWP::UserAgent->new;
$ua->agent("MyApp/0.1 ");


open (INPUT, '<url.txt');
open (OUTPUT, '>url.html');

 while (<INPUT>) {
       chomp;

my $req = HTTP::Request->new(POST => '$_');
my $res = $ua->request($req);

  # Check the outcome of the response
  if ($res->is_success) {
      print $res->content;
  }
  else {
      print $res->status_line, "\n";
  }

 }
 
 print OUTPUT;

 close (INPUT);
 
However, when I run it I get a "400 URL must be absolute". When I plug the URL directly into my $req = HTTP::Request->new(POST => 'www....'); it works fine, so it's not my connection.
ASKER CERTIFIED SOLUTION
ozo
This solution is only available to members.
hadrons

ASKER

Excellent ... I just have two additional questions:

1) Is there a function that delays the next URL request so I don't hammer their servers too hard? (Not that they don't deserve it with all the dead URLs they sent, but I still want to be a good citizen.)

2) The output isn't being written to the filehandle. (I can capture it by redirecting on the command line, but I'd prefer to use the filehandles.)
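
1) sleep (for example, sleep 1; inside the loop pauses one second between requests)
2) print OUTPUT $res->status_line, "\n"; (print writes to STDOUT unless you name the filehandle)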
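
Putting the thread together, here is a minimal sketch of the whole checker. It is not the hidden accepted solution. It assumes the 400 error came from the single quotes around '$_' (Perl does not interpolate variables inside single quotes, so the literal text $_ was used as the URL), that every line of the input file is an absolute URL including its scheme, and that a HEAD request is enough to tell whether a URL is live. The check_url helper, the 10-second timeout, the 1-second pause, and taking the two file names from the command line are illustrative choices, not anything from the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Return the status line (e.g. "200 OK") for a single URL.
# Assumption: HEAD is sufficient for a liveness check; the original used POST.
sub check_url {
    my ($ua, $url) = @_;
    my $res = $ua->head($url);    # a real variable, not the literal string '$_'
    return $res->status_line;
}

# Read URLs from $in_file, one per line, and write "URL<TAB>status" to $out_file.
sub main {
    my ($in_file, $out_file) = @_;

    my $ua = LWP::UserAgent->new(timeout => 10);    # timeout value is an assumption
    $ua->agent("MyApp/0.1 ");

    open(my $in,  '<', $in_file)  or die "Cannot read $in_file: $!";
    open(my $out, '>', $out_file) or die "Cannot write $out_file: $!";

    while (my $url = <$in>) {
        chomp $url;
        next unless length $url;    # skip blank lines

        # Name the filehandle explicitly so the result goes to the file, not STDOUT.
        print {$out} "$url\t", check_url($ua, $url), "\n";

        sleep 1;                    # pause between requests to go easy on the server
    }

    close $in;
    close $out or die "Cannot close $out_file: $!";
}

# Run only when both file names are given on the command line.
main(@ARGV) if @ARGV == 2;
```

Saved under a hypothetical name such as checkurls.pl, it would be run as: perl checkurls.pl url.txt url.html. Some servers reject HEAD requests, so if live URLs come back with errors, swapping $ua->head for $ua->get is worth trying.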