schworak
asked on
Testing a URL
I have a script in mind that will help me keep track of some of my links on my web page. I know how to do just about everything I am after. Only one problem. I need a little part of the script to test the URL I give it and tell me "GOOD" or "BAD".
For example, I could call a subroutine $good=&Test_URL("http://www.test.com") and $good would contain 0 if the site responds with a 404 error, and 1 if anything else is returned.
Can you help with this subroutine please?
You could search for the string 404 in the response to a GET on port 80 of the target server. However, this page contains a 404; in fact it contains two. How do you propose to distinguish these from real 404s (that's three now)?
If you were to write a routine, I would use a telnet module as a starting point. Take the URL, for example www.company.com/test.html, and split it into host and directory/file.
Now telnet to the host on port 80 and issue a GET for directory/file.
Now you will have to parse the response and decide what counts as an error. However, some 404 pages don't say much about being a 404, even though to be compliant they must.
Should take ~3/4 hours of playing. Look for telnet.pm to help with the socket stuff.
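The steps above (split the URL, connect on port 80, issue a GET, parse the response) could be sketched roughly as follows. This is only an illustration, not the poster's code: it uses the core IO::Socket::INET module instead of a telnet module, and the names split_url, parse_status_line and Test_URL_raw are made up for the example. It reads the numeric code from the first line of the response rather than searching the page text, so a page that merely mentions 404 is not misjudged. Treating a failed connection as "bad" is an assumption.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Split "http://host[:port]/path" (scheme optional) into host, port and path.
sub split_url {
    my ($url) = @_;
    $url =~ s{^http://}{}i;                        # drop the scheme if present
    my ($hostport, $path) = split m{/}, $url, 2;   # host[:port], then the rest
    my ($host, $port) = split /:/, $hostport, 2;
    return ($host, $port || 80, '/' . (defined $path ? $path : ''));
}

# Pull the numeric code out of a status line such as "HTTP/1.0 404 Not Found".
# Returns undef if the line does not look like an HTTP status line.
sub parse_status_line {
    my ($line) = @_;
    return undef unless defined $line;
    return $line =~ m{^HTTP/\d+\.\d+\s+(\d{3})\b} ? $1 : undef;
}

# Hypothetical helper: 0 for a 404 (or no usable answer), 1 otherwise.
sub Test_URL_raw {
    my ($url) = @_;
    my ($host, $port, $path) = split_url($url);
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or return 0;    # assumption: an unreachable host counts as bad

    # HTTP/1.0 request; the Host header lets name-based virtual hosts answer.
    print $sock "GET $path HTTP/1.0\r\nHost: $host\r\n\r\n";
    my $status_line = <$sock>;    # only the first line is needed
    close $sock;

    my $code = parse_status_line($status_line);
    return (defined $code && $code != 404) ? 1 : 0;
}
```

A call would look like $good = Test_URL_raw("www.company.com/test.html"). Redirects, HTTPS and servers that answer slowly are not handled here.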
telnet??? Try perldoc lwpcook
use LWP::Simple;
$doc = get 'http://www.test.com';
will get you the source of what you're trying to find.
use LWP::Simple;
($content_type, $document_length, $modified_time, $expires, $server) = head("http://www.test.com");
will help you get around the 404 problem, too: head returns an empty list if the request fails.
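For the exact rule asked for in the question (0 on a 404, 1 on anything else), one possible sketch uses LWP::UserAgent rather than LWP::Simple, because its response object exposes the numeric status code. The helper name status_is_good and the 10 second timeout are assumptions made for this example; Test_URL is the name from the question.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# The asker's rule: 0 for a 404, 1 for any other status code.
sub status_is_good {
    my ($code) = @_;
    return $code == 404 ? 0 : 1;
}

# Send a HEAD request and judge the URL by the status code alone.
sub Test_URL {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);   # timeout value is an assumption
    my $response = $ua->head($url);
    return status_is_good($response->code);
}
```

It would be called as $good = Test_URL("http://www.test.com"). Note that when LWP cannot reach the host at all it hands back an internally generated 500 response, so under the "anything but 404 is good" rule this sketch reports 1 for an unreachable site; checking $response->is_success instead would be stricter.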
ASKER
The telnet idea would work but is not reliable or predictable.
The LWP idea was the one I was looking for but couldn't find the docs on how to do it. Thanks.
The points will be awarded to b2pi if you just submit a message as an answer. Your answer is 100% what I am after. Thanks!
ASKER
Thanks! The code works just great!