Solved

Perl IO::Socket to read web pages....Do I need a new brain?

Posted on 2003-03-01
Last Modified: 2009-12-16
This gives me absolutely nothing:(

use IO::Socket;
use strict;

# Define the variables;

        my $webport="80";
        my $webpage="http://www.samspade.org/";
        my $webserver="www.samspade.org";
        my $mx="";

my $socket = IO::Socket::INET->new
(
        Proto   => "tcp",
        PeerAddr=> $webserver,
        PeerPort=> $webport,
) or die "Error creating socket";

$socket->autoflush(1);

print $socket "GET $webpage HTTP/1.0\015\012\015\012";

while (<$socket>){
        print;
}

Whereas this gives me the full yahoo page:

use IO::Socket;
use strict;

# Define the variables;

        my $webport="80";
        my $webpage="http://www.yahoo.com/";
        my $webserver="www.yahoo.com";
        my $mx="";

my $socket = IO::Socket::INET->new
(
        Proto   => "tcp",
        PeerAddr=> $webserver,
        PeerPort=> $webport,
) or die "Error creating socket";

$socket->autoflush(1);

print $socket "GET $webpage HTTP/1.0\015\012\015\012";

while (<$socket>){
        print;
}

Note that the only things changed are $webpage and $webserver. If I set $webserver in the samspade example to a valid proxy, it works fine. Replacing it with the IP address of the samspade webserver also fails.

My question is:

Why can't I connect to the samspade webserver directly using this method? I can directly telnet to the port 80 of the samspade.org webserver from exactly the same machine.

Can someone please confirm whether this also occurs on their machine?

Points are for confirming that the same problem exists from other machines, identifying the cause of the problem, and providing me a solution/alteration to the above script.

Partial points may be awarded separately for progress on this problem.


Question by:pjedmond
8 Comments
 
LVL 10

Expert Comment

by:rj2
ID: 8048061
Easier like this?

#!/usr/bin/perl
use LWP::Simple;
print get('http://www.yahoo.com');
 
LVL 22

Author Comment

by:pjedmond
ID: 8048109
Would be much easier:)....but unfortunately it needs to be done with IO::Socket
 
LVL 10

Expert Comment

by:rj2
ID: 8048208
Why is that?
There might still be an easier way to do this.
 
LVL 22

Author Comment

by:pjedmond
ID: 8048494
2 reasons - The first being that the software is running on an Atmel embedded processor, so LWP::Simple doesn't seem to want to play...plus it uses up valuable memory. After getting the Operating system up and running with Perl, I've got about 4 MB of RAM left to play with. The second reason is that with LWP::Simple, I have to read in the whole page before I can process it, using up valuable processing resources.

I'm therefore exploring the capabilities of IO::Socket in order to see just how much I can do from this board. For example, we have got it to run a perl webserver, and simple local POP/SMTP.

I may get another memory chip for this board, but at just over $100 for the 32MB module, it's a little steep compared with the average PC. And as I can't get the IO::Socket version working on my PC either, I'm open to other suggestions. Yes, I could do it in another language such as C, but I'd like to get it working in Perl....just for the hell of it I suppose;)
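For the second reason above (processing the page as it arrives instead of reading it all into memory first), here is a minimal sketch of how that might look. It reads a response one line at a time from any filehandle, skips the status line and headers, and hands each body line to a callback. The name stream_body is made up for illustration, and the usage part assumes a Perl with in-memory filehandles (5.8 or later); on the board you would pass the connected socket instead.

```perl
use strict;
use warnings;

# stream_body is a hypothetical helper: it reads an HTTP response from any
# filehandle line by line, skips the status line and headers, and passes
# each body line to the callback. Only one line is held in memory at a time.
sub stream_body {
    my ($fh, $on_line) = @_;
    my $in_body = 0;
    my $count   = 0;
    while (my $line = <$fh>) {
        if (!$in_body) {
            # A bare CRLF (or LF) line marks the end of the headers.
            $in_body = 1 if $line =~ /^\015?\012$/;
            next;
        }
        $on_line->($line);
        $count++;
    }
    return $count;    # number of body lines seen
}

# Usage with an in-memory response; a connected socket works the same way.
my $response = "HTTP/1.0 200 OK\015\012"
             . "Content-Type: text/html\015\012"
             . "\015\012"
             . "<html>\012<title>Hi</title>\012</html>\012";
open my $fh, '<', \$response or die "Cannot open in-memory handle: $!";

my @titles;
stream_body($fh, sub {
    push @titles, $1 if $_[0] =~ m{<title>(.*?)</title>}i;
});
print "Title: $_\n" for @titles;
```

Because the callback sees one line at a time, memory use stays roughly constant regardless of page size, at the cost of not being able to match anything that spans a line break.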
 
LVL 22

Author Comment

by:pjedmond
ID: 8048503
If you could try out the code I've posted and confirm whether or not there is a problem, that would be helpful. I'd really love to know why the IO::Socket connection is rejected by samspade...or perhaps it comes down to the correct characters to send after the GET request differing between servers - I don't know...and I'm curious to find out:)
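On the question of what to send after GET: the scripts above put the full URL in the request line, which is the form a proxy expects. When talking to a web server directly, the more usual form is the path alone in the request line plus a Host header naming the server. A rough sketch of that variant follows, in case it is worth trying; nothing in this thread establishes that the request format was the cause of the failure. The names build_request and fetch_page are made up for illustration, and the Timeout value is an arbitrary choice.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# build_request is a hypothetical helper: it builds an HTTP/1.0 request with
# the path in the request line and the server name in a Host header.
sub build_request {
    my ($host, $path) = @_;
    $path = '/' unless defined $path && length $path;
    return join "\015\012",
        "GET $path HTTP/1.0",
        "Host: $host",
        "Connection: close",
        "", "";    # the two empty strings yield the blank line ending the headers
}

# fetch_page is a hypothetical helper: connect, send the request, and print
# the response line by line as it arrives.
sub fetch_page {
    my ($host, $path, $port) = @_;
    my $socket = IO::Socket::INET->new(
        Proto    => 'tcp',
        PeerAddr => $host,
        PeerPort => $port || 80,
        Timeout  => 10,    # assumed value, adjust as needed
    ) or die "Error creating socket to $host: $@";
    $socket->autoflush(1);
    print {$socket} build_request($host, $path);
    print while <$socket>;
    close $socket;
}

# Only touch the network when a host is given on the command line,
# e.g.  perl fetch.pl www.samspade.org /
fetch_page(@ARGV) if @ARGV;
```

Including $@ in the die message also shows why a connection failed, which the original "Error creating socket" message hides.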
 
LVL 48

Accepted Solution

by:
Tintin earned 400 total points
ID: 8050923
I tried your code for samspade and it works fine.  Are you sure there weren't any network problems when you tried?
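One way to tell a network problem from a script problem is to have the script report why a connection attempt failed rather than only that it failed. A minimal sketch, assuming a made-up helper named try_connect and an arbitrary default timeout; IO::Socket::INET leaves the failure reason in $@ when new() returns nothing.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# try_connect is a hypothetical helper: it returns (socket, undef) on
# success or (undef, reason) on failure, so the caller can log the reason.
sub try_connect {
    my ($host, $port, $timeout) = @_;
    my $socket = IO::Socket::INET->new(
        Proto    => 'tcp',
        PeerAddr => $host,
        PeerPort => $port,
        Timeout  => $timeout || 10,    # assumed default, in seconds
    );
    return ($socket, undef) if $socket;
    return (undef, "connect to $host:$port failed: $@");
}

# Usage: only runs when a host and port are given,
# e.g.  perl check.pl www.samspade.org 80
if (@ARGV) {
    my ($socket, $error) = try_connect(@ARGV);
    my $status = $socket ? "connected\n" : "$error\n";
    print $status;
}
```

This only covers the connect step. A connection that succeeds but returns no data would need a check on the first read from the socket as well.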
 
LVL 22

Author Comment

by:pjedmond
ID: 8051152
Could have been - It now works - Totally bizarre, so I'm going for the network problem. Thanks for testing:) Points are yours:)
 