Solved

grab a portion of remote webpage

Posted on 2003-11-13
Last Modified: 2012-08-14
I know this question has been asked a billion times by now, but I can't find an answer.
I've already done what I need in PHP, but I need it in Perl so I can run it from cron.
I need to create a text file that contains the values used in a dropdown menu on a remote
website. The values change sporadically, so this would be run once a week to ensure we
have a correct list.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

my $source = get('http://somedomain.com/file.php?var=var');

# get() returns undef on failure and does not set $!, so report the failure directly
die "Could not fetch the page\n" unless defined $source;

# I only need a portion of the page, everything between the
# 1st set of SELECT tags
##  <SELECT name="dropdown"> <OPTION value="All" selected>All
##  <OPTION value="1">Option 1
##   this is a really long list
##  <OPTION value="255">Option 255
##  </SELECT>
print $source;

How can I get $source to contain only the portion I need?
Question by:dewed
4 Comments
 
LVL 18

Expert Comment

by:kandura
ID: 9744165
I suggest you use HTML::TreeBuilder or one of the many other HTML parsing modules available for Perl (HTML::TreeBuilder is on CPAN rather than bundled with Perl itself).

Do something like:

use HTML::TreeBuilder;

my $tree   = HTML::TreeBuilder->new_from_content($source);
my $select = $tree->look_down('_tag', 'select');   # first <select> in the page
print $select->as_HTML;
$tree->delete;                                     # free the parse tree

This will print the first SELECT on the page, including the <select> tag itself; loop over $select->content_list to print only the options.

See the documentation for HTML::TreeBuilder and HTML::Element for details.
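
To make the "loop over the options" step concrete, here is a minimal sketch, not a definitive implementation. It assumes HTML::TreeBuilder is installed from CPAN; the helper name select_options and the inline sample page are made up for illustration, with the sample standing in for whatever get() returns.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;   # from CPAN, not bundled with Perl

# Hypothetical helper: returns [value, label] pairs for the first <select>.
sub select_options {
    my ($html) = @_;
    my $tree   = HTML::TreeBuilder->new_from_content($html);
    my $select = $tree->look_down(_tag => 'select');
    my @options;
    if ($select) {
        for my $opt ($select->look_down(_tag => 'option')) {
            my $label = $opt->as_text;
            $label =~ s/^\s+|\s+$//g;           # trim surrounding whitespace
            push @options, [ $opt->attr('value'), $label ];
        }
    }
    $tree->delete;                               # release the parse tree
    return @options;
}

# Stand-in for the page fetched with LWP::Simple's get().
my $source = <<'HTML';
<html><body><form>
<SELECT name="dropdown"> <OPTION value="All" selected>All
<OPTION value="1">Option 1
<OPTION value="255">Option 255
</SELECT>
</form></body></html>
HTML

# One "value<TAB>label" line per option.
print join("\t", @$_), "\n" for select_options($source);
```

Run from cron, the output can simply be redirected to the text file, which keeps the script itself free of file handling.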
 
LVL 3

Assisted Solution

by:prady_21
prady_21 earned 100 total points
ID: 9745388
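#!/usr/bin/perl
### A program using sockets
use strict;
use warnings;
use IO::Socket;

my $HOST    = "www.yoursite.com";
my $URL_VAL = "/path/to/the/page";   # the request path must start with "/"

my $sock = IO::Socket::INET->new( PeerAddr  => $HOST,
                                  PeerPort  => 80,
                                  Proto     => 'tcp',
                                  Timeout   => 10,
                                );
die "Socket could not be created $!\n" unless $sock;

print $sock "GET $URL_VAL HTTP/1.0\r\n";
print $sock "Host: $HOST\r\n";
print $sock "Accept: */*\r\n";
# Ask the server to close the connection so the read loop sees end of input;
# the blank line (\r\n\r\n) ends the request headers.
print $sock "Connection: close\r\n\r\n";

my $text = '';
while (my $line = <$sock>) {
    if ( $line =~ /<SELECT name="dropdown">(.*)/s ) {
        $text = $1;   # keep anything that follows the tag on the same line
        # collect lines until the closing tag, stopping at end of input too
        while (defined($line = <$sock>) && $line !~ m/<\/SELECT>/) {
            $text .= $line;
        }
        last;
    }
}
close $sock;
print "$text\n";
exit 0;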
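
The line-by-line loop above depends on the opening and closing tags sitting on separate lines. If the whole page is already in one string (as with $source from LWP::Simple in the question), a regex over the full string is one alternative. This is only a sketch: the helper names first_select_body and option_pairs and the sample page are invented for illustration, and a regex like this is more fragile than a real HTML parser if the markup changes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the text between the first <SELECT ...> and
# its closing </SELECT>, or undef when the page contains no such block.
sub first_select_body {
    my ($html) = @_;
    return $html =~ m{<select\b[^>]*>(.*?)</select>}is ? $1 : undef;
}

# Hypothetical helper: pulls [value, label] pairs out of that block.
# Assumes every OPTION carries a double-quoted value attribute.
sub option_pairs {
    my ($body) = @_;
    my @pairs;
    while ($body =~ m{<option\b[^>]*\bvalue="([^"]*)"[^>]*>\s*([^<]*)}ig) {
        my ($value, $label) = ($1, $2);
        $label =~ s/\s+$//;                  # drop trailing newline/spaces
        push @pairs, [ $value, $label ];
    }
    return @pairs;
}

# Stand-in for the fetched page.
my $page = <<'HTML';
<html><body>
<SELECT name="dropdown"> <OPTION value="All" selected>All
<OPTION value="1">Option 1
<OPTION value="255">Option 255
</SELECT>
</body></html>
HTML

my $body = first_select_body($page);
die "No SELECT block found\n" unless defined $body;
print "$_->[0]\t$_->[1]\n" for option_pairs($body);
```

The non-greedy (.*?) together with the /s flag is what limits the match to the first SELECT block even when it spans many lines.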
 
LVL 1

Accepted Solution

by:OKSD
OKSD earned 100 total points
ID: 9749535
Does the cron not work with PHP?

-OKSD
 

Author Comment

by:dewed
ID: 9751956
Does the cron not work with PHP?
.. ya know.. I don't know.. I haven't tried command line php since we were upgraded.. maybe they turned it on this time.

hehe.. cool!  couldn't run PHP command line before  
#!/usr/bin/php -q
<?php
print "hello world";
?>
shouldn't be a problem now  thanks!