Solved

grab a portion of remote webpage

Posted on 2003-11-13
Last Modified: 2012-08-14
I know this question has been asked a billion times by now, but I can't find an answer.
I've already done what I need in PHP, but I need it in Perl so I can run it from cron.
I need to create a text file that contains the values used in a dropdown menu on a remote
website. The values change sporadically, so this would be run once a week to ensure we
have a correct list.

#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;

my $source = get('http://somedomain.com/file.php?var=var');

if ($source) {
    # I only need a portion of the page, everything between the
    # 1st set of SELECT tags
    ##  <SELECT name="dropdown"> <OPTION value="All" selected>All
    ##  <OPTION value="1">Option 1
    ##   this is a really long list
    ##  <OPTION value="255">Option 255
    ##  </SELECT>
    print $source;
} else {
    # get() returns undef on failure and does not set $!
    die "Could not fetch the page\n";
}

How can I get $source to contain only the portion I need?
Question by:dewed
4 Comments
 
LVL 18

Expert Comment

by:kandura
ID: 9744165
I suggest you use HTML::TreeBuilder or one of the many other HTML parsing modules available for Perl on CPAN.

Do something like:

use HTML::TreeBuilder;

my $tree   = HTML::TreeBuilder->new_from_content($source);
my $select = $tree->look_down('_tag', 'select');
print $select->as_HTML;

This will print the first SELECT on the page, including the <select> tag itself; loop over $select->content_list to print only the options.

See the documentation for HTML::TreeBuilder and HTML::Element for details.
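
As a rough sketch of the "only the options" step, the snippet below collects the value and label of each OPTION in the first SELECT. It assumes HTML::TreeBuilder is installed from CPAN. The helper name dropdown_options and the sample markup are made up for illustration, and it uses look_down on the select element rather than content_list so that stray text nodes are skipped.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return [value, label] pairs for every OPTION
# in the first SELECT found in $html.
sub dropdown_options {
    my ($html) = @_;
    my $tree   = HTML::TreeBuilder->new_from_content($html);
    my $select = $tree->look_down('_tag', 'select');
    my @options;
    if ($select) {
        for my $option ($select->look_down('_tag', 'option')) {
            my $value = $option->attr('value');
            my $label = $option->as_text;
            $label =~ s/^\s+|\s+$//g;    # trim whitespace around the label
            push @options, [ defined $value ? $value : '', $label ];
        }
    }
    $tree->delete;    # free the parse tree
    return @options;
}

# Sample markup modelled on the question; the real page would come from get()
my $sample = <<'HTML';
<SELECT name="dropdown"> <OPTION value="All" selected>All
<OPTION value="1">Option 1
<OPTION value="255">Option 255
</SELECT>
HTML

# One tab-separated "value<TAB>label" line per option
print join("\t", @$_), "\n" for dropdown_options($sample);
```

Redirecting the output to a file from the cron entry would give the weekly text file the question asks for.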
 
LVL 3

Assisted Solution

by:prady_21
prady_21 earned 100 total points
ID: 9745388
#!/usr/bin/perl
### A program using sockets
use strict;
use warnings;
use IO::Socket;

my $HOST    = "www.yoursite.com";
my $URL_VAL = "/path/to/the/page";    # request path must start with a slash

my $sock = IO::Socket::INET->new( PeerAddr => $HOST,
                                  PeerPort => 80,
                                  Proto    => 'tcp',
                                  Timeout  => 10,
                                );
die "Socket could not be created $!\n" unless $sock;

# Header lines end in CRLF; an empty line ends the request
print $sock "GET $URL_VAL HTTP/1.0\r\n";
print $sock "Host: $HOST\r\n";
print $sock "Accept: */*\r\n";
print $sock "Connection: close\r\n\r\n";

my $text = '';
while ( my $line = <$sock> ) {
    if ( $line =~ /<SELECT name="dropdown">/i ) {
        # Collect lines up to the closing tag; also stop at end of input
        while ( defined( $line = <$sock> ) ) {
            last if $line =~ m/<\/SELECT>/i;
            $text .= $line;
        }
        last;
    }
}
close $sock;
print "$text\n";
exit 0;
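
The same pattern-matching idea can also be applied to the $source string already fetched with LWP::Simple, without opening a socket by hand. The following is only a sketch using core Perl. The helper names first_select_body and option_values are hypothetical, and it assumes every OPTION carries a double-quoted value attribute, as in the question's sample.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text between the first <SELECT ...> and
# its </SELECT>, or undef when no complete SELECT block is present.
sub first_select_body {
    my ($html) = @_;
    return $html =~ m{<select\b[^>]*>(.*?)</select>}is ? $1 : undef;
}

# Hypothetical helper: return the value="..." attribute of each OPTION tag.
sub option_values {
    my ($body) = @_;
    return $body =~ m{<option\b[^>]*\bvalue="([^"]*)"}ig;
}

# Sample markup modelled on the question; in the real script this would be
# the $source returned by get()
my $sample = <<'HTML';
<p>Pick one:</p>
<SELECT name="dropdown"> <OPTION value="All" selected>All
<OPTION value="1">Option 1
<OPTION value="255">Option 255
</SELECT>
<SELECT name="other"><OPTION value="x">X</SELECT>
HTML

my $body = first_select_body($sample);
die "No SELECT block found\n" unless defined $body;
print "$_\n" for option_values($body);
```

Regular expressions are brittle against arbitrary HTML, so this is best suited to a page whose markup is known and stable; a parser module is the safer choice otherwise.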
 
LVL 1

Accepted Solution

by:
OKSD earned 100 total points
ID: 9749535
Does the cron not work with PHP?

-OKSD
 

Author Comment

by:dewed
ID: 9751956
"Does the cron not work with PHP?"
.. ya know.. I don't know.. I haven't tried command-line PHP since we were upgraded.. maybe they turned it on this time.

hehe.. cool! I couldn't run PHP from the command line before.
#!/usr/bin/php -q
<?php
print "hello world";
?>
Shouldn't be a problem now. Thanks!