mistadontplay

Trying to parse an HTML file

I'm trying to run an example from a book, and I get the following error:

E:\>perl parser.pl
Can't locate HTML/Tagset.pm in @INC (@INC contains: E:/ind/perl/lib E:/ind/perl/site/lib .) at E:/ind/perl/site/lib/HTML/LinkExtor.pm line 31.
BEGIN failed--compilation aborted at E:/ind/perl/site/lib/HTML/LinkExtor.pm line 31.
Compilation failed in require at parser.pl line 5.
BEGIN failed--compilation aborted at parser.pl line 5.

Source code:

#!e:/ind/perl/bin/perl -w

use strict;
use LWP::UserAgent;
use HTML::LinkExtor;
use URI::URL;

my $url = URI::URL->new('http://www.perl.com/');
my $base_url;

# Create new UserAgent object (browser)
my $ua = LWP::UserAgent->new();

# Give our agent a name
$ua->agent("Mozilla/4.7");

# Create HTTP GET request
my $request = HTTP::Request->new(GET => $url);

# Execute HTTP request
my $response = $ua->request($request);

# Check success
if ($response->is_success && $response->content_type eq 'text/html') {
    # Request was successful and is html
    $base_url = $response->base();
    print "Base URL: $base_url\n";
    my $link_extor = HTML::LinkExtor->new(\&extract_links);
    $link_extor->parse($response->content);
} else {
    # Request failed - print response code and message
    print "Error getting document: ", $response->status_line, "\n";
}

sub extract_links {
    my ($tag, %attr) = @_;

    if ($tag eq 'a' or $tag eq 'img') {
        foreach my $key (keys %attr) {
            if ($key eq 'href' or $key eq 'src') {
                my $link_url = URI->new($attr{$key});
                my $full_url = $link_url->abs($base_url);
                print "LINK: $full_url\n";
            }
        }
    }
}
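
The error in the question says that perl searched every directory listed in @INC and did not find HTML/Tagset.pm, which HTML::LinkExtor tries to load at its line 31. A minimal sketch for checking which of the script's modules are visible to this perl follows. The helper names module_to_path and find_in_inc are made up for illustration, and the module list is simply taken from the script and the error message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Modules the script loads, plus HTML::Tagset, the one named in the error.
my @modules = qw(LWP::UserAgent HTML::LinkExtor HTML::Tagset URI::URL);

# Hypothetical helper: turn a module name into the relative path perl
# looks for under each @INC directory, e.g. HTML::Tagset -> HTML/Tagset.pm
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

# Hypothetical helper: return the first @INC directory that contains the
# module file, or nothing if no directory has it.
sub find_in_inc {
    my ($module) = @_;
    my $path = module_to_path($module);
    for my $dir (@INC) {
        next if ref $dir;                 # skip code-ref hooks in @INC
        return $dir if -e "$dir/$path";
    }
    return;
}

# Report where each module was found, if anywhere.
for my $module (@modules) {
    my $dir = find_in_inc($module);
    printf "%-16s %s\n", $module,
        defined $dir ? "found in $dir" : "NOT FOUND";
}
```

If HTML::Tagset is reported as NOT FOUND, the module is probably just not installed. Installing it from CPAN with whatever installer this Perl distribution provides (for example `cpan HTML::Tagset`, assuming the cpan client is available) would be the usual next step.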
ASKER CERTIFIED SOLUTION
fantasy1001
SOLUTION
Dave Cross
Nothing has happened on this question in over 2 months. It's time for cleanup!

My recommendation, which I will post in the Cleanup topic area, is to
split points between fantasy1001 and davorg.

PLEASE DO NOT ACCEPT THIS COMMENT AS AN ANSWER!

jmcg
EE Cleanup Volunteer