Solved

Extracting links from a string

Posted on 2002-06-30
Last Modified: 2010-03-05
OK, there's a regular expression that will extract links from a URL or file, such as:
#!/usr/bin/perl -n -00
while ( /<\s*A\s+HREF\s*=\s*(["'])(.*?)\1.*?>/gi ) {
         print "$2\n";
}

I need a subroutine that I can pass a string to and that extracts the links from it. Is there any way I can adapt Tom Christiansen's work, or does anyone know how I can do this?
Cheers
Ryan
Question by:NoFrills
2 Comments
 
Accepted Solution

by: ozo (earned 50 total points)
sub extractlinks {
    local $_ = shift;    # the string to scan
    my @links = ();
    # $1 is the opening quote, $2 is the URL between the matching quotes
    while ( /<\s*A\s+HREF\s*=\s*(["'])(.*?)\1.*?>/gi ) {
        push @links, $2;
    }
    return @links;
}
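
For anyone reusing this: the pattern only matches when HREF is the first attribute after the tag name, so a tag like <A class="ext" HREF='...'> is skipped. Below is a rough sketch of a looser variant, not a definitive fix. The name extract_links, the [^>]*? part of the pattern, and the example.com URLs are my own assumptions, not part of the accepted answer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant of extractlinks; the name and the looser pattern
# are assumptions for illustration, not the accepted answer's code.
sub extract_links {
    my ($html) = @_;
    my @links;
    # [^>]*? allows other attributes between "<a" and "href";
    # \1 requires the closing quote to match the opening one.
    while ( $html =~ /<\s*a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gis ) {
        push @links, $2;
    }
    return @links;
}

# Sample input with placeholder URLs.
my $page = <<'HTML';
<p>See <a href="http://example.com/one">one</a> and
<A class="ext" HREF='http://example.com/two'>two</A>.</p>
HTML

print "$_\n" for extract_links($page);
```

This still requires quoted attribute values and can be fooled by markup inside comments or scripts, so treat it as a quick hack rather than a real HTML parser.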
Expert Comment

by:DVB
use HTML::Parse;
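
To expand on the parser suggestion: one possible way to do this with a real parser is HTML::LinkExtor, which ships in the HTML-Parser distribution on CPAN (it is not core Perl, so it has to be installed). This is only a minimal sketch under that assumption. The helper name extract_links_parsed and the sample string are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;    # assumes the HTML-Parser distribution is installed

# Hypothetical helper: returns the href of every <a> tag in the string.
sub extract_links_parsed {
    my ($html) = @_;
    my $parser = HTML::LinkExtor->new;    # no callback: links are collected
    $parser->parse($html);
    $parser->eof;                         # signal end of document
    my @links;
    for my $link ( $parser->links ) {
        # Each entry is [ tag, attribute => url, ... ] with lowercase names.
        my ( $tag, %attr ) = @$link;
        push @links, $attr{href} if $tag eq 'a' && defined $attr{href};
    }
    return @links;
}

# Sample input with a placeholder URL and an unquoted attribute value.
my $page = '<a href=http://example.com/a>unquoted</a> <img src="x.png">';
print "$_\n" for extract_links_parsed($page);
```

The parser copes with unquoted attribute values and attribute order, which the regular expression approach does not. The <img> entry is also reported by links, which is why the loop filters on the tag name.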