popy
asked on
Remove content from a string
Hi all,
I would like to remove some content from a string that I read from a file.
In general, a line looks like one of these:
text here <a href="http://somelink.com/somedir/">some text </a> text here
or
text here <a href="http://somelink.com/">some text </a> text here
or
text here <a href="http://somelink.com/somedir/page.html">some text </a> text here
and I would like to keep only the URL, like
http://somelink.com/
or
http://somelink.com/somedir/
or
http://somelink.com/somedir/page.html
Can you help me?
Not to beat a dead horse here, but if there are multiple links on a line, as in the first example below:
text here <a href="http://somelink.com/somedir/">some test </a> text here <a href="http://www.boston.com">asdf</a>
or
text here <a href="http://somelink.com/">some text </a> text here
or
text here <a href="http://somelink.com/somedir/page.html">some text </a> text here
you should try these modifications to clockwatcher's solution:
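#!/usr/local/bin/perl
use strict;
use warnings;

# Read each input line and print every double-quoted http:// URL on it.
while (my $line = <>) {
    # In list context, a /g match returns all captured groups.
    my @matches = $line =~ m#"(http://[^"]*)"#gi;
    print "$_\n" for @matches;
}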
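To see what the list-context match does with several links on one line, here is a minimal sketch run against the first example line above. The sub name extract_urls and the $sample variable are made up for illustration and are not part of clockwatcher's solution; the pattern itself is the one from this post, so it still only catches double-quoted http:// URLs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): return every
# double-quoted http:// URL found in one line of text.
sub extract_urls {
    my ($line) = @_;
    # Called in list context, the /g match returns all captured groups.
    return $line =~ m#"(http://[^"]*)"#gi;
}

my $sample = 'text here <a href="http://somelink.com/somedir/">some test </a> '
           . 'text here <a href="http://www.boston.com">asdf</a>';

# Prints one URL per line:
#   http://somelink.com/somedir/
#   http://www.boston.com
print "$_\n" for extract_urls($sample);
```

A line without any quoted http:// URL simply yields an empty list, so nothing is printed for it.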
$line =~ s/<a href="([^"]+)">[^<]+<\/a>/$1/g;
may be closer.
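For comparison, here is a rough sketch of what a substitution along these lines would do, assuming the intent is to replace each whole anchor with its URL. The sub name links_to_urls is hypothetical, and the closing quote after the href value and the $1 replacement are my assumptions about what was meant.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): replace each
# <a href="...">label</a> in the line with just the URL from its href.
sub links_to_urls {
    my ($line) = @_;
    # $1 is the href value; /g handles several links on one line.
    $line =~ s/<a href="([^"]+)">[^<]+<\/a>/$1/gi;
    return $line;
}

# Prints: text here http://somelink.com/ text here
print links_to_urls('text here <a href="http://somelink.com/">some text </a> text here'), "\n";
```

Note that the surrounding "text here" stays in the line, so if only the URLs are wanted, collecting the matches is the more direct route.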