
Remove content from a string

Hi all,

I would like to remove some content from a string that I read from a file.

In general, the line looks like one of these:

text here <a href="http://somelink.com/somedir/">some text </a> text here
or
text here <a href="http://somelink.com/">some text </a> text here
or
text here <a href="http://somelink.com/somedir/page.html">some text </a> text here

and I would like to keep only the URL, like:
http://somelink.com/
or
http://somelink.com/somedir/
or
http://somelink.com/somedir/page.html

Can you help me?
Asked by: popy
 
clockwatcher commented:
($line) = $line =~ m#"(http://.*?)"#;
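
For anyone trying this out, here is a minimal sketch of the same list-context match, assuming sample lines like the ones in the question. One thing to watch: when a line has no quoted http:// URL, the match returns an empty list and the variable on the left ends up undefined, so the sketch checks for that. The names @lines, @urls and $url are only for illustration.

```perl
use strict;
use warnings;

my @lines = (
    'text here <a href="http://somelink.com/somedir/">some text </a> text here',
    'text here <a href="http://somelink.com/">some text </a> text here',
    'no link on this line',
);

my @urls;
for my $line (@lines) {
    # In list context the match returns the captured group(s).
    my ($url) = $line =~ m#"(http://.*?)"#;
    # $url is undef when the line has no quoted http:// URL, so skip it.
    push @urls, $url if defined $url;
}

print "$_\n" for @urls;
```

Copying into a separate variable instead of assigning back to $line keeps the original line around in case the match fails.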

jhurst commented:
I suspect that:

$line =~ s/^.*?<a href=\"([^\"]+)\">[^<]+<\/a>.*$/$1/mg;

may be closer.
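
A small sketch of the substitution idea, assuming the href value is double-quoted and is the only attribute on the tag. For the pattern to match, it needs the closing quote before the `>`, and a leading `^.*?` so that the text before the link is dropped too. The sample line is taken from the question; the check on the return value of s/// is my own addition.

```perl
use strict;
use warnings;

my $line = qq{text here <a href="http://somelink.com/somedir/page.html">some text </a> text here\n};

# s/// returns true only when a replacement was actually made,
# so lines without a link are left untouched and not printed.
if ($line =~ s/^.*?<a href="([^"]+)">[^<]+<\/a>.*$/$1/m) {
    print $line;    # the trailing newline is kept by the /m anchors
}
```

This keeps only the first link on each line, since the trailing `.*$` swallows everything after the first `</a>`.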

smisk commented:
Not to beat a dead horse here, but if there are multiple links on a line, like the following:

text here <a href="http://somelink.com/somedir/">some test </a> text here <a href="http://www.boston.com">asdf</a>
or
text here <a href="http://somelink.com/">some text </a> text here
or
text here <a href="http://somelink.com/somedir/page.html">some text </a> text here

you should try these modifications to clockwatcher's solution:

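#!/usr/local/bin/perl
use strict;
use warnings;

# Print every quoted http:// URL found on each input line, one per line.
while (my $line = <>) {
    # With /g in list context, the match returns all captured URLs on the line.
    my @matches = $line =~ m#"(http://[^"]*)"#gi;
    print "$_\n" for @matches;
}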
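
If the file might also contain https links, or other quoted strings that are not href values, the same global match can be wrapped in a small helper. This is only a sketch: extract_urls is a made-up name, and the https link in the sample line is an assumption of mine, not something from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every double-quoted href value that
# starts with http:// or https://. Call it in list context.
sub extract_urls {
    my ($text) = @_;
    return $text =~ m#href\s*=\s*"(https?://[^"]*)"#gi;
}

my $line = 'text here <a href="http://somelink.com/somedir/">some test </a> '
         . 'text here <a href="https://www.boston.com">asdf</a>';

my @urls = extract_urls($line);
print "$_\n" for @urls;
```

A regex like this is fine for simple, regular input. For arbitrary HTML, a real parser such as HTML::LinkExtor from CPAN is likely to be more robust.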

Question has a verified solution.