Strip out URL in detail view


I need to remove any URL that begins with http or www from the field $fields[12].

Below is an example that removes email addresses.
I need the same for URLs, on the same principle.
$fields[12] =~ s/[a-zA-Z0-9\-]+@[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}/ /;
tilmes Asked:

BioI Commented:
The next command removes www. (the dot after www included), http:// (the : and // included), or http://www.

$fields[12]=~ s/^[www\.|http:\/\/|http:\/\/www\.]+//;
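One thing worth checking in the pattern above: square brackets create a character class, so the | inside them is a literal character, not alternation. A minimal sketch of the difference, using a made-up sample string (the real content of $fields[12] is not shown in this thread) and a parenthesised group as the alternative:

```perl
use strict;
use warnings;

# Hypothetical sample value, assumed for illustration only.
my $sample = 'http://webspace.abc.at some text';

# Character class: strips any leading run of the characters w . | h t p : /
(my $class = $sample) =~ s/^[www\.|http:\/\/|http:\/\/www\.]+//;

# Grouped alternation: strips only a leading "http://", "http://www." or "www."
(my $group = $sample) =~ s{^(?:http://(?:www\.)?|www\.)}{};

print "class: $class\n";   # class: ebspace.abc.at some text
print "group: $group\n";   # group: webspace.abc.at some text
```

The character class also eats the first "w" of "webspace", because "w" is one of the listed characters.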

gourav_jain Commented:
Hi Tilmes
I hope this will work for you....

$fields[12] =~ s/^http|www.*//g;
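A note on precedence, as a sketch rather than a definitive fix: alternation binds more loosely than the ^ anchor, so ^http|www.* reads as "^http" OR "www.*". The sample URL and the variable names below are assumptions for illustration:

```perl
use strict;
use warnings;

my $url = 'http://webspace.abc.at';   # sample input, assumed for illustration

# Parsed as (^http) | (www.*): only the leading "http" is removed here.
(my $loose = $url) =~ s/^http|www.*//g;

# Grouping applies the anchor to both alternatives.
(my $grouped = $url) =~ s/^(?:http|www).*//;

print "loose:   '$loose'\n";     # loose:   '://webspace.abc.at'
print "grouped: '$grouped'\n";   # grouped: ''
```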

Gourav Jain
tilmes Author Commented:
Hello Bio,

It does not work, and furthermore
I need to remove the whole URL, including the www or http.
tilmes Author Commented:
Hello gourav_jain,
it does not work for a URL such as
http://webspace.abc.at
tilmes Author Commented:
I tried
$fields[12] =~ s/^|http:\/\/.*|www.*//g;
and it removes all the text after http or www.
Can it be limited to only the URL?
BioI Commented:
Strange that my solution didn't work, because I tested it on a few examples.
But when you want to delete the complete URL, this should work:

$fields[12]=~ s/^[www\.|http:\/\/|http:\/\/www\.]+\S+//;

Now you delete everything that comes after www. or http:// until Perl finds whitespace [\S+ means: one or more non-whitespace characters].  When I test this, e.g. on www.experts-exchange.com/Programming/Programming_Languages/Perl/Q_20792860.html test1 test2
the output is
test1 test2

Hope this works...
FishMonger Commented:
BioI's first method should have worked, but here's a condensed version that also works.

$fields[12] =~ s/^(?=www|http))\S+//i;
FishMonger Commented:
If you want to be a little more precise:

$fields[12] =~ s!^(?=www\.|http://))\S+!!i;
tilmes Author Commented:
I don't know why, but those are not working.
BioI Commented:
Does FishMonger's solution not work either?
What sort of output do you get when you use these substitutions?
Can you give some example URLs and the output you get from the above "solutions"?
tilmes Author Commented:
I got some kind of error with $fields[12] =~ s/^(?=www|http))\S+//i;
What really works is still
$fields[12] =~ s/^|http:\/\/.*|www.*//g;
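The error is most likely caused by the unbalanced closing parenthesis in (?=www|http)), which Perl rejects when it compiles the pattern. A sketch with the parentheses balanced, run on an assumed sample string:

```perl
use strict;
use warnings;

my $field = 'www.experts-exchange.com/ test1 test2';   # assumed sample

# (?=www|http) is a lookahead: it checks the prefix without consuming it,
# then \S+ removes everything up to the first whitespace.
$field =~ s/^(?=www|http)\S+//i;

print "'$field'\n";   # ' test1 test2'
```

Because of the ^ anchor, this only fires when the URL is the very first thing in the field.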
FishMonger Commented:
Your regex says to match NOTHING at the beginning of the string OR http:// OR www. anywhere in the string.  So, if yours works and ours doesn't, it's because you have something other than www or http at the beginning of the string.  You could remove the beginning-of-string anchor.
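To make the anchor point concrete, here is a small sketch. The field value is invented, and the second pattern is only one possible way to drop the anchor while keeping the text after the URL:

```perl
use strict;
use warnings;

# Hypothetical field value with text before and after the URL.
my $field = 'Contact via http://webspace.abc.at today';

# Pattern from the question: .* runs to the end of the line, so trailing text is lost.
(my $greedy = $field) =~ s/^|http:\/\/.*|www.*//g;

# No ^ anchor, and \S+ stops at the first whitespace, so only the URL goes.
(my $url_only = $field) =~ s{(?:http://|www\.)\S+}{}gi;

print "greedy:   '$greedy'\n";     # greedy:   'Contact via '
print "url_only: '$url_only'\n";   # url_only: 'Contact via  today'
```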

Here's a test script that demonstrates this; just uncomment the regex that you want to test.

@fields = (
           'www.experts-exchange.com/ test1 test2',
           'oldlook.experts-exchange.com/ test3 test4',
           'http://www.experts-exchange.com/ test5 test6',
           "\t\thttp://www.experts-exchange.com/ test7 test8",
          );

for $i (0..$#fields) {
#   $fields[$i] =~ s/^[www\.|http:\/\/|http:\/\/www\.]+//; # BioI's first regex
#   $fields[$i] =~ s/^[www\.|http:\/\/|http:\/\/www\.]+\S+//; # BioI's second regex
#   $fields[$i] =~ s/^(http|www).*//g;  # gourav_jain's corrected regex
#   $fields[$i] =~ s/^(?=www|http)\S+//i; # FishMonger's regex
#   $fields[$i] =~ s/^|http:\/\/.*|www.*//g; # your regex
   print "$fields[$i]\n";
}

BioI and I both thought you wanted to remove just the URL, but if you want the entire string removed, we need to modify the end of the regex or use splice to delete the array element.
FishMonger Commented:
Take note of the second array element.  That's a valid URL (not counting the trailing "test3 test4").  Do you want to be able to match and delete those types of URLs?
tilmes Author Commented:
Hello FishMonger,

I appreciate that you wrote in such detail. I cannot answer your question,
so I have copied the script I have below.

for ($i = 0;$i <= 21;$i++) {
      $fields[$i] =~ s/~p~/\|/g;
      $fields[$i] =~ s/~nl~/<br>/g;
        if ($fields[$i] eq "") { $fields[$i] = "&#151;"; }
}

$fields[18] =~ s/&#151;//g;
# strip most email addresses.
$fields[12] =~ s/[a-zA-Z0-9\-]+@[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}/ /;

# strip the number beginning w/ a 0.
$fields[12] =~ s/(^|\D)0+\d*\D{0,1}\d*/$1/g;
$fields[12] =~ s/^|http:\/\/.*|www.*//g;
FishMonger Commented:
I'm not sure what you're asking me, but will this help?

These three regexes do the exact same thing.

$fields[12] =~ s/^|http:\/\/.*|www.*//g;
$fields[12] =~ s/http:\/\/.*|www.*//g;
$fields[12] =~ s!http://.*|www.*!!g;

Notice that on the 2nd & 3rd regex I've removed ^| and on the 3rd regex I used a different delimiter so that the forward slashes don't need to be escaped.

The initialization of the for loop can be made more readable if you change it to:

for $i (0..21)

I'd need to look at more of your script to be sure, but I think it might be better to use a hash instead of the array.
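As a rough illustration of the hash idea, assuming made-up field names and values (the thread never says what each index holds):

```perl
use strict;
use warnings;

# Hypothetical column names and data; the real layout is not shown in the thread.
my @names  = qw(title description phone);
my @values = ('Flat for rent', 'See www.example.com for photos', '');

# Hash slice: pair each name with its value.
my %record;
@record{@names} = @values;

# Named access documents intent better than $fields[12].
$record{description} =~ s{(?:http://|www\.)\S+}{}gi;

print "$record{description}\n";   # See  for photos
```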
BioI Commented:
And maybe explain what you want to do with the URL:
- Do you want to delete the complete line when there is a URL inside, or do you only want to delete the URL?
- Does the www or http have to appear at the beginning of your line, or anywhere in the line?
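The two options in the first question lead to different code. A small sketch of each, with invented data and a deliberately loose URL pattern that is only an assumption:

```perl
use strict;
use warnings;

my $url = qr{(?:http://|www\.)\S+}i;   # assumed, deliberately loose URL pattern

my @lines = ('see www.example.com now', 'no link here');   # made-up data

# Option 1: delete only the URL, keep the rest of the line.
my @stripped = map { (my $copy = $_) =~ s/$url//g; $copy } @lines;

# Option 2: drop every line that contains a URL.
my @kept = grep { !/$url/ } @lines;

print join('|', @stripped), "\n";   # see  now|no link here
print join('|', @kept), "\n";       # no link here
```

Since the pattern has no ^ anchor, it matches a URL anywhere in the line; adding ^ in front would restrict it to the beginning.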

p.s. indeed, using a hash sounds good...
FishMonger Commented:
tilmes,

There seems to be a fair amount of obfuscation going on in the snippets of your scripts (in each of your questions) that you've posted.  It might be helpful (for you and us) if you give us a more complete explanation of your project.  We try to do the best we can to answer your questions, but it seems that once you put these fragments together, your script gets more obfuscated.
Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.