Solved

Strip out URL in detail view

Posted on 2003-11-10
Last Modified: 2010-03-04

I need to remove any URL that begins with http or www from the field $fields[12].

Below is an example that removes an email address.
I need the same thing for URLs, on the same principle.
$fields[12] =~ s/[a-zA-Z0-9\-]+@[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}/ /;
Question by:tilmes
17 Comments
 
LVL 3

Expert Comment

by:BioI
ID: 9714119
The next command removes www. (dot after www included), http:// (: and // included), or http://www. from the start of the string.

$fields[12]=~ s/^[www\.|http:\/\/|http:\/\/www\.]+//;

0
 
LVL 1

Expert Comment

by:gourav_jain
ID: 9714150
Hi Tilmes
I hope this will work for you....

$fields[12] =~ s/^http|www.*//g;

Gourav Jain
0
 

Author Comment

by:tilmes
ID: 9714186
Hello Bio,

It does not work, and furthermore
I need to remove the whole URL, including the www or http.
0
 

Author Comment

by:tilmes
ID: 9714227
Hello gourav_jain,
it does not work for a URL such as
http://webspace.abc.at
0
 

Author Comment

by:tilmes
ID: 9714338
I tried
$fields[12] =~ s/^|http:\/\/.*|www.*//g;
and it removes all the text after http or www.
Can it be limited to only the URL?
0
 
LVL 3

Accepted Solution

by:
BioI earned 50 total points
ID: 9715091
Strange that my solution didn't work, because I tested it on a few examples.
But when you want to delete the complete URL, this should work:

$fields[12]=~ s/^[www\.|http:\/\/|http:\/\/www\.]+\S+//;

This deletes the www. or http:// prefix and everything after it up to the next whitespace [\S+ means one or more non-whitespace characters]. When I test this on, e.g., www.experts-exchange.com/Programming/Programming_Languages/Perl/Q_20792860.html test1 test2
the output is
test1 test2

Hope this works...
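
One caveat about the pattern above: square brackets build a character class, so [www\.|http:\/\/|http:\/\/www\.] means "any one of the characters w . | h t p : /" rather than "www. or http://". Here is a minimal sketch of the difference, assuming alternation was the intent. The sub names strip_with_class and strip_with_group are made up for this illustration.

```perl
use strict;
use warnings;

# Character-class version, copied from the substitution above.
sub strip_with_class {
    my ($text) = @_;
    $text =~ s/^[www\.|http:\/\/|http:\/\/www\.]+\S+//;
    return $text;
}

# Alternation version: (?:...) groups whole prefixes instead of single characters.
sub strip_with_group {
    my ($text) = @_;
    $text =~ s{^(?:http://|www\.)\S+}{}i;
    return $text;
}

for my $sample ('www.example.com/page test1 test2', 'test1 test2') {
    printf "class: '%s'   group: '%s'\n",
        strip_with_class($sample), strip_with_group($sample);
}
```

Both forms give the same result on the URL sample. On the plain "test1 test2" sample the class form also eats the leading word, because t is one of the characters in the class, while the group form leaves the string alone.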
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9715982
BioI's first method should have worked, but here's a condensed version that also works.

$fields[12] =~ s/^(?=www|http)\S+//i;
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9716030
If you want to be a little more precise:

$fields[12] =~ s!^(?=www\.|http://)\S+!!i;
0
 

Author Comment

by:tilmes
ID: 9716239
I don't know why, but those are not working.
0
 
LVL 3

Expert Comment

by:BioI
ID: 9716414
The solution from FishMonger doesn't work either?
What sort of output do you get when you use these substitutions?
Can you give some examples of URLs and the output from the "solutions" above?
0
 

Author Comment

by:tilmes
ID: 9716456
I got some kind of error with $fields[12] =~ s/^(?=www|http))\S+//i;
What really worked is still
$fields[12] =~ s/^|http:\/\/.*|www.*//g;
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9717133
Your regex says to match NOTHING at the beginning of the string OR http:// OR www anywhere in the string, each followed by the rest of the line.  So, if yours works and ours doesn't, it's because you have something other than www or http at the beginning of the string.  You could remove the beginning-of-string anchor (^) from our regexes.

Here's a test script that demonstrates this; just uncomment the regex that you want to test.

@fields = (
           'www.experts-exchange.com/ test1 test2',
           'oldlook.experts-exchange.com/ test3 test4',
           'http://www.experts-exchange.com/ test5 test6',
           "\t\thttp://www.experts-exchange.com/ test7 test8",
          );

for $i (0..$#fields) {
#   $fields[$i] =~ s/^[www\.|http:\/\/|http:\/\/www\.]+//; # BioI's first regex
#   $fields[$i] =~ s/^[www\.|http:\/\/|http:\/\/www\.]+\S+//; # BioI's second regex
#   $fields[$i] =~ s/^(http|www).*//g;  # gourav_jain's corrected regex
#   $fields[$i] =~ s/^(?=www|http)\S+//i; # FishMonger's regex
#   $fields[$i] =~ s/^|http:\/\/.*|www.*//g; # your regex
   print "$fields[$i]\n";
}

BioI and I both thought you wanted to remove just the url, but if you want the entire string removed, we need to modify the end of the regex or use splice to delete the array element.
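
If only the URL should go, wherever it sits in the string, one possible sketch drops the ^ anchor and stops at the first whitespace with \S+. This is an illustration under assumptions, not one of the regexes tested above: the sub name strip_urls is hypothetical, and the https? and \b parts are additions for the example.

```perl
use strict;
use warnings;

sub strip_urls {
    my ($text) = @_;
    # \b        : start at a word boundary, so "awww.x" is left alone
    # https?:// : http:// or https://
    # www\.     : or a bare www. prefix
    # \S+       : the rest of the URL, up to the next whitespace
    $text =~ s{\b(?:https?://|www\.)\S+}{}gi;
    return $text;
}

my @samples = (
    'www.experts-exchange.com/ test1 test2',
    'oldlook.experts-exchange.com/ test3 test4',
    "\t\thttp://www.experts-exchange.com/ test7 test8",
    'see http://webspace.abc.at for details',
);
print strip_urls($_), "\n" for @samples;
```

The second sample has no http:// or www. prefix, so it passes through unchanged. Catching bare host names like that would need a stricter URL pattern, or possibly a CPAN module such as URI::Find.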
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9717167
Take note of the second array element.  That's a valid URL (not counting the trailing test words).  Do you want to be able to match and delete those types of URLs?
0
 

Author Comment

by:tilmes
ID: 9717701
Hello FishMonger,

I appreciate the detailed reply. I can't answer your question,
so I have copied the script I have below.

for ($i = 0; $i <= 21; $i++) {
    $fields[$i] =~ s/~p~/\|/g;
    $fields[$i] =~ s/~nl~/<br>/g;
    if ($fields[$i] eq "") { $fields[$i] = "&#151;"; }
}

$fields[18] =~ s/&#151;//g;
# strip most email addresses.
$fields[12] =~ s/[a-zA-Z0-9\-]+@[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,4}/ /;

# strip the number beginning w/ a 0.
$fields[12] =~ s/(^|\D)0+\d*\D{0,1}\d*/$1/g;
$fields[12] =~ s/^|http:\/\/.*|www.*//g;
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9718063
I'm not sure what you're asking me, but will this help?

These three regexes do exactly the same thing.

$fields[12] =~ s/^|http:\/\/.*|www.*//g;
$fields[12] =~ s/http:\/\/.*|www.*//g;
$fields[12] =~ s!http://.*|www.*!!g;

Notice that on the 2nd & 3rd regex I've removed ^| and on the 3rd regex I used a different delimiter so that the forward slashes don't need to be escaped.
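
A quick way to check this kind of equivalence is to run the variants over the same input and compare the results. The sketch below is only an illustration: the sample strings and sub names are made up, and it compares the original form against a version with ^| removed and ! as the delimiter, both keeping the .* after http://.

```perl
use strict;
use warnings;

# Original form: the empty ^ alternative matches nothing and replaces it with nothing.
sub strip_original {
    my ($text) = @_;
    $text =~ s/^|http:\/\/.*|www.*//g;
    return $text;
}

# Same pattern without ^| and with ! as the delimiter, so / needs no escaping.
sub strip_simplified {
    my ($text) = @_;
    $text =~ s!http://.*|www.*!!g;
    return $text;
}

for my $sample ('see http://webspace.abc.at now',
                'visit www.example.com today',
                'no link here') {
    my ($old, $new) = (strip_original($sample), strip_simplified($sample));
    print "'$old' ", ($old eq $new ? 'eq' : 'ne'), " '$new'\n";
}
```

Note that both versions still delete everything from the URL to the end of the line, because of the trailing .* in each alternative.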

The initialization of the for loop can be made more readable if you change it to:

for $i (0..21)

I'd need to look at more of your script to be sure, but I think it might be better to use a hash instead of the array.
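
As a rough idea of what the hash version could look like: the field names below are pure guesses (nothing posted so far says what each index holds), and build_record is a hypothetical helper, so treat this as a sketch only.

```perl
use strict;
use warnings;

# Hypothetical names for three columns; the posted loop runs over indices 0..21.
my @names = qw(name phone description);

# Pair each name with the value at the same position, using a hash slice.
sub build_record {
    my @values = @_;
    my %record;
    @record{@names} = @values;
    return \%record;
}

my $record = build_record('Example Ltd', '', 'see www.example.com for details');

# Fields are now addressed by name instead of by a bare index such as 12.
$record->{description} =~ s{\b(?:https?://|www\.)\S+}{}gi;
print "$record->{description}\n";
```

The gain is mostly readability: $record->{description} says what is being cleaned, where $fields[12] does not.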
0
 
LVL 3

Expert Comment

by:BioI
ID: 9721583
And maybe explain what you want to do with the URL:
- Do you want to delete the complete line when there is a URL inside, or do you only want to delete the URL?
- Does the www or http have to appear at the beginning of your line, or can it be anywhere in the line?

p.s. indeed, using a hash sounds good...
0
 
LVL 28

Expert Comment

by:FishMonger
ID: 9723630
tilmes,

There seems to be a fair amount of obfuscation going on in the snippets of your scripts (in each of your questions) that you've posted.  It might be helpful (for you and us) if you gave us a more complete explanation of your project.  We try to do the best we can to answer your questions, but it seems that once you put these fragments together, your script gets more obfuscated.
0


Question has a verified solution.
