HTML::Parser CGI script (part 2)

Posted on 2003-03-15
Medium Priority
Last Modified: 2010-03-05
My previous question asked for:

1. The first 8 characters of the site name in the URL (e.g. 'awebsite'), or everything between "www." and the next dot (before .com, .net, .co.uk, .org, etc.) if the name has fewer than 8 characters. If the name is a repeat, append an incrementing number (e.g. 'awebsite_1').

Accepted answer (snippet):

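while( <L> ){
   # up to 8 chars after "//" and an optional "www.", stopping at the first dot
   ($u) = m'//(?:www\.)?([^.]{0,8})';
   # %n counts earlier sightings of $u; repeats get _1, _2, ...
   if( $n = $n{$u}++ ){ $u .= "_$n"; }
}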

I need the code to return only letters of the alphabet and digits (0-9), so any other characters in the web address, such as a forward slash ( / ), are ignored.

Question by:malkie
Accepted Solution

trevorw earned 500 total points
ID: 8143102

You can remove all non-alphanumeric characters from the URLs as follows:

while (<L>) {
  ($u) = m'//(?:www\.)?([^.]{0,8})';
  # strip everything except letters and digits (\W alone would keep "_")
  $u =~ s/[^A-Za-z0-9]//g;
  if( $n = $n{$u}++ ){ $u .= "_$n"; }
}

There is probably a way to incorporate this into the original regexp but I'm not too hot on them :)
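
For what it's worth, one possible way to fold the clean-up into the same step is to capture the whole name first, strip it, and only then cut it to 8 characters. This is just a sketch, not tested against your data: the sub name short_names and the sample URLs are made up for illustration, and it assumes you would rather keep up to 8 letters/digits after cleaning instead of cleaning an already truncated string.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one short name per URL, made of letters
# and digits only, with _1, _2, ... appended to repeated names.
sub short_names {
    my @urls = @_;
    my (%seen, @names);
    for my $url (@urls) {
        # Everything after "//" and an optional "www.", up to the first dot
        my ($name) = $url =~ m{//(?:www\.)?([^.]*)};
        next unless defined $name;       # skip input with no "//"
        $name =~ tr/A-Za-z0-9//cd;       # delete all but letters and digits
        $name = substr $name, 0, 8;      # truncate after cleaning
        my $n = $seen{$name}++;          # 0 on first sighting
        $name .= "_$n" if $n;
        push @names, $name;
    }
    return @names;
}

# Sample URLs are assumptions, not taken from the original question
print "$_\n" for short_names(
    'http://www.awebsite.com/',
    'http://www.awebsite.com/page.html',
    'http://my-site.co.uk/',
);
```

With the sample list this prints awebsite, awebsite_1 and mysite. Because the stripping happens before the cut, a name like a-very-long-name comes out as averylon rather than averyl.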

Hope this helps.

Best regards,

Author Comment

ID: 8146696
Thanks Trevor, works okay for me. God bless.
