• Status: Solved

regex help

Hello,
For a report, I need to count every referrer in an Apache log that begins with "http://".
All others I need to count as "N/A".
I am trying to use search/replace, with no luck.
Does anyone have any ideas?

Here is a snippet of the log:
---------------------------------------------------------------
-
http://blizzard.ist.una.edu/~ppickett/project/research.html
http://blizzard.ist.una.edu/~jhperez/a4/index.html
-
-
-
http://myuno.una.edu/webapps/blackboard/content/courseMenu.jsp?course_id=_147588_1&mini=Y
-
http://blizzard.ist.una.edu/~lschaller/a5/
-
http://blizzard.ist.una.edu/~jdfitzpatrick/a4/otherpage.html
http://blizzard.ist.una.edu/~handersen/a5/index.html
-
 -----------------------------------------------------------------------------

The report lists the resources with the most hits. I need the top entry to display "N/A" instead of "-":

 Hits  %-age Resource
 -----  -----   -----
    56  55.45       -
     1   0.99 http://blizzard.ist.una.edu/1300-1-xhtml/
     1   0.99 http://blizzard.ist.una.edu/1300-2-css/
     1   0.99 http://blizzard.ist.una.edu/images/style.css
     1   0.99 http://blizzard.ist.una.edu/~adubey/project/al

Here is my attempt:
my $string7 = join('',@ref);
foreach $string7 (@ref)
{
   #$string =~ s/^http:\/\/./N\/A/mg;
   $string =~ s/-/N\/A/mg;
   print "$string7\n";
   
   $refer{$string7}++;
   $ref_tot++;
}

Asked by: fac66
 
APNFSSC commented:
Line 5 of your code uses $string where it should use $string7.
 
sjklein42 commented:

s/^http\:.*$/N\/A/ig;
 
sjklein42 commented:
Sorry, this is what I was trying to post. The previous one got away from me.
This replaces your line 5. It replaces every line that does not begin with http:// with N/A.
Is this what you're trying to do?

if ( $string7 !~ /^http:\/\// ) { $string7 = 'N/A'; }
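
Putting the typo fix and the N/A test together, the loop from the question could look roughly like the sketch below. It assumes @ref already holds one referrer per element, as in the question. The count_referrers name and the short sample list are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: tally referrers, folding anything that does not
# start with "http://" into a single "N/A" bucket.
sub count_referrers {
    my @ref = @_;
    my %refer;
    my $ref_tot = 0;
    foreach my $string7 (@ref) {
        # Pick the hash key without modifying the original list element.
        my $key = $string7 =~ m{^http://} ? $string7 : 'N/A';
        $refer{$key}++;
        $ref_tot++;
    }
    return (\%refer, $ref_tot);
}

# Sample data taken from the log snippet in the question.
my @ref = (
    '-',
    'http://blizzard.ist.una.edu/~ppickett/project/research.html',
    'http://blizzard.ist.una.edu/~jhperez/a4/index.html',
    '-',
    '-',
);
my ($refer, $ref_tot) = count_referrers(@ref);
print "$refer->{'N/A'} of $ref_tot referrers are N/A\n";   # 3 of 5 referrers are N/A
```

Testing for the http:// prefix, rather than replacing every "-", also catches blank or otherwise malformed referrers.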

 
APNFSSC commented:
I think the original code is fine apart from the typo,
so long as the sample provided is a true representation of the actual data.
 
sjklein42 commented:
APNFSSC - I think you're right.
 
fac66 (author) commented:
Sweet, that worked!
Thank you.

 Hits  %-age Resource
 -----  -----   -----
    56  55.45     N/A
 
APNFSSC commented:
fac66: did you try my solution above for your first issue?
 
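sjklein42 commented:
Try this:

my %refCount;

# Count hits per referring host (the part between http:// and the next slash).
while ( <> )
{
    if ( /^http:\/\/([^\/\s]+)/ )
    {
        $refCount{$1}++;
    }
}

# Print the hosts, highest hit count first.
foreach my $domain (sort { $refCount{$b} <=> $refCount{$a} } keys(%refCount))
{
    printf("%6d\t%s\n", $refCount{$domain}, $domain);
}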
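
To get the Hits / %-age / Resource layout from the question rather than a per-host tally, a report could be built along these lines. This is only a sketch: the format_report name and the input hash are assumptions for illustration, and the column widths only approximate the original report.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a hash of resource => hits into report lines,
# highest hit count first (ties broken alphabetically).
sub format_report {
    my ($counts) = @_;
    my $total = 0;
    $total += $_ for values %$counts;
    return () unless $total;    # nothing to report, and avoids dividing by zero

    my @lines = (" Hits  %-age Resource\n", "-----  ----- --------\n");
    foreach my $res (sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b } keys %$counts) {
        push @lines, sprintf("%5d %6.2f %s\n",
            $counts->{$res}, 100 * $counts->{$res} / $total, $res);
    }
    return @lines;
}

# Illustrative counts, not taken from the real log.
print format_report({
    'N/A'                                        => 3,
    'http://blizzard.ist.una.edu/~lschaller/a5/' => 1,
});
```

Sorting the hash keys by count avoids sorting preformatted strings numerically, which triggers "isn't numeric" warnings under use warnings.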
