Still celebrating National IT Professionals Day with 3 months of free Premium Membership. Use Code ITDAY17

x
?
Solved

Perl hash help with parsing apache logs

Posted on 2011-03-05
5
Medium Priority
?
473 Views
Last Modified: 2012-05-11
Hello,
I am reading an apache file and parsing the data.
I need to display number of accesses per hostname, number of accesses, and
a percentage of the total accesses that each host accounted for as follows.

I am having a problem calculating the percentage ot total access.

Thanks in advance

   Hits   %-age    Resource
 -----      -----        -----
     7       1            h10.163.23.98.static.ip.windstream.net
     6       1            ip98-179-8-48.om.om.cox.net
     4       1            ip98-168-193-160.om.om.cox.net
     3       1            ip68-110-22-151.om.om.cox.net
 
#reading in file
my ($file) = @ARGV;
open (LOG, $file);
 
 my ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent);
 
#hash for hits
my %Hits;

while ( my $line=<LOG>) {
   ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent) = $line =~
          m/^(\S+) - - \[(\S+ [\-|\+]\d{4})\] "(\S+) (\S+) ([^"]+)" (\d{3}) (\d+|-) "(.*?)" "([^"]+)"$/;

 #Counting number of hits per host, &Hnames is a subroutine that calls $host and does a reverse dns lookup 
$Hits{&Hnames}++

}
 
 
#------------------------------
       print "=" x 78,"\n";
       print "HOSTNAMES\n";
       print "=" x 78,"\n";
       printf "%6s %4s %s\n", "Hits", "%-age", "Recourse";
       printf "%6s %4s %s\n", "-----", "-----","-----";

# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {
     
        my $num += $Hits{$key};
        my $perc = $Hits{$key}/$num;
      
        printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;
    
}

Open in new window

0
Comment
Question by:fac66
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 3
5 Comments
 
LVL 26

Accepted Solution

by:
wilcoxon earned 2000 total points
ID: 35044677
This should do what you want.  If not, let me know where you are seeing an issue...

The problem is that you need the total number of hits prior to looping through the keys to do the output.
#reading in file
my ($file) = @ARGV;
open (LOG, $file);
 
 my ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent);
 
#hash for hits
my (%Hits, $ttl);

while ( my $line=<LOG>) {
   ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent) = $line =~
          m/^(\S+) - - \[(\S+ [\-|\+]\d{4})\] "(\S+) (\S+) ([^"]+)" (\d{3}) (\d+|-) "(.*?)" "([^"]+)"$/;

 #Counting number of hits per host, &Hnames is a subroutine that calls $host and does a reverse dns lookup 
$Hits{&Hnames}++
$ttl++;

}
 
 
#------------------------------
       print "=" x 78,"\n";
       print "HOSTNAMES\n";
       print "=" x 78,"\n";
       printf "%6s %4s %s\n", "Hits", "%-age", "Recourse";
       printf "%6s %4s %s\n", "-----", "-----","-----";

# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {
     
        my $perc = $Hits{$key}/$ttl;
      
        printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;
    
}

Open in new window

0
 

Author Comment

by:fac66
ID: 35044817
Thanks for the response!

This is what i get..

Hits   %-age   Resource
 -----   -----     -----
     7     0      h10.163.23.98.static.ip.windstream.net
     6     0      ip98-179-8-48.om.om.cox.net
     4     0      ip98-168-193-160.om.om.cox.net
0
 

Author Comment

by:fac66
ID: 35044850
Actually it does work:

=============================================================================
  Hits   %-age    Resource
 -----   -----    -----
     7   0.69   h10.163.23.98.static.ip.windstream.net
     6   0.59   ip98-179-8-48.om.om.cox.net
     4   0.40   ip98-168-193-160.om.om.cox.net
     3   0.30   ip68-110-22-151.om.om.cox.net

I had to chage the printf to reflect floating point.

printf "%6d %4.2f %5s\n", $Hits{ $key }, $perc, $key;

Open in new window

0
 

Author Comment

by:fac66
ID: 35044868

One other question please..

Is there a way I can get the same results but sorting alphabetically?
I would have to sort by the value rather than key.
What I have tried results in losing the hits count.
# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {     
        my $perc = $Hits{$key}/$ttl;     
        printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;    
}

Open in new window

0
 
LVL 84

Expert Comment

by:ozo
ID: 35045231
foreach my $key ( sort keys %Hits ) {
0

Featured Post

Free Tool: ZipGrep

ZipGrep is a utility that can list and search zip (.war, .ear, .jar, etc) archives for text patterns, without the need to extract the archive's contents.

One of a set of tools we're offering as a way to say thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

I have been reconstructing a PHP-based application that has grown into a full blown interface system over the last ten years by a developer that has now gone into business for himself building websites. I am not incredibly fond of writing PHP code o…
There are many situations when we need to display the data in sorted order. For example: Student details by name or by rank or by total marks etc. If you are working on data driven based projects then you will use sorting techniques very frequently.…
Learn how to match and substitute tagged data using PHP regular expressions. Demonstrated on Windows 7, but also applies to other operating systems. Demonstrated technique applies to PHP (all versions) and Firefox, but very similar techniques will w…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

722 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question