Perl hash help with parsing apache logs

Posted on 2011-03-05
Last Modified: 2012-05-11
Hello,
I am reading an Apache log file and parsing the data.
For each hostname, I need to display the number of accesses and the
percentage of the total accesses that the host accounted for, as follows.

I am having a problem calculating the percentage of total accesses.

Thanks in advance

  Hits  %-age  Resource
 -----  -----  -----
     7      1  h10.163.23.98.static.ip.windstream.net
     6      1  ip98-179-8-48.om.om.cox.net
     4      1  ip98-168-193-160.om.om.cox.net
     3      1  ip68-110-22-151.om.om.cox.net
 
#reading in file
my ($file) = @ARGV;
open (LOG, $file);
 
 my ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent);
 
#hash for hits
my %Hits;

while ( my $line=<LOG>) {
   ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent) = $line =~
          m/^(\S+) - - \[(\S+ [\-|\+]\d{4})\] "(\S+) (\S+) ([^"]+)" (\d{3}) (\d+|-) "(.*?)" "([^"]+)"$/;

 #Counting number of hits per host, &Hnames is a subroutine that calls $host and does a reverse dns lookup 
$Hits{&Hnames}++

}
 
 
#------------------------------
       print "=" x 78,"\n";
       print "HOSTNAMES\n";
       print "=" x 78,"\n";
       printf "%6s %4s %s\n", "Hits", "%-age", "Resource";
       printf "%6s %4s %s\n", "-----", "-----","-----";

# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {
     
        my $num += $Hits{$key};
        my $perc = $Hits{$key}/$num;
      
        printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;
    
}

Question by:fac66
Accepted Solution

by: wilcoxon (earned 2000 total points)
ID: 35044677
This should do what you want.  If not, let me know where you are seeing an issue...

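The problem is that you need the total number of hits before looping through the keys to produce the output. In your loop, "my $num" declares a fresh variable on every iteration, so $num always equals the current host's count and the ratio is always 1.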
#reading in file
my ($file) = @ARGV;
open (LOG, $file);
 
 my ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent);
 
#hash for hits
my (%Hits, $ttl);

while ( my $line=<LOG>) {
   ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent) = $line =~
          m/^(\S+) - - \[(\S+ [\-|\+]\d{4})\] "(\S+) (\S+) ([^"]+)" (\d{3}) (\d+|-) "(.*?)" "([^"]+)"$/;

 # Count hits per host and keep a running total; &Hnames is a subroutine that reads $host and does a reverse DNS lookup
$Hits{&Hnames}++;
$ttl++;

}
 
 
#------------------------------
       print "=" x 78,"\n";
       print "HOSTNAMES\n";
       print "=" x 78,"\n";
       printf "%6s %4s %s\n", "Hits", "%-age", "Resource";
       printf "%6s %4s %s\n", "-----", "-----","-----";

# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {
     
        my $perc = $Hits{$key}/$ttl;
      
        printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;
    
}


Author Comment

by:fac66
ID: 35044817
Thanks for the response!

This is what I get:

  Hits  %-age  Resource
 -----  -----  -----
     7      0  h10.163.23.98.static.ip.windstream.net
     6      0  ip98-179-8-48.om.om.cox.net
     4      0  ip98-168-193-160.om.om.cox.net

Author Comment

by:fac66
ID: 35044850
Actually it does work:

=============================================================================
  Hits   %-age    Resource
 -----   -----    -----
     7   0.69   h10.163.23.98.static.ip.windstream.net
     6   0.59   ip98-179-8-48.om.om.cox.net
     4   0.40   ip98-168-193-160.om.om.cox.net
     3   0.30   ip68-110-22-151.om.om.cox.net

I had to change the printf format to floating point:

printf "%6d %4.2f %5s\n", $Hits{ $key }, $perc, $key;
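
A minimal sketch of why the %4d format printed 0: %d truncates the fraction, while a %f format keeps it. The sample counts and the helper name percent_of_total are hypothetical, and multiplying by 100 is an assumption about what the %-age column should show; the thread itself only changes the format.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): $count as a percentage of $total.
sub percent_of_total {
    my ($count, $total) = @_;
    return 0 unless $total;          # avoid division by zero on an empty log
    return 100 * $count / $total;
}

# Hypothetical sample counts, for illustration only.
my %Hits = ( 'a.example.net' => 7, 'b.example.net' => 3 );
my $ttl  = 0;
$ttl += $_ for values %Hits;

foreach my $key ( sort { $Hits{$b} <=> $Hits{$a} } keys %Hits ) {
    my $ratio = $Hits{$key} / $ttl;
    # The second column uses %4d and truncates the ratio to an integer;
    # the third column uses %6.2f and keeps two decimal places.
    printf "%6d %4d %6.2f %s\n",
        $Hits{$key}, $ratio, percent_of_total( $Hits{$key}, $ttl ), $key;
}
```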


Author Comment

by:fac66
ID: 35044868

One other question, please.

Is there a way I can get the same results but sorting alphabetically?
I would have to sort by the value rather than key.
What I have tried results in losing the hits count.
# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {     
        my $perc = $Hits{$key}/$ttl;     
        printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;    
}

Expert Comment

by:ozo
ID: 35045231
foreach my $key ( sort keys %Hits ) {
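
The hostnames are the hash keys, so sorting the keys alphabetically does not lose the counts: each count is still looked up through its key. A rough sketch of both orderings follows, assuming a hypothetical report_lines helper and made-up sample data; the 100x scaling is also an assumption, not something the thread shows.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build the report lines,
# ordered either by hostname ('name') or by hit count, high to low.
sub report_lines {
    my ($hits, $order) = @_;

    my $total = 0;
    $total += $_ for values %$hits;
    return () unless $total;         # nothing to report for an empty hash

    my @keys;
    if ( $order eq 'name' ) {
        @keys = sort { $a cmp $b } keys %$hits;               # alphabetical
    }
    else {
        @keys = sort { $hits->{$b} <=> $hits->{$a} } keys %$hits;
    }

    my @lines;
    for my $key (@keys) {
        # The count comes from the hash, whatever order the keys are in.
        push @lines, sprintf "%6d %6.2f %s",
            $hits->{$key}, 100 * $hits->{$key} / $total, $key;
    }
    return @lines;
}

# Hypothetical sample data, for illustration only.
my %Hits = ( 'b.example.net' => 1, 'a.example.net' => 3 );
print "$_\n" for report_lines( \%Hits, 'name' );
```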