fac66
Perl hash help with parsing apache logs
Hello,
I am reading an Apache log file and parsing the data.
For each hostname I need to display the number of accesses and the
percentage of the total accesses that the host accounted for, as follows.
I am having a problem calculating the percentage of total accesses.
Thanks in advance
Hits %-age Resource
----- ----- -----
7 1 h10.163.23.98.static.ip.windstream.net
6 1 ip98-179-8-48.om.om.cox.net
4 1 ip98-168-193-160.om.om.cox.net
3 1 ip68-110-22-151.om.om.cox.net
#reading in file
my ($file) = @ARGV;
open (LOG, '<', $file) or die "Cannot open $file: $!";
my ($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent);
#hash for hits
my %Hits;
while ( my $line=<LOG>) {
($host,$date,$method,$urls,$httpver,$status,$size,$referrer,$agent) = $line =~
m/^(\S+) - - \[(\S+ [\-|\+]\d{4})\] "(\S+) (\S+) ([^"]+)" (\d{3}) (\d+|-) "(.*?)" "([^"]+)"$/;
#Counting number of hits per host, &Hnames is a subroutine that calls $host and does a reverse dns lookup
$Hits{&Hnames}++
}
#------------------------------
print "=" x 78,"\n";
print "HOSTNAMES\n";
print "=" x 78,"\n";
printf "%6s %4s %s\n", "Hits", "%-age", "Resource";
printf "%6s %4s %s\n", "-----", "-----","-----";
# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {
my $num += $Hits{$key};
my $perc = $Hits{$key}/$num;
printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;
}
ASKER CERTIFIED SOLUTION
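The %-age column in the question always shows 1 because `my $num` is declared inside the loop, so `$num` starts out empty on every pass and `$perc` becomes `$Hits{$key}/$Hits{$key}`. A minimal sketch of one way around this, not necessarily the accepted answer's approach: total the hits before the loop, then divide by that total. The sample hosts and the `hit_percentages` helper are hypothetical and stand in for the `%Hits` hash built from the log.

```perl
use strict;
use warnings;

# Hypothetical sample counts standing in for the %Hits hash built from the log.
my %Hits = (
    'host-a.example.net' => 7,
    'host-b.example.net' => 6,
    'host-c.example.net' => 4,
    'host-d.example.net' => 3,
);

# Hypothetical helper: takes host => hits pairs and returns
# host => percentage of the total number of hits.
sub hit_percentages {
    my (%hits) = @_;

    # Sum all hits once, before any percentage is computed.
    my $total = 0;
    $total += $_ for values %hits;

    my %result;
    foreach my $host ( keys %hits ) {
        # Guard against an empty log (total of zero).
        $result{$host} = $total ? 100 * $hits{$host} / $total : 0;
    }
    return %result;
}

my %Perc = hit_percentages(%Hits);

# Sorting on hits high -> low, as in the question.
foreach my $key ( sort { $Hits{$b} <=> $Hits{$a} } keys %Hits ) {
    printf "%6d %6.2f %s\n", $Hits{$key}, $Perc{$key}, $key;
}
```

Multiplying by 100 gives a true percentage. A floating-point format such as `%6.2f` is used because `%d` would truncate the value to a whole number.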
ASKER
Actually it does work:
==============================================================================
Hits %-age Resource
----- ----- -----
7 0.69 h10.163.23.98.static.ip.windstream.net
6 0.59 ip98-179-8-48.om.om.cox.net
4 0.40 ip98-168-193-160.om.om.cox.net
3 0.30 ip68-110-22-151.om.om.cox.net
I had to change the printf to use a floating-point format.
printf "%6d %4.2f %5s\n", $Hits{ $key }, $perc, $key;
ASKER
One other question, please.
Is there a way I can get the same results but sorted alphabetically?
I would have to sort by the value rather than key.
What I have tried results in losing the hits count.
# Sorting on hits high -> low
foreach my $key ( sort { $Hits{ $b } <=> $Hits{ $a } } (keys %Hits) ) {
my $perc = $Hits{$key}/$ttl;
printf "%6d %4d %5s\n", $Hits{ $key }, $perc, $key;
}
foreach my $key ( sort keys %Hits ) {
ASKER
This is what I get:
Hits %-age Resource
----- ----- -----
7 0 h10.163.23.98.static.ip.wi
6 0 ip98-179-8-48.om.om.cox.ne
4 0 ip98-168-193-160.om.om.cox
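Two things seem to be going on in the output above. The hostnames are the keys of `%Hits` (they are what `$Hits{&Hnames}++` counts), so an alphabetical listing only needs `sort keys %Hits`, and the hit count is still available as `$Hits{$key}`. The zeros come from printing `$perc` with `%4d`, which truncates a fraction below 1 to 0. A minimal sketch under those assumptions, with hypothetical sample data and a hypothetical `report_lines` helper:

```perl
use strict;
use warnings;

# Hypothetical sample counts standing in for the %Hits hash built from the log.
my %Hits = (
    'host-c.example.net' => 4,
    'host-a.example.net' => 7,
    'host-b.example.net' => 6,
);

# Total of all hits, computed once before the report loop.
my $ttl = 0;
$ttl += $_ for values %Hits;

# Hypothetical helper: returns one formatted line per host,
# ordered alphabetically by hostname (case-insensitive).
sub report_lines {
    my ($hits, $total) = @_;
    my @lines;
    foreach my $host ( sort { lc($a) cmp lc($b) } keys %$hits ) {
        my $perc = $total ? 100 * $hits->{$host} / $total : 0;
        # %6.2f keeps the fractional part that %d would drop.
        push @lines, sprintf( "%6d %6.2f %s", $hits->{$host}, $perc, $host );
    }
    return @lines;
}

print "$_\n" for report_lines( \%Hits, $ttl );
```

Sorting the keys leaves the hash itself untouched, so the same `%Hits` can be listed by hits and then again by hostname.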