SheldonC
asked on
Reading Apache logs
I am working on my computer project, and I am not sure of the forum's position on aiding students with work. It's just that I have been stuck on this question for a long time now and need to move forward, since the deadline is approaching. I will upload the script I have thus far and the question I am trying to solve. If anyone can point me in the right direction I will be grateful.
Thanks
This is the question I am stuck at
4. Search Logs for mod_security-message which is access denied by mod_security
When Mod_Security identifies a problem with a request due to a security violation, it will do two things – 1) Add in some additional client request headers stating why mod_security is taking action, and 2) Log this data to the audit_log and error_log files. These error messages can be triggered by Mod_Security special checks such as the SecFilterCheckURLEncoding directive, basic filters such as “\.\.” to prevent directory traversals and advanced filters based on converted snort rules.
Search Logic: Search the audit_log entries that have the mod_security-message header, then sort the results, then only show unique entries with a total count of each type in reverse order from highest to lowest, then remove the mod_security-message data at the beginning of each line and list the Top 10 results.
Your output will be similar to:
1 51746 Pattern match "Basic" at HEADER.
2 6138 Pattern match "passwd\=" at THE_REQUEST.
3 5852 Pattern match "/search" at THE_REQUEST.
4 5368 Pattern match "passwd=" at THE_REQUEST.
5 4826 Pattern match "\.asp" at THE_REQUEST.
6 3694 Pattern match "login.icq.com" at THE_REQUEST.
7 1971 mod_security-message: Invalid character detected
8 1935 Pattern match "/smartsearch\.cgi" at THE_REQUEST.
9 1887 Pattern match "cmd\.exe" at THE_REQUEST.
10 1387 Pattern match "/sh" at THE_REQUEST.
Thanks
#!/usr/bin/perl
use File::Basename;
#------------------------------------------------------------------------------#
# Global variables that control the program action and output. #
#------------------------------------------------------------------------------#
$NUM_RECS_TO_PRINT = 10; # num of output recs to print per section
#---------------------------------------------------------------------#
# Change this array to include index filenames used on your system. #
#---------------------------------------------------------------------#
@indexFilenames = ('index.htm', 'index.html', 'index.shtml');
#----------------------------------------------------------------------#
# don't change anything below here unless you're comfortable with Perl #
#----------------------------------------------------------------------#
sub usage {
print STDERR "\n\tUsage: log2.pl access_log > output_file\n";
}
#----------------------------------------------------------#
# These are two helper routines for the 'sort' function. #
#----------------------------------------------------------#
sub fileNumericAscending {
$numFileRequests{$a} <=> $numFileRequests{$b};
}
sub fileNumericDescending {
$numFileRequests{$b} <=> $numFileRequests{$a};
}
sub trim($)
{
my $string = shift;
$string =~ s/^\s+//;
$string =~ s/\s+$//;
return $string;
}
#----------------------------<< main >>-----------------------------#
#--------------------------------------------------------------------#
# Start by making sure the user is invoking this program properly. #
#--------------------------------------------------------------------#
$numArgs = $#ARGV + 1;
if ($numArgs != 1) {
&usage;
exit 1;
}
$logFile = $ARGV[0];
open (LOGFILE,"access_log") || die " Error opening log file $logFile.\n";
#------------------------------------------------------------------#
# Start reading and processing the access_log file in this loop. #
#------------------------------------------------------------------#
#printf "<pre>\n";
while(<LOGFILE>)
{
if (/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/)
{
$REMOTE_IP{$1}++
}
#if (/\b[^(\s)]*)$|([^(]+?)\s*(\(.*\)/)
#(/([^(]+?)\s*(\(.*\)|\b[^(\s)]*)$/)
#(/(\b[^(\s)]*)$|([^(]+?)\s*(\(.*\))/)
#{
#$USER_AGENT{$1}++
#}
chomp;
#----------------------------------------------#
# condense one or more whitespace character #
# to one single space #
#----------------------------------------------#
s/\s+/ /go;
#----------------------------------------------------------#
# the next line breaks each line of the access_log into #
# nine variables #
#----------------------------------------------------------#
($clientAddress, $rfc1413, $username,
$localTime, $httpRequest, $statusCode,
$bytesSentToClient, $referer, $clientSoftware) =
/^(\S+) (\S+) (\S+) \[(.+)\] \"(.+)\" (\S+) (\S+) \"(.*)\" \"(.*)\"/o;
#--------------------------------------------------------------------#
# take care of problem where the $httpRequest may simply be a hyphen #
#--------------------------------------------------------------------#
next if ($httpRequest =~ '^-$');
#-----------------------------------------#
# Determine the value of $fileRequested #
#-----------------------------------------#
($getPost, $fileRequested, $junk) = split(' ', $httpRequest, 6);
($getPost, $clientAddress, $junk) = split(' ', $clientAddress, 1);
#-----------------------------------------------------------------#
# if the base filename is something like index.htm, index.html, #
# or index.shtml, interpret this to be the same as the path by #
# itself. This way, '/java/' is the same as '/java/index.html'. #
#-----------------------------------------------------------------#
foreach $indexFile (@indexFilenames) {
chomp($fileRequested);
$fileRequested = trim($fileRequested);
if ($fileRequested =~ /^\s+$/) {
next;
}
if ($fileRequested =~ /^$/) {
next;
}
if (basename($fileRequested) =~ /$indexFile/i) {
$fileRequested = dirname($fileRequested);
last;
}
}
#----------------------------------------------------------------#
# If the last character in $fileRequested is a '/', remove it. #
# This makes /perl/ equal to /perl. #
#----------------------------------------------------------------#
if (length($fileRequested) > 1)
{
if (substr($fileRequested,length($fileRequested)-1,1) eq '/')
{
chop($fileRequested);
}
}
#-----------------------------------------------------#
# here's where we count the number of hits per file #
#-----------------------------------------------------#
$numFileRequests{$fileRequested}++;
}#end first while loop
close (LOGFILE);
#--------------------------------------#
# Output the number IPs #
#--------------------------------------#
print "TOP $NUM_RECS_TO_PRINT IP ADDRESSES:\n";
print "-----------------------------\n\n";
$count=1;
foreach my $ip (sort {$REMOTE_IP{$b} <=> $REMOTE_IP{$a}} (keys(%REMOTE_IP))) {
last if ($count > $NUM_RECS_TO_PRINT);
print "$count\t$ip = $REMOTE_IP{$ip} \n";
$count++;
}
print "\n\n";
printf "</pre>\n";
#--------------------------------------#
# Output the number IPs #
#--------------------------------------#
print "TOP $NUM_RECS_TO_PRINT USER AGENTS:\n";
print "-----------------------------\n\n";
$count=1;
foreach my $agent (sort {$USER_AGENT{$b} <=> $USER_AGENT{$a}} (keys(%USER_AGENT))) {
last if ($count > $NUM_RECS_TO_PRINT);
print "$count\t$agent= $USER_AGENT{$agent} \n";
$count++;
}
print "\n\n";
printf "</pre>\n";
#--------------------------------------#
# Output the number of hits per file #
#--------------------------------------#
print "TOP $NUM_RECS_TO_PRINT CONNECT REQUESTS:\n";
print "-----------------------------\n\n";
$count=1;
foreach $key (sort fileNumericDescending (keys(%numFileRequests))) {
last if ($count > $NUM_RECS_TO_PRINT);
print "$count\t$numFileRequests{$key},$httpRequest{$key} \t\t $key\n";
$count++;
}
print "\n\n";
printf "</pre>\n";
open (LOGFILE,"audit_log") || die " Error opening log file $logFile.\n";
#printf "<pre>\n";
while (<LOGFILE>) {
if (/mod_security-message:.*\./)
{
$MOD_SEC{$1}++
}
}
close (LOGFILE);
#--------------------------------------#
# Output the number of hits per file #
#--------------------------------------#
print "TOP $NUM_RECS_TO_PRINT PATTERN MATCH:\n";
print "-----------------------------\n\n";
$count=1;
foreach my $modsec (sort {$MOD_SEC{$b} <=> $MOD_SEC{$a}} (keys(%MOD_SEC))) {
last if ($count > $NUM_RECS_TO_PRINT);
print "$count\t$agent= $MOD_SEC{$modsec} \n";
$count++;
}
print "\n\n";
printf "</pre>\n";
ASKER
thanks for the instructions but the main part I am stuck on is this. Everything else works except this.
It doesn't output the mod_security-message header
open (LOGFILE,"audit_log") || die " Error opening log file $logFile.\n";
#printf "<pre>\n";
while (<LOGFILE>) {
if (/mod_security-message:.*\./)
{
$MOD_SEC{$1}++
}
}
close (LOGFILE);
#--------------------------------------#
# Output the number of hits per file #
#--------------------------------------#
print "TOP $NUM_RECS_TO_PRINT PATTERN MATCH:\n";
print "-----------------------------\n\n";
$count=1;
foreach my $modsec (sort {$MOD_SEC{$b} <=> $MOD_SEC{$a}} (keys(%MOD_SEC))) {
last if ($count > $NUM_RECS_TO_PRINT);
print "$count\t$agent= $MOD_SEC{$modsec} \n";
$count++;
}
print "\n\n";
printf "</pre>\n";
What kind of line are you trying to match?
Try:
/mod_security-message[:].* \.
The following statement
if (/mod_security-message:.*\./)
needs to be changed to
if (/(mod_security-message:.*\.)/)
or better yet
if (/mod_security-message:(.*)\./)
The parentheses tell perl to put the value between them into $1.
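To make the capture-group point concrete, here is a minimal sketch of the counting step, assuming each audit_log entry carries the header on a single line. The sample lines and the count_messages helper are made up for illustration. Note also that the print statement in the posted script interpolates $agent rather than $modsec, so even with the capture fixed it would not show the message text.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count each distinct mod_security message. The parentheses capture the
# text after the header name into $1, which becomes the hash key.
sub count_messages {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        if ($line =~ /mod_security-message:\s*(.*\.)/) {
            $count{$1}++;
        }
    }
    return %count;
}

# Hypothetical sample lines standing in for the contents of audit_log.
my @sample = (
    'mod_security-message: Access denied with code 200. Pattern match "passwd=" at THE_REQUEST.',
    'mod_security-message: Access denied with code 200. Pattern match "passwd=" at THE_REQUEST.',
    'mod_security-message: Access denied with code 200. Pattern match "/search" at THE_REQUEST.',
    'Accept: */*',
);

my %count = count_messages(@sample);
my $rank  = 1;

# Highest count first; ties are broken alphabetically so the order is stable.
for my $msg (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    last if $rank > 10;
    print "$rank\t$count{$msg}\t$msg\n";   # print the key, not just its count
    $rank++;
}
```

In the real script the lines would come from while (<LOGFILE>) instead of the @sample array.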
Correctly stated by schubach
But you still need [:] instead of :
ASKER
Works great, only I don't necessarily need the "Access denied with code 200." part.
Also, I have this regex (/([^(]+?)\s*(\(.*\)|\b[^(\s)]*)$/) to extract the USER AGENT,
e.g. Mozilla/4.0(compatible;MSI E 6.0: Windows NT 5.1)
However, my script takes a very long time to complete when I include this regex in my code.
thanks again guys for your help
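On the first point above, one way to drop the "Access denied with code 200." part is to make it an optional, non-capturing prefix in front of the capture group. This is only a sketch: message_body is a hypothetical helper name, and it assumes the prefix always has the form "Access denied with code NNN.".

```perl
use strict;
use warnings;

# Return the message text with the header name and the optional
# "Access denied with code NNN." prefix removed; nothing if no header.
sub message_body {
    my ($line) = @_;
    return unless $line =~
        /mod_security-message:\s*(?:Access denied with code \d+\.\s*)?(.*)/;
    return $1;
}

print message_body(
    'mod_security-message: Access denied with code 200. Pattern match "passwd=" at THE_REQUEST.'
), "\n";
# prints: Pattern match "passwd=" at THE_REQUEST.
```

The returned string can then be used as the hash key when counting, so entries that differ only in the prefix are grouped together.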
Try:
/([^(]+)\s*([^;]*);([^:]*) \W*([^)]*) /
If it is not what you want, please tell me what you need to extract
ASKER
This is what I am looking for
example:
Mozilla/4.0(compatible;MSI E 6.0: Windows NT 5.1)
This is the text to be parsed?
What do you want to extract from it?
ASKER
I want to extract the following string from apache access_log
What is user’s browser type? Ex: Mozilla/4.0(compatible;MSI E 6.0: Windows NT 5.1)
OK, I was thinking of it as the starting point, but I need a sample Apache access_log.
If you have access to the POSIX::Regex CPAN module (since you're a student I'm not sure if you have permission to install specific CPAN modules on your box), then try the example POSIX regex found here: http://www.texsoft.it/index.php?m=sw.php.useragent. You can run this from the command line
perl -e 'use POSIX::Regex';
to see if it is installed. Google is my friend.
ASKER
The regex you gave me, /([^(]+)\s*([^;]*);([^:]*) \W*([^)]*) /, extracts
221.233.65.147 - - [13/Mar/2004:10:13:44 -0500] "CONNECT register.livesupportonthenet.com:443 HTTP/1.0" 200 - "-" "Mozilla/4.0 = 20
The original regex that I have is if (/^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/) and extracts
compatible; MSIE 6.0; Windows NT 5.1) Opera 7.21 [
The output is similar to what I am looking for, but it takes forever when I run it.
I uploaded a sample access_log.
Please try this:
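#!/usr/bin/perl
use strict;
use warnings;
# Print the last double-quoted field (the user agent) of each log line.
open(my $fi, '<', 'log.txt') or die "Cannot open log.txt: $!\n";
while (<$fi>) {
chomp;
if (/["].*["].*["].*["].*["](.*)["]$/)
{
print "$1\n";
}
}
close($fi);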
Your example log file gives this as output from my code:
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)
Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)
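Building on that idea, here is a rough sketch that counts the user agents instead of printing each one. It assumes the combined log format, where the user agent is the last double-quoted field, and that the field contains no embedded double quotes. The user_agent helper and the sample lines are made up for illustration. Because the pattern is anchored at the end of the line and uses [^"]* rather than .*, there is very little for the engine to backtrack over, which may help with the slowness mentioned earlier.

```perl
use strict;
use warnings;

# Return the last double-quoted field of a log line (the user agent in
# the combined log format), or undef when the line has no such field.
sub user_agent {
    my ($line) = @_;
    return $line =~ /"([^"]*)"\s*$/ ? $1 : undef;
}

# Hypothetical sample lines; real input would come from the access_log.
my @sample = (
    '10.0.0.1 - - [13/Mar/2004:10:13:44 -0500] "GET / HTTP/1.0" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '10.0.0.2 - - [13/Mar/2004:10:13:45 -0500] "GET /a HTTP/1.0" 200 128 "-" "Mozilla/4.0 (compatible; MSIE 5.5; Windows 98)"',
    '10.0.0.3 - - [13/Mar/2004:10:13:46 -0500] "GET /b HTTP/1.0" 200 64 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"',
);

my %agents;
for my $line (@sample) {
    my $ua = user_agent($line);
    $agents{$ua}++ if defined $ua;
}

# List the ten most frequent user agents, highest count first.
my $rank = 1;
for my $ua (sort { $agents{$b} <=> $agents{$a} || $a cmp $b } keys %agents) {
    last if $rank > 10;
    print "$rank\t$ua = $agents{$ua}\n";
    $rank++;
}
```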
ASKER
Thanks. That worked great as well. Regular expressions can be somewhat challenging.
This one seems a bit tricky: I have to extract brute-force attacks from the audit log, for example:
attacker (24.168.72.174) was trying to login using username: exodus, password: HELL
username: exodus9971, password: christ
this is a sample of the audit_log
========================================
Request: 24.168.72.174 - - [Tue Mar 9 22:27:46 2004] "GET http://sbc2.login.dcn.yahoo.com/config/login?.redir_from=PROFILES?&.tries=1&.src=jpg&.last=&promo=&.intl=us&.bypass=&.partner=&.chkP=Y&.done=http://jpager.yahoo.com/jpager/pager2.shtml&login=exodusc&passwd=HELL HTTP/1.0" 200 566
Handler: proxy-server
Error: mod_security: pausing [http://sbc2.login.dcn.yahoo.com/config/login?.redir_from=PROFILES?&.tries=1&.src=jpg&.last=&promo=&.intl=us&.bypass=&.partner=&.chkP=Y&.done=http://jpager.yahoo.com/jpager/pager2.shtml&login=exodusc&passwd=HELL] for 50000 ms
----------------------------------------
GET http://sbc2.login.dcn.yahoo.com/config/login?.redir_from=PROFILES?&.tries=1&.src=jpg&.last=&promo=&.intl=us&.bypass=&.partner=&.chkP=Y&.done=http://jpager.yahoo.com/jpager/pager2.shtml&login=exodusc&passwd=HELL HTTP/1.0
Accept: */*
Accept-Language: en
Connection: Keep-Alive
mod_security-message: Access denied with code 200. Pattern match "passwd=" at THE_REQUEST.
mod_security-action: 200
HTTP/1.0 200 OK
Connection: close
I tried the following regex but it only returned 1 = 3643818
if (/(\|||system\(|eval\(|`|\ \)/i)
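For what it's worth, the regex as shown appears to contain an empty alternative (the || after \|). An empty alternative matches every line with an empty $1, which may be why everything was counted under a single key. Below is a rough sketch of a different approach, assuming each attempt shows up on a "Request:" line with login= and passwd= query parameters, as in the sample above. The login_attempt helper is a made-up name and the sample URL is shortened.

```perl
use strict;
use warnings;

# Pull the client IP plus the login and passwd query parameters out of an
# audit_log "Request:" line; returns an empty list if the line does not fit.
sub login_attempt {
    my ($line) = @_;
    return unless $line =~ /^Request:\s+(\d{1,3}(?:\.\d{1,3}){3})\s/;
    my $ip = $1;
    my ($user) = $line =~ /[?&]login=([^&\s"]*)/;
    my ($pass) = $line =~ /[?&]passwd=([^&\s"]*)/;
    return unless defined $user && defined $pass;
    return ($ip, $user, $pass);
}

# Shortened version of the sample Request line above.
my $sample = 'Request: 24.168.72.174 - - [Tue Mar 9 22:27:46 2004] '
           . '"GET http://sbc2.login.dcn.yahoo.com/config/login?.tries=1'
           . '&login=exodusc&passwd=HELL HTTP/1.0" 200 566';

if (my ($ip, $user, $pass) = login_attempt($sample)) {
    print "attacker ($ip) was trying to login using "
        . "username: $user, password: $pass\n";
}
```

To group attempts per attacker, the three values could be pushed onto a hash of arrays keyed by IP.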
ASKER
Ok. I will open a new post with a more detailed explanation.
ASKER
excellent feedback
use warnings;
use strict;
This will help you troubleshoot and enforce good practices.
Line 39 looks wrong
sub trim($)
It should not have ($)
It is fine to use File::Basename but not needed. You can use a simple command like
$filename =~ s/\..*//;
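As a small illustration of these suggestions, here is a sketch of trim written without the ($) prototype and run under strict and warnings. The sample path is made up. It also prints what basename and the substitution above each return, since the substitution removes everything from the first dot onward and so may not be a drop-in replacement for basename in every case.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

# Strip leading and trailing whitespace; no ($) prototype is needed.
sub trim {
    my ($string) = @_;
    $string =~ s/^\s+|\s+$//g;
    return $string;
}

my $path = trim("  /java/index.html \n");
print "trimmed:  [$path]\n";                 # [/java/index.html]
print "basename: ", basename($path), "\n";   # index.html

# Copy the path, then remove everything from the first dot onward.
(my $stripped = $path) =~ s/\..*//;
print "stripped: $stripped\n";               # /java/index
```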