We help IT Professionals succeed at work.

Parse the Tomcat error log to get a range of time period of the repeated error

1,765 Views
Last Modified: 2012-08-13
Please reference my original question below:
https://www.experts-exchange.com/Programming/Languages/Scripting/Perl/Q_23748402.html

I need to parse the original tomcat error log and generate a nice report for repeated error messages.

   Originally, I use "grep -A1 ERROR <tomcat.error.log>" to grep the line contain keyword ERROR and the following line.
But some tomcat error log file have two continuous lines contain keyword  "ERROR" such as
---------------
2008-08-08 16:32:22,649 [ERROR] http-8443-Processor24 -- Not flushing: pub has not been authenticated
2008-08-08 18:36:25,847 [ERROR] http-8443-Processor24 -- InputParser error: ServiceException:<status><code>5</code><msg>Servers are busy.  Please resend the content</msg></status>
2008-08-08 18:36:25,847 [ERROR] http-8443-Processor24 -- [REQ_264]ServiceException:<status><code>5</code><msg>Servers are busy.  Please resend the content</msg></status>
        at com.opinmind.inputservice.InputParser.endElement(InputParser.java:283)
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:633)
        at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanEndElement(XMLDocumentFragmentScannerImpl.java:1241)
-------------------
   So I need this script can do
1. Parse the original tomcat.error.log and grep the line with keyword "ERROR" and its following line if the following line doesn't not contain the keyword "ERROR".
2. Those error messages with keyword "ERROR" is in the format like "<sentence1> -- <sentence2>". <sentence2> is the main error message I need and it could be repeated in the log file. I need to parse this log to the following format:
=============================
First happen : 2008-08-08 16:32:22
Last Error message:
2008-08-08 19:32:22,649 [ERROR] http-8443-Processor24 -- Not flushing: pub has not been authenticated
Count (20)
-----------------------------
First happen: 2008-08-08 16:33:22
Last Error message:
2008-08-08 22:36:25,847 [ERROR] http-8443-Processor24 -- [REQ_264]ServiceException:<status><code>5</code><msg>Servers are busy.  Please resend the content</msg></status>
        at com.opinmind.inputservice.InputParser.endElement(InputParser.java:283)
Count (21)
-----------------------------
...
========================

Basically, this script can report the following information for each unique error message
1. First happen time in the log
2. The last happen time in the log with the line contain ERROR and the following line if the following line doesn't contain keyword "ERROR"
3. The count of the same type of error message
4. each type of error message is separated by "--------------------" or "==================" for easy reading.

Original script to parse tomcat.error.log.1
grep -A1 ERROR tomcat.error.log.1 | script.pl
-----< script.pl >---------
#!/usr/bin/perl
use strict;
use warnings;
 
my %ErrorString;
my %ErrorCount;
local $/ = "--\n";
while(<>) {
        s/-*$//;
 
        next unless /(.*?)\s+--\s+(.*)[\r\n]+(.*)/m;
 
        $ErrorCount{$2}++;
        $ErrorString{$2} = "$1 -- $2\n$3"  # store the last repeated entry
#        $ErrorString{$2} = "$1 -- $2\n$3" unless $ErrorString{$2}; # store the first repeated entry
}
 
foreach (keys %ErrorString) {
        print "$ErrorString{$_} -- Count ($ErrorCount{$_})\n";
        print "------------------------------\n";
}

Open in new window

Comment
Watch Question

Top Expert 2009

Commented:
Attach the error log.
Top Expert 2011

Author

Commented:
The tomcat log is attached in gzip format.
Top Expert 2011

Author

Commented:
The tomcat log is attached in zip format
tomcat.error.zip

Commented:
Instead of using grep I would pipe(or actually open the file in perl) the whole logfile to perl and process it line by line.
Algorithm would be simple:

if grep (/ERROR/, $line){
push(@errors, $line);
}
else if (grep (/DEBUG/,$line ) or grep (/DEBUG/,$line) or   ) {
 next; #previous error had only one line or there was no error in previous line.
}
else{
push(@erros, $line); #append rest of previous error
}
Top Expert 2011

Author

Commented:
Great. Could you put it into the whole perl script?

Commented:
Here you go:
#!/bin/perl
 
use strict;
use warnings;
 
if (!defined($ARGV[0])){
  print STDERR "Usage: processLog.pl log_file_name\n";
  exit(1);
}
 
 
my @errors = ();
 
open IH ,"<$ARGV[0]" or die "Cannot open file";
my @lines = <IH>;
 
 
foreach my $line (@lines){
 
  if ($line =~/ERROR|FATAL/){ #this is error
   push(@errors,$line);
  }
  if ($line =~/WARN|DEBUG|INFO/){ #this will be ignored
    next;
  }else{   #this means continuation of previous error
    push(@errors, $line);
  }
}
print @errors;

Open in new window

Commented:
put close IH; in line 16. I forgot about it ;-)
Top Expert 2011

Author

Commented:
Thanks for the script. However, it shows more than one line after the line contains ERROR.

I want to grep only one line after the line containing ERROR if the next line doesn't contain ERROR.

Commented:
This should do what you want:
#!/bin/perl
 
use strict;
use warnings;
 
if (!defined($ARGV[0])){
  print STDERR "Usage: processLog.pl log_file_name\n";
  exit(1);
}
 
 
my @errors = ();
 
open IH ,"<$ARGV[0]" or die "Cannot open file";
my @lines = <IH>;
close IH;
 
my $errFound = 0;
foreach my $line (@lines){
 
  if ($line =~/ERROR|FATAL/){ #this is error
   push(@errors,$line);
   $errFound = 1;
   next;
  }
  if ($line =~/WARN|DEBUG|INFO/){ #this will be ignored
    $errFound = 0;
    next;
  }else{   #this means continuation of previous erro
    if ($errFound){
      push(@errors, $line);
      $errFound = 0;
    }
  }
}
print @errors;

Open in new window

Top Expert 2009
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
Top Expert 2011

Author

Commented:
Great.

Adam314,

For part 2, your scripts doesn't print the following line to the Last Error message when there are two lines result for the one error message.

For example:
# script1.pl  tomcat.error.log
-----------
...
2008-09-10 18:09:26,913 [FATAL] http-8080-Processor19 -- org.xml.sax.SAXParseException: Premature end of file.
        at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
...
--------------

# # script1.pl  tomcat.error.log | script2.pl
--------------------
First happen: 2008-08-08 16:32:22
Last Error message:
2008-09-10 18:09:26,913 [FATAL] http-8080-Processor19 -- org.xml.sax.SAXParseException: Premature end of file.
Count (36)
--------------------
where
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
is missing after script2.pl

By the way, I changed line 8 and line 29 to /ERROR|FATAL/ to get the FATAL error message.
Top Expert 2009

Commented:

my (%first, %last, %count, $lasterr);
while(<>) {
    if(/ERROR/ and /([\d\- :]+)(.*?)\s*--\s*(.*)/) {
        $lasterr=$3;
        $first{$3} = $1 unless $first{$3};
        $last{$3} = "$1$2";
        $count{$3}++;
    }
    else {
        $last{$lasterr} .= $_;
    }
}
 
...

Open in new window

Top Expert 2011

Author

Commented:
Thanks Adam314. I found the $last{$_} display order is not right.
-------------------
First happen: 2008-09-10 18:09:26
Last Error message:
2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19   at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1269)
 -- [REQ_13669]org.xml.sax.SAXParseException: Premature end of file.
Count (1)
-------------------

It should looks like
--------------------
First happen: 2008-09-10 18:09:26
Last Error message:
2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19 -- [REQ_13669]org.xml.sax.SAXParseException: Premature end of file.
  at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1269)
 
Count (1)
-------------------
Top Expert 2009

Commented:
Updated script 2...
#!/usr/bin/perl
use strict;
use warnings;
 
my (%first, %last, %count, $lasterr);
my $fullerror;
 
while(<>) {
	if(/ERROR/) {
		ProcessFullError() if $fullerror;
		$fullerror = $_;
	}
	else {
		$fullerror .= $_;
	}
}
ProcessFullError();
 
sub ProcessFullError {
	die "Unknown line\n" unless $fullerror =~ /([\d\-\ :]+)(.*?)\s*--\s*(.*)$/s;
	$first{$3} = $1 unless $first{$3};
	$last{$3} = "$1$2";
	$count{$3}++;
}
 
foreach (sort keys %first) {
        print "First happen: $first{$_}\n";
        print "Last Error message:\n$last{$_} -- $_\n";
        print "Count ($count{$_})\n";
        print (("-"x20)."\n");
}

Open in new window

Top Expert 2011

Author

Commented:
Hi Adam314,

Thank you again.

I test on the other tomcat log file and and it seems not showing the same error once for repeated error.

I have quick question about your second script.
1. How to I modify your part 2 script to show two following lines after the found the line with keyword ERROR or FATAL
2. How to I sort the report by Count number? So biggest count number error messages will display on top.

The attached file is another tomcat error log file which the repeated error message on $3 portion are count separately.
Top Expert 2011

Author

Commented:
another tomcat error log file
tomcat.error.zip
Top Expert 2009

Commented:
1.  If the line contains ERROR, it is assumed to be the start of a new error.  The previous error line (which may be several lines) is processed, and this line is saved as the error line.  If the line does not contain ERROR, this line is assumed to be a continuation of the previous error line.

2. Change this on line 26:
    foreach (sort keys %first) {
To this:
    foreach (sort {$Count{$a} <=> $Count{$b}} keys %first) {


>>I test on the other tomcat log file and and it seems not showing the same error once for repeated error.
I'm not sure what you mean by this.  Is the output not correct?  If not, provide details about what it should be.
Top Expert 2011

Author

Commented:
Thanks for the explanation.
1. After change to  foreach (sort {$Count{$a} <=> $Count{$b}} keys %first) {
I got:
-------------
Global symbol "%Count" requires explicit package name at ./ParseErrLog.pl line 26.
Global symbol "%Count" requires explicit package name at ./ParseErrLog.pl line 26.
-----------

2. For tomcat.error.zip,
# script1.pl  tomcat.error.log.1 | script2.pl
==============
First happen: 2008-09-21 05:20:11,486
Last Error message:
2008-09-21 05:20:11,486 [ERROR] http-8080-Processor25 -- Error in serving ImpressionTracking:
java.lang.Exception: invalid line item: nt=1&li=0&cr=1245003&sli=no_code_specified&pr=0.0000

Count (1)
--------------------
First happen: 2008-09-21 05:20:11,218
Last Error message:
2008-09-21 05:20:11,218 [ERROR] http-8080-Processor13 -- Error in serving ImpressionTracking:
java.lang.Exception: invalid line item: nt=1&li=0&cr=1245009&sli=no_code_specified&pr=0.0000

Count (1)
--------------------
First happen: 2008-09-21 05:20:10,579
Last Error message:
2008-09-21 05:20:10,579 [ERROR] http-8080-Processor13 -- Error in serving ImpressionTracking:
java.lang.Exception: invalid line item: nt=1&li=0&cr=1245013&sli=no_code_specified&pr=0.0000

Count (1)
--------------------
First happen: 2008-09-21 05:20:09,225
Last Error message:
2008-09-21 05:20:09,225 [ERROR] http-8080-Processor13 -- Error in serving ImpressionTracking:
java.lang.Exception: invalid line item: nt=1&li=0&cr=1245017&sli=no_code_specified&pr=0.0000

Count (1)
--------------------
First happen: 2008-09-21 05:20:31,741
Last Error message:
2008-09-21 05:20:31,741 [ERROR] http-8080-Processor25 -- Error in serving ImpressionTracking:
java.lang.Exception: invalid line item: nt=1&li=0&cr=1245024&sli=no_code_specified&pr=0.0000

Count (1)
--------------------
First happen: 2008-09-21 05:20:27,952
Last Error message:
2008-09-21 05:20:27,952 [ERROR] http-8080-Processor15 -- Error in serving ImpressionTracking:
java.lang.Exception: invalid line item: nt=1&li=0&cr=1245026&sli=no_code_specified&pr=0.0000
================

Those error meessage is the same type -- "Error in serving ImpressionTracking:"
Top Expert 2011

Author

Commented:
I mean the second tomcat.error.zip  file I attached. Not the first one.
Also
=================
First happen: 2008-09-20 23:27:11
Last Error message:
2008-09-21 04:34:06,301 [ERROR] http-8080-Processor15 -- fromJSON()
java.lang.Exception: invalid OMCookie: #tokens != #tags: {"i":["183c"],"u":["MSN_b4a21894ded8d55a11798e4f8a0f1389","MSN_b4a21894ded8d55a11798e4f8a0f1389"],"t":["c1c"]}

Count (16)
--------------------
First happen: 2008-09-20 23:00:07
Last Error message:
2008-09-20 23:00:07,701 [ERROR] http-8080-Processor25 -- fromJSON()
java.lang.Exception: invalid OMCookie: #tokens != #tags: {"i":[],"u":["AIM_cbfa1d3f8c5c85ea3dd783908e926df9"],"t":[]}

Count (1)
--------------------
First happen: 2008-09-20 23:31:38
Last Error message:
2008-09-21 07:00:11,733 [ERROR] http-8080-Processor21 -- fromJSON()
org.json.JSONException: Unterminated string at character 220 of {"i":["411c","422c","439c","407c"],"u":["82d2dab3d119cd31cd3cffd02dffd48d","82d2dab3d119cd31cd3cffd02dffd48d","82d2dab3d119cd31cd3cffd02dffd48d","82d2dab3d119cd31cd3cffd02dffd48d"],"t":["451f8d","454067","4545ec","45431

Count (15)
====================
Those three have the same error message -- "fromJSON()".
It seems the error message end with special charactor ":" or ")" have some problem.
Top Expert 2011

Author

Commented:
I'm thinking the modify the script1.pl to get the output like
# script1.pl tomcat.error.log
==========
-----
2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19 -- Not flushing: pub has not been authenticated
-----
2008-09-11 03:18:16,201 [ERROR] http-8080-Processor2 -- Exception in redirect logic:
java.lang.NullPointerException
-----
2008-09-11 03:24:29,323 [ERROR] http-8080-Processor44 -- Exception in redirect logic:
java.lang.NullPointerException
-----
2008-09-21 06:54:11,485 [ERROR] http-8080-Processor13 -- fromJSON()
org.json.JSONException: Unterminated string at character 220 of {"i":["411c","422c","439c","407c"],"u":["82d2dab3d119cd31cd3cffd02dffd48d","82d2dab3d119cd31cd3cffd02dffd48d","82d2dab3d119cd31cd3cffd02dffd48d","82d2dab3d119cd31cd3cffd02dffd48d"],"t":["451f8d","454067","4545ec","45431
-----
2008-09-21 06:55:25,239 [ERROR] http-8080-Processor12 -- Servlet.service() for servlet PixelServlet threw exception
java.lang.StackOverflowError
-----
2008-09-21 06:55:50,475 [ERROR] http-8080-Processor15 -- Error in serving ClickTracking:
java.lang.Exception: invalid line item: nt=1&li=0&cr=0
-----
2008-09-21 06:56:55,101 [ERROR] http-8080-Processor13 -- Exception in pixel logic:
java.lang.Exception: Token String not specified: t=s&p=0&ct=1: 8.19.23.5
==========
So it is easy to separate the error message.
---script1.pl ---
#!/usr/bin/perl
use strict;
use warnings;
 
my $err=0;
while(<>) {
	if(/ERROR/) {
		print "-----\n";
		print $_;
		$err=1;
	}
	elsif($err) {
		print $_;
		$err=0;
	}
}

Open in new window

Top Expert 2011

Author

Commented:
> 1. How to I modify your part 2 script to show two following lines after the found the line with keyword ERROR or FATAL
I think the question should be how to modify both script1 and script2 (majorly scrip1) to print out 2 lines after keyword ERROR or FATAL.
Top Expert 2011

Author

Commented:
>    foreach (sort {$Count{$a} <=> $Count{$b}} keys %first) {
I fixed the case issue with
foreach (sort {$count{$a} <=> $count{$b}} keys %first) {

However, it sort ascend. How to sort descend.
Top Expert 2009

Commented:
To sort descend:
    foreach (sort {$count{$b} <=> $count{$a}} keys %first) {

For the error message - it currently uses both lines to determine if 2 errors are the same, not just the first line.  Should it only use the first line?
Top Expert 2011

Author

Commented:
Yes. Please use the the first line.
Top Expert 2009

Commented:
Updated script 2
#!/usr/bin/perl
use strict;
use warnings;
 
my (%first, %last, %count, $lasterr);
my @fullerror;
 
while(<>) {
	if(/ERROR/) {
		ProcessFullError() if @fullerror;
		@fullerror = ($_);
	}
	else {
		push @fullerror, $_;
	}
}
ProcessFullError();
 
sub ProcessFullError {
	die "Unknown line\n" unless $fullerror[0] =~ /([\d\-\ :]+)(.*?)\s*--\s*(.*)$/;
	$first{$3} = $1 unless $first{$3};
	$count{$3}++;
	$last{$3} = "$1$2 -- $3\n" . join('', @fullerror[1..$#fullerror]);
}
 
foreach (sort keys %first) {
        print "First happen: $first{$_}\n";
        print "Last Error message:\n$last{$_}";
        print "Count ($count{$_})\n";
        print (("-"x20)."\n");
}

Open in new window

Top Expert 2011

Author

Commented:
Thanks you very much.

One more condition:
==============
First happen: 2008-09-10 18:09:26
Last Error message:
2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19 -- [REQ_13669]org.xml.sax.SAXParseException: Premature end of file.
  at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1269)
Count (1)
==============

Could we strip out [REQ_13669] or [REQ_.*] when compare the error message? Because the rest of error messages are the same.

Thanks again.
Top Expert 2009

Commented:
sub ProcessFullError {
        $fullerror[0] =~ s/\[REQ_\d+\]//;
        die "Unknown line\n" unless $fullerror[0] =~ /([\d\-\ :]+)(.*?)\s*--\s*(.*)$/;
        ...
}
Top Expert 2011

Author

Commented:
Thanks.

I found after insert "$fullerror[0] =~ s/\[REQ_\d+\]//;" and the sorting is not based on the count.
Here is the output
==============
First happen: 2008-08-19 05:42:36
Last Error message:
2008-08-19 05:42:36,433 [ERROR] http-8443-Processor24 -- Entry is not valid: <entry><uid>f65e4d6c26a233ddcfd46a32127d3522</uid><ts>1219037344000</ts><pub>200</pub><dt>8</dt></entry>
Count (1)
--------------------
First happen: 2008-08-19 21:06:32
Last Error message:
2008-09-29 19:26:49,060 [ERROR] http-8080-Processor105 -- Exception in redirect logic:
java.lang.NullPointerException
Count (281)
--------------------
First happen: 2008-08-08 18:36:25
Last Error message:
2008-08-08 18:41:56,138 [ERROR] http-8443-Processor25 -- InputParser error: ServiceException:<status><code>5</code><msg>Servers are busy.  Please resend the content</msg></status>
Count (86)
--------------------
First happen: 2008-08-08 16:32:22
Last Error message:
2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19 -- InputParser error: org.xml.sax.SAXParseException: Premature end of file.
Count (36)
--------------------
First happen: 2008-08-08 16:32:22
Last Error message:
2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19 -- Not flushing: pub has not been authenticated
Count (36)
--------------------
First happen: 2008-08-08 18:36:25
Last Error message:
2008-08-08 18:41:56,139 [ERROR] http-8443-Processor25 -- ServiceException:<status><code>5</code><msg>Servers are busy.  Please resend the content</msg></status>
        at com.opinmind.inputservice.InputParser.endElement(InputParser.java:283)
Count (86)
--------------------
First happen: 2008-09-05 16:52:19
Last Error message:
2008-09-05 16:52:19,874 [ERROR] http-8080-Processor73 -- java.lang.Exception: Unknown request: DOCS_DISCOVERED
        at com.opinmind.status.StatusMonitorServlet.service(StatusMonitorServlet.java:343)
Count (1)
.
.
.
First happen: 2008-09-05 16:52:23
Last Error message:
2008-09-05 16:52:23,909 [ERROR] http-8080-Processor27 -- java.lang.Exception: Unknown request: LEAD_DISCOVERED
        at com.opinmind.status.StatusMonitorServlet.service(StatusMonitorServlet.java:343)
Count (1)
--------------------
First happen: 2008-08-21 09:38:07
Last Error message:
2008-09-01 21:02:29,698 [ERROR] http-8443-Processor23 -- java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method)
Count (22)
--------------------
First happen: 2008-08-08 16:32:22
Last Error message:
2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19 -- org.xml.sax.SAXParseException: Premature end of file.
        at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1269)
Count (36)
=================
Top Expert 2011

Author

Commented:
The sorting on count behavior changed since you change $fullerror to @fullerror (line 6, 10, 11 and 14).

Could you explain a little bit for your script?

Thanks.
Top Expert 2009
Commented:
This one is on us!
(Get your first solution completely free - no credit card required)
UNLOCK SOLUTION
Top Expert 2011

Author

Commented:
Thanks a lot.
If I want to grep two keywords, say ERROR and FATAL, can I just modify
if (/ERROR/)
to
if (/ERROR|FATAL/)
on script1 and script2?
Top Expert 2011

Author

Commented:
I modify to if (/ERROR/FATAL/) on script1, then I got 36 FATAL error messages.
But piped to script2 dosn't not show any FATAL error message.
Do I miss something.
Top Expert 2009

Commented:
Update both script1 and script2:
    if(/(ERROR|FATAL)/)
Top Expert 2011

Author

Commented:
I did. But after script2, I don't see any FATAL message.
Top Expert 2009

Commented:
The second log file you posted doesn't have any FATAL messages in it.

The first log you posted does, but for all of the FATAL messages, there is a more recent ERROR message, so only the ERROR is posted.
eg:
    line 71 has FATAL with this error:  org.xml.sax.SAXParseException: Premature end of file.
    line 10412 has ERROR with the same error message
    The First Happen comes from the FATAL on line 71: 2008-08-08 16:32:22
    The Last Error message comes from the ERROR on line 10412:
        2008-09-10 18:09:26,928 [ERROR] http-8080-Processor19 -- org.xml.sax.SAXParseException: Premature end of file.
Top Expert 2011

Author

Commented:
I got it.
Thanks a lot.
Unlock the solution to this question.
Join our community and discover your potential

Experts Exchange is the only place where you can interact directly with leading experts in the technology field. Become a member today and access the collective knowledge of thousands of technology experts.

*This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

OR

Please enter a first name

Please enter a last name

8+ characters (letters, numbers, and a symbol)

By clicking, you agree to the Terms of Use and Privacy Policy.