Solved

Looking for a method to take a raw log file and format it into HTML

Posted on 2003-11-06
Last Modified: 2011-10-03
Hi

My question is, I would like to find a way to take a log file on a unix server and have a script that parses that log file and pipes it into HTML for easier viewing. The format that the log file takes currently looks like this:

servername MEMORY.MEMORY.MEMPageFreed ALARM:value=285 at 11/04/2003 13:17:23  --  cse1
servername MEMORY.MEMORY.MEMPageFreed WARN:value=86 at 11/04/2003 13:18:06  --  cse1
servername FILESYSTEM.root.FSCapacity ALARM:value=96 at 11/04/2003 13:18:30  --  ucse1
servername FILESYSTEM.stage.FSCapacity ALARM:value=98 at 11/04/2003 13:18:26  --  cse1

The four log entries above each record a threshold being reached for a server parameter, for example when a filesystem exceeds 95% capacity. The log file is generated by an application called Patrol, which monitors all systems for a multitude of parameters and warns or alarms when they reach certain values. Without getting any further off the subject, is there a way to read that file and output it to HTML on that server?

To slightly complicate things, these values are generated in real time and could be written to the log seconds or minutes apart, depending on when a threshold is reached. Therefore, the script would need to either read from the file continually or be triggered whenever the log file is updated.

Would it also be possible to parse the file in such a way that only the lines in the file where the value amount is, say, greater than 95 for example be viewable in the report? I am a real beginner with Perl programming and programming in general so any help on this would be greatly appreciated.

Thanks.

Question by:stevecamp
 
LVL 2

Expert Comment

by:rootkiddy
ID: 9695895
Here's something to get you started.


# Begin Parsing code.

# Open command is for opening flat file.  You may consider getting data
# in another way such as one script to dump to a database through a cron job or daemon
# while the other script used to read in the information.
# Also note that this is a quick and dirty script that can be cleaned up a bunch.  This
# is to get some ideas going in your head.
open(INPUT, "data.txt") or die "can't open file\n";
@contents=<INPUT>;
close(INPUT);

for($i=0; $i<@contents; $i++){
   # strip out the date so we can just split on spaces.
   $contents[$i] =~ s/ at (\d+\/\d+\/\d+ \d+\:\d+\:\d+)//g;
   $date[$i] = $1;

   # split apart remaining fields and store the threshold information temporarily for future parsing.
   ($servername[$i],$alert[$i],$thresholdtemp,$whatevervalue[$i],$lastfield[$i],$junk[$i]) = split(" ",$contents[$i]);

   # split apart the name = value pair.
   ($thname[$i],$thvalue[$i]) = split("=",$thresholdtemp);

   print "servername = $servername[$i]\n";
   print "alert = $alert[$i]\n";
   print "threshold = $thvalue[$i]\n";
   print "not sure = $whatevervalue[$i]\n";
   print "Last field = $lastfield[$i]\n";
}

# End parsing code.

Now that you have the threshold values stored in the array @thvalue, all you need to do is find the values greater than the limit you want.  If this is a large environment being monitored, you may consider parsing these logs every so often and dumping them to a database for querying.  If only minimal alerting goes on, a quick and dirty method may be to have a CGI script open this log file and read it dynamically.

For outputting HTML there are modules to help with it, or you can simply use print or a similar call.  Here's a snippet of code that may help you get started, assuming the variables from above were used.

print "Content-type: text/html\n\n";
print "<html><head></head><body>\n";
print "<table>\n";
print "<tr><td>ServerName</td><td>Threshold Name</td><td>Threshold Value</td></tr>\n";
for ($i=0; $i<@servername; $i++) {
   print "<tr>\n";
   print "<td>$servername[$i]</td>\n";
   print "<td>$thname[$i]</td>\n";
   print "<td>$thvalue[$i]</td>\n";
   print "</tr>\n";
}
print "</table></body></html>\n";
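
The step of keeping only values above a limit is described above but not shown. Here is a minimal sketch of one way it might look, working straight from the raw log lines rather than the arrays. The names $min_value, html_escape and row_for_line are made up for this illustration, and the regex assumes every line follows the "server parameter STATE:value=N" layout from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical cutoff; only rows with a value above it are shown.
my $min_value = 95;

# Escape the characters that are special in HTML.
sub html_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Turn one log line into a table row.  Returns nothing when the line
# does not match the assumed layout or the value is not above the cutoff.
sub row_for_line {
    my ($line, $cutoff) = @_;
    my ($server, $param, $state, $value) =
        $line =~ /^(\S+)\s+(\S+)\s+(\w+):value=(\d+)/
        or return;
    return if $value <= $cutoff;
    my @cells = map { html_escape($_) } ($server, $param, $state, $value);
    return "<tr><td>" . join("</td><td>", @cells) . "</td></tr>\n";
}

# Sample lines copied from the question.
my @sample = (
    'servername MEMORY.MEMORY.MEMPageFreed ALARM:value=285 at 11/04/2003 13:17:23  --  cse1',
    'servername MEMORY.MEMORY.MEMPageFreed WARN:value=86 at 11/04/2003 13:18:06  --  cse1',
    'servername FILESYSTEM.root.FSCapacity ALARM:value=96 at 11/04/2003 13:18:30  --  ucse1',
);

print "<table>\n";
print "<tr><td>ServerName</td><td>Parameter</td><td>State</td><td>Value</td></tr>\n";
print row_for_line($_, $min_value) for @sample;
print "</table>\n";
```

With the sample lines this should print rows for the 285 and 96 entries and skip the 86 one. Escaping the fields is a precaution in case a parameter name ever contains < or &.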

Author Comment

by:stevecamp
ID: 9696575
Thank you. My other question would be, if in the log file there are say 1000 records. Let's also assume that about 500 of those records contain the words "itcc.itcc" in the second field. Those are the records I want to keep and discard the rest. How would I strip out all records EXCEPT those that contain "itcc.itcc" in the second field? I know how to remove lines based on them containing a certain criteria, but not how to get rid of all records that DON'T match a criteria. Can you help me with that too?

Thanks.
 
LVL 2

Expert Comment

by:rootkiddy
ID: 9696911
You can do this several ways.  One method is to stop using $i as the index into the new arrays and to test each line before assigning its values to them.  For example:

my $count = 0;
for($i=0; $i<@contents; $i++){
   if ($contents[$i] !~ /^\S+\sitcc\.itcc/) {
      next;
   }
   # change the rest of the $i's to $count.
   
   #Before ending the for loop increment $count.
   $count++;
}

Another way is to keep all of the code that parses the logs and, as mentioned before, maybe dump the results into a database.  Keeping every parsed record gives you the flexibility to report on things other than itcc.itcc, as business requirements seem to always change.  Since the original code already holds all the information, you would just need to remove these lines, which were only debugging output.

   print "servername = $servername[$i]\n";
   print "alert = $alert[$i]\n";
   print "threshold = $thvalue[$i]\n";
   print "not sure = $whatevervalue[$i]\n";
   print "Last field = $lastfield[$i]\n";

And then in the script you would want to do something similar to this printing of html per original example.

for($i=0; $i<@servername; $i++) {
   if ($alert[$i] !~ /itcc\.itcc/) {
      next;
   }
   # Now do all your printing of html.
}
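
A shorter way to keep only the matching records is Perl's built-in grep, which returns the list elements for which a test is true. This is just a sketch: keep_by_second_field is a name invented here, and the sample lines are made up to have the same shape as the log in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return only the lines whose second whitespace-separated field
# matches the given pattern.  Lines with fewer than two fields are dropped.
sub keep_by_second_field {
    my ($pattern, @lines) = @_;
    return grep {
        my $field = (split ' ', $_)[1];
        defined $field && $field =~ $pattern;
    } @lines;
}

# Hypothetical sample records for illustration only.
my @lines = (
    "hostA itcc.itcc.Status ALARM:value=97 at 11/05/2003 09:00:00  --  cse1\n",
    "hostB FILESYSTEM.root.FSCapacity ALARM:value=96 at 11/05/2003 09:01:00  --  cse1\n",
    "hostC itcc.itcc.Queue WARN:value=50 at 11/05/2003 09:02:00  --  cse1\n",
);

my @kept = keep_by_second_field(qr/itcc\.itcc/, @lines);
print @kept;
```

Passing the pattern in as qr/.../ means the same function can later filter on something other than itcc.itcc without any change.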
 
LVL 48

Expert Comment

by:Tintin
ID: 9696946
Here's a version that continuously reads the log and outputs itcc.itcc entries with thresholds >95%

#!/usr/bin/perl
use strict;

my $log="/path/to/patrol/log";

open LOG, "/usr/bin/tail -f $log |" or die "Could not run tail command $!\n";

$|=1;   # Turn buffering off

print "Content-Type: text/html\n\n";

print <<EOF;
<html>
<head><title>Patrol Logs</title></head>
<body>
<pre>
EOF

while (<LOG>) {
  next unless /itcc\.itcc/;
  my ($threshold) = /value=(\d+)/;
  print if ($threshold > 95);
}


 
 
 

Author Comment

by:stevecamp
ID: 9701580
Ok it might be best if I post my script(s) and get an overall definite best method to do this. To clarify, there is one log file that is updated with events from Patrol on a continual basis. I need to do 2 main things. First, strip out all events that occurred before 11/5 (they are all dated 11/4 if that helps), strip out all records or events that contain the word OK or WARN in the 3rd field of each record and then spit that out to one log file, we'll call it Unix_Alarms.log. Then I need to do the same thing to the original log file again but I instead need to keep ALL records that contain the value "itcc.itcc" in the second field of each record and then spit that out to a second log file called "itcc_events.log". I think I can concentrate on other things like HTML viewing and granularity later. To summarise:

1) grab the data from one log file "patrol_events.log" and create 2 new log files from that data. The 2 new log files should have only records that contain the word ALARM in the 3rd field of the record (I guess that is easier than saying strip out all instances that contain OK or WARN). There are only 3 values in this field and I only want the records that contain ALARM.

2) The second log file should only contain a subset of those records, namely only the ones with "itcc.itcc" in the second field. So "itcc_events.log" is really just a subset of "unix_alarms.log".

3) add the date in the format of 11.7.2003 onto the end of each log file for historical purposes.

4) if possible, sort the records in the 2 new log files by system or alphabetically by the first field.

My current solution contains 2 scripts to do the above. I would really like to have one script to do all of the above. Here are the 2 scripts (basically identical) only the second one tries to remove all records apart from the itcc.itcc ones (unsuccessfully up to now!)

#!/usr/bin/perl

##### Variable Declarations #####

my $input_file = '/patrol/patrol7/oxhpscripts/patrolnotifyemail_3183.log';
my $itcc_output = '/patrol/patrol7/oxhpscripts/itcc_events_alarms.log';

print "Checking log file for ITCC Alarming systems...\n\n";
sleep 2;

open (INPUT_FILE, "$input_file") ||
  die ("Cannot open input file, $input_file, for reading.\n",
       "OS_ERROR: $!\n");
open (ITCC_OUTPUT, ">$itcc_output") ||
  die ("Cannot open output file, $itcc_output, for writing. \n",
       "OS_ERROR: $!\n");

my $header = "============================= Patrol Systems in Alarm State ============================\n\n\n";
my $columns = "System     Parameter                   Message             Date     Time         Grp\n";
print TEMP_OUTPUT $header;
print TEMP_OUTPUT $columns;

while( defined( my $line = <INPUT_FILE>) ) {
  next if ($line =~ /^\s*$/); # skip blank lines.
  my ($system, $param, $msg, $at, $date, $time, $dash, $method) = split(/\s+/, $line); # split
  next if ($msg =~ /WARN/); # eliminate non-required entries
 next if ($msg =~ /OK/);
  next if ($date =~ /04/);
  next unless ($param =~ /itcc.itcc/);
  my $sep = "----------------------------------------------------------------------------------------\n";
  my $final_output = sprintf "%-10s %-20s %-15s %-2s %-8s %-8s %-3s %-5s %s\n", $system, $contents,
$msg, $at, $date, $time, $dash, $method;
  print ITCC_OUTPUT $sep, $final_output;
}

##### close open files #####
close INPUT_FILE;
close ITCC_OUTPUT;

sleep 2;
#print "Formatting of Patrol log file complete. Output file is $itcc_output.\n";
exit 0;

###########end of itcc_AlarmingSystems.pl###########

###########start of unix_AlarmingSystems.pl##########

#!/usr/bin/perl

##### Variable Declarations #####

my $input_file = '/patrol/patrol7/oxhpscripts/patrolnotifyemail_3183.log';
my $unix_alarms = '/patrol/patrol7/oxhpscripts/unix_thresholds.log';

print "Checking log file for Unix Alarming systems...\n\n";
sleep 2;

open (INPUT_FILE, "$input_file") ||
  die ("Cannot open input file, $input_file, for reading.\n",
       "OS_ERROR: $!\n");
open (UNIX_ALARMS, ">$unix_alarms") ||
  die ("Cannot open output file, $unix_alarms, for writing. \n",
       "OS_ERROR: $!\n");

my $header = "============================= Patrol Systems in Alarm State ============================\n\n\n";
my $columns = "System     Parameter                   Message             Date     Time         Grp\n";
print UNIX_ALARMS $header;
print UNIX_ALARMS $columns;

while( defined( my $line = <INPUT_FILE>) ) {
  next if ($line =~ /^\s*$/); # skip blank lines.
  my ($system, $param, $msg, $at, $date, $time, $dash, $method) = split(/\s+/, $line); # split
  next if ($msg =~ /WARN/); # eliminate non-required entries
  next if ($msg =~ /OK/);
  next if ($date =~ /04/);
  next if ($param =~ /itcc.itcc/);
  my $sep = "----------------------------------------------------------------------------------------\n";
  my $final_output = sprintf "%-10s %-20s %-15s %-2s %-8s %-8s %-3s %-5s %s\n", $system, $param, $msg, $at, $date, $time, $dash, $method;
  print UNIX_ALARMS $sep, $final_output;
}

##### close open files #####
close INPUT_FILE;
close UNIX_ALARMS;

sleep 2;
#print "Formatting of Patrol log file complete. Output file is $unix_alarms.\n";
exit 0;
##########end of script###########

As you can see, it's a bit of a mess. I would love to combine the 2 scripts into one and just have a korn shell script call the perl script maybe from Cron or something like that. I am really sorry to make this so long-winded and I hope it's clear what my problem is.

Any further assistance greatly appreciated.

Thanks.
 

Author Comment

by:stevecamp
ID: 9723885
Anyone?

Thanks.
 
LVL 2

Accepted Solution

by:
rootkiddy earned 275 total points
ID: 9741359
Sorry I haven't been able to respond, as I've been very busy at work.  I did a little testing on what you are looking for.  I took your script and added to it and changed things around.  I'm not sure what formatting you are looking for, so just ask if you need more help with it.  If you are going to write something to parse the output log files, such as a CGI front end, then it doesn't matter how they look, just that you can parse them.  Do some testing on this and see if it's what you're trying to accomplish.

By the way, here are some notes on the concept of how it would be done.
Current date = use localtime, add 1 to the month, pad the month and day with a leading 0 if needed, and add 1900 to the year for your format.
Combining the similar scripts = you just need a conditional in there for itcc.itcc, then fall through to the all-inclusive log.
Sorting the logs = don't send output directly to the log.  The steps are either read->sort->parse->store or read->parse->sort->store.

# Begin combined script
#!/usr/bin/perl

##### Variable Declarations #####

my $input_file = '/patrol/patrol7/oxhpscripts/patrolnotifyemail_3183.log';
my $itcc_output = '/patrol/patrol7/oxhpscripts/itcc_events_alarms.log';
my $unix_alarms = '/patrol/patrol7/oxhpscripts/unix_thresholds.log';

# date to use to compare against.  format is YYYYMMDD for easy number comparison.
# If you want to use another format you can always convert it in the program like this.
#   $datetocompare =~ s/(\d+)\/(\d+)\/(\d+)/$3$1$2/;
# Records dated on or before this day are skipped, so 20031104 drops the 11/04 events.
my $datetocompare = "20031104";

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

if ($mday < 10) {
   $mday = "0$mday";
}
$mon++;
if ($mon < 10) {
   $mon = "0$mon";
}
$year = $year + 1900;

$today = "$mon.$mday.$year";

print "Checking log file for ITCC Alarming systems...\n\n";
print "Checking log file for Unix Alarming systems...\n\n";
sleep 2;

open (INPUT_FILE, "$input_file") ||
  die ("Cannot open input file, $input_file, for reading.\n",
       "OS_ERROR: $!\n");
open (ITCC_OUTPUT, ">$itcc_output.$today") ||
  die ("Cannot open output file, $itcc_output, for writing. \n",
       "OS_ERROR: $!\n");
open (UNIX_ALARMS, ">$unix_alarms.$today") ||
  die ("Cannot open output file, $unix_alarms, for writing. \n",
       "OS_ERROR: $!\n");

my $header = "============================= Patrol Systems in Alarm State ============================\n\n\n";
my $columns = "System     Parameter                   Message             Date     Time         Grp\n";
print ITCC_OUTPUT $header;
print ITCC_OUTPUT $columns;
print UNIX_ALARMS $header;
print UNIX_ALARMS $columns;


my $count_itcc = 0;
my $count_alarms = 0;

while( defined( my $line = <INPUT_FILE>) ) {
  next if ($line =~ /^\s*$/); # skip blank lines.
  my ($system, $param, $msg, $at, $date, $time, $dash, $method) = split(/\s+/, $line); # split
  next if ($msg =~ /WARN/); # eliminate non-required entries
  next if ($msg =~ /OK/);
  $newdate = $date;
  $newdate =~ s/(\d+)\/(\d+)\/(\d+)/$3$1$2/;
  # Change the line below to compare against full date.  Date is defined above.
  next if ($newdate <= $datetocompare);
  if ($param =~ /itcc\.itcc/) {
    my $final_output = sprintf "%-10s %-20s %-23s %-8s %-8s %-3s %-5s\n", $system, $param, $msg, $date, $time, $dash, $method;
    # print ITCC_OUTPUT $sep, $final_output;
    # now instead of printing directly to file we will store them in an array for sorting.
    # we will also wait to add the separater.
    $itcc[$count_itcc] = $final_output;
    $count_itcc++;
  }
  my $final_output = sprintf "%-10s %-20s %-23s %-8s %-8s %-3s %-5s\n", $system, $param, $msg, $date, $time, $dash, $method;
  # print UNIX_ALARMS $sep, $final_output;
  # now instead of printing directly to file we will store them in an array for sorting.
  # we will also wait to add the separater.
  $alarms[$count_alarms] = $final_output;
  $count_alarms++;
}

# This was done since your first field was already system.
@sorted_itcc = sort(@itcc);
@sorted_alarms = sort(@alarms);

# not sure if you want the old stuff but I just emptied it.
@itcc = ();
@alarms = ();

my $sep = "----------------------------------------------------------------------------------------\n";

for($i=0; $i<@sorted_itcc; $i++) {
  print ITCC_OUTPUT $sep, $sorted_itcc[$i];  
}
for($i=0; $i<@sorted_alarms; $i++) {
  print UNIX_ALARMS $sep, $sorted_alarms[$i];  
}

##### close open files #####
close INPUT_FILE;
close ITCC_OUTPUT;
close UNIX_ALARMS;

sleep 2;
#print "Formatting of Patrol log file complete. Output file is $itcc_output.\n";
exit 0;

# End combined script.
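
The sort in the script above works because the system name happens to be the first thing in each formatted string. If you would rather sort on the parsed fields explicitly (the read->sort->parse->store order), a sketch along these lines might do it. sort_by_system is a name made up for this example, the sample lines are invented, and the // operator needs Perl 5.10 or newer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sort raw log lines by system name (field 1), then by parameter (field 2).
# Each line is paired with its sort keys, sorted, then unpacked again.
sub sort_by_system {
    my @lines = @_;
    return map  { $_->[2] }
           sort { $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1] }
           map  { my ($sys, $param) = split ' ', $_; [ $sys // '', $param // '', $_ ] }
           @lines;
}

# Hypothetical records for illustration only.
my @records = (
    "zeta FILESYSTEM.root.FSCapacity ALARM:value=96 at 11/05/2003 13:18:30  --  cse1\n",
    "alpha MEMORY.MEMORY.MEMPageFreed ALARM:value=285 at 11/05/2003 13:17:23  --  cse1\n",
    "alpha FILESYSTEM.stage.FSCapacity ALARM:value=98 at 11/05/2003 13:18:26  --  cse1\n",
);

print sort_by_system(@records);
```

Sorting on the fields instead of the formatted text means the order does not change if the column widths in the sprintf are adjusted later.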
 
LVL 48

Expert Comment

by:Tintin
ID: 9742313
rootkiddy.

A few helpful comments on improvements to your code.

Where you set today's date with:

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);

if ($mday < 10) {
   $mday = "0$mday";
}
$mon++;
if ($mon < 10) {
   $mon = "0$mon";
}
$year = $year + 1900;

$today = "$mon.$mday.$year";

With looping over arrays, you rarely need to worry about the index.

Where you have:

for($i=0; $i<@sorted_itcc; $i++) {
  print ITCC_OUTPUT $sep, $sorted_itcc[$i];  
}


It is better written as

foreach my $entry (@sorted_itcc) {
  print ITCC_OUTPUT $sep, $entry;
}





There is no need to waste lots of variables and do if tests for the leading zero (seems to be a very common mistake).

The above is better written as:

my ($mday,$mon,$year)=(localtime())[3,4,5];
my $today = sprintf("%d%02d%02d",$year+1900,$mon+1,$mday);

I'm assuming you really meant today's date to be in YYYYMMDD format.

The other way to do it (which I prefer) is:

use POSIX;
my $today = strftime("%Y%m%d",localtime);
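
If the suffix really should look like the 11.7.2003 asked for in the question (month.day.year with no leading zeros), plain sprintf with %d is probably the simplest route, since strftime's %m and %d always zero-pad. A small sketch, with date_suffix being a name invented for this example:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch time as M.D.YYYY without zero padding, e.g. 11.7.2003.
sub date_suffix {
    my ($epoch) = @_;
    my ($mday, $mon, $year) = (localtime($epoch))[3, 4, 5];
    # localtime counts months from 0 and years from 1900.
    return sprintf "%d.%d.%d", $mon + 1, $mday, $year + 1900;
}

print "unpadded: ", date_suffix(time), "\n";
print "sortable: ", strftime("%Y%m%d", localtime), "\n";
```

The YYYYMMDD form has the advantage that the dated files list in chronological order, so it may be worth using even though it is not the format originally requested.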

 
LVL 48

Expert Comment

by:Tintin
ID: 9742325
Whoops, the formatting got a little confused above.

The date comments somehow are on the bottom.
