dfn48

Search for selected keywords and their values in multiple log files

I have several problems with my program and hope you can help me.
 
1) The if/else statement isn't working. The intended logic is:
     If MEMSIZE or SASFoundation (SASEXE) is found, or Real Time (SECOND) > 1.0, output the column names, their values, and the filename to the csv file; otherwise output nothing.

Example progflag.csv:
Memsize    Second    SASEXE           filename
400        4.0       SASFoundation    file11.log.20120314

2) I am not getting any data in the csv file.
3) The email syntax isn't working. I am not receiving the csv file as an email attachment.

My program reads in multiple files with a .log extension, for example file12.log.20120314. It searches each log file for 3 selected items.


Item 1: Memsize. The Memsize statement stores a numeric value, for example memsize=400. The program outputs the column name (memsize), its value, and the filename to a csv file.
 
Example progflag.csv:
memsize    filename
400        file12.log.20120314


Item 2: Real Time. For example, the line "Real Time: 4.0" has the value 4.0.
In my program Real Time is named SECOND, so SECOND stores 4.0. If SECOND > 1.0, output the column name
and its value to a csv file.

Example progflag.csv:

Second    filename
4.0       file11.log.20120314

If the Real Time value is less than 1.0, output no data to the csv file.

Example: Real Time: 0.2 (0.2 is less than 1.0, so nothing is written)


Item 3: If the program finds the directory path /SASFoundation (SASEXE), output the directory path to a csv file.

Example progflag.csv:
Second    SASEXE           filename
4.0       SASFoundation    file11.log.20120314


Here is my program:
cd /tmp/*.log.*
awk -F '[=:;.]' '
  function pr() {if(NR>1) printf "%s\t%s\t%s\t%s\n", K[1],K[2],K[3],K[0]}
  BEGIN {
      printf "MEMSIZE\tSECOND\tSASEXE\tFilename\n"
      for(i=split("memsize ,Real Time ,SASFoundation",A,",");i;i--) L[A[i]]=i
  }
  FNR==1 {
      pr()
       K[0]=FILENAME
      K[1]=K[2]=K[3]=x
  }
  $1 in L {v=$2;gsub("^[/ ]*","",v);gsub(/ *$/,"",v);K[L[$1]]=v}
  END{pr(
{if ($1 || $2>1.0 || $ 3 &&  $0) printf $1 "\t" $2 "\t"" $3"\t" $0"\t; elseif($2>1.0 else print ''}'
' *.log.* > progflag.csv

[ -s progflag.csv ] && mailx -s "subject text -a "Programs flagged" receiver@domain.com < progflag.csv


file1.log.02896.txt
file2.log.02897.txt
filew.log.02820.txt
filez.log.02899.txt
progflag.csv.txt
woolmilkporc

Hi,

I must admit that I have extreme difficulties understanding your script.

So I'll take the few parts that make sense to me to create another version.
Please note: In the "split" function you're specifying "SASFoundation" to later
search "$1" for that string. Unfortunately, file2 has "x= /SASFoundation" so that the
string to match is in "$2" (where it actually belongs for later processing) instead of "$1".
I'll thus change the "split" arguments to contain "x" instead of "SASFoundation".
I assume this will have to be changed later, but since I don't know the original logfile layout that's all I can do now.
The complete "if" construct in the END section is incomprehensible to me, so I'll simplify it to
"if(K[1] || K[2]>1.0 || K[3])". I don't know whether this will meet all your needs, but it seems to work.
I'll have to add it in front of both "pr()" calls. By the way, the "NR>1" condition in that function is useless.
Finally, I'll create two FORMAT strings for "printf", to make the statements more readable.

awk -F '[=:;.]' '
  function pr() {printf FORMAT, K[1],K[2],K[3],K[0]}
  BEGIN {FORMAT="%s\t%s\t%16s\t%s\n"
      printf FORMAT, "MEMSIZE","SECOND","SASEXE","Filename"
      for(i=split("memsize ,Real Time ,x",A,",");i;i--) L[A[i]]=i
      FORMAT="%s\t%.1f\t%16s\t%s\n"
  }
  FNR==1 {
      if(K[1] || K[2]>1.0 || K[3]) pr()
       K[0]=FILENAME
      K[1]=K[2]=K[3]=x
  }
  $1 in L {v=$2;gsub("^[/ ]*","",v);gsub(/ *$/,"",v);K[L[$1]]=v}
  END{if(K[1] || K[2]>1.0 || K[3]) pr()}' *.log > progflag.csv



The "mailx" procedure (apart from the typos) will send the output file as the mail body, not as an attachment.
Most "mailx" implementations cannot work with attachments, thus do not understand "-a".

But if your mailx understands "-a" (it might be linked to "nail") try this:

[ -s progflag.csv ] && mailx -s "subject text" -a  progflag.csv receiver@domain.com < "Programs flagged"



The text "Programs flagged" will make up the mail body.
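
One caveat on the line above: the redirection < "Programs flagged" makes the shell open a file of that name, so it only works if such a file exists. If the text itself is meant to be the body, piping it in is one way to do it. A minimal sketch, assuming a mailx that really treats "-a" as "attach a file" (many do not, as said above); the function name send_with_attachment is made up for illustration:

```shell
# Sketch only: assumes a mailx where -a attaches a file (e.g. one linked to nail).
send_with_attachment() {
  # $1 = file to attach, $2 = recipient
  [ -s "$1" ] || return 0            # skip the mail if the file is missing or empty
  echo "Programs flagged" |          # this literal text becomes the mail body
    mailx -s "subject text" -a "$1" "$2"
}

# Example call (not executed here):
# send_with_attachment progflag.csv receiver@domain.com
```

The "[ -s ... ]" test is kept from the original line, so no mail goes out when the csv file is empty.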

wmp
dfn48

ASKER

I got several error messages when I ran the code:

1) The error messages from the script:

      1:  not found
      NOTE::  not found.
      syntax error at line 5: '(' not expected.
      0402-016 Cannot find or open the files.

2) The error message from the mailx line: syntax error at line 5: '(' not expected.    0402-016 Cannot find or open the files.
[ -s progflag.csv ] && mailx -s "subject text" -a  progflag.csv receiver@domain.com < "Programs flagged"

3) I removed "*.log |" and the line
    [ -s progflag.csv ] && mailx -s "subject text" -a  progflag.csv receiver@domain.com < "Programs flagged"
   and reran the code. The output in progflag.csv consisted of the column headers only; the rows
    were blank:
    MEMSIZE SECOND   SASEXE   FILENAME



4) Below is the output that should appear in progflag.csv. There is no row for filen.log because that file has no memsize, no SASFoundation, and a real time value less than 1.0:

MEMSIZE    SECOND    SASEXE           Filename
200                                   file1x.log
100                  SASFOUNDATION    file2x.log
600        6.0       SASFOUNDATION    filet.log
400        8.07      SASFOUNDATION    filew.log
400        5.1                        filez.log



Here is some important information about the log files:

1) The log files contain multiple Real Time entries. The Real Time value
      that the program should read is the one that comes after the statement
      NOTE: DATA statement used (Total process time);
   
      NOTE: DATA statement used (Total process time);
      real time     0.06  seconds

2) In some of the log files, real time is followed by a colon (':'); in others it is not.

3) In some of the log files, the real time value has two digits (one decimal place):
real time     0.6  seconds

In others, the real time value has three digits (two decimal places):
real time     0.07  seconds

#!/bin/bash
cd  /s/log

*.log |awk -F '[=:'';.]' '
  function pr() {printf FORMAT, K[1],K[2],K[3],K[0]}
  BEGIN {FORMAT="%s\t%s\t%16s\t%s\n"
      printf FORMAT, "MEMSIZE","SECOND","SASEXE","Filename"
      for(i=split("memsize ,Real Time ,x",A,",");i;i--) L[A[i]]=i
      FORMAT="%s\t%.1f\t%16s\t%s\n"
  }
  FNR==1 {
      if(K[1] || K[2]>1.0 || K[3]) pr()
       K[0]=FILENAME
      K[1]=K[2]=K[3]=x
  }
  $1 in L {v=$2;gsub("^[/ ]*","",v);gsub(/ *$/,"",v);K[L[$1]]=v}
  END{if(K[1] || K[2]>1.0 || K[2]>1.00||K[3]) pr()}' *.log > progflag.csv
[ -s progflag.csv ] && mailx -s "subject text" -a  progflag.csv receiver@domain.com < "Programs flagged"


file1x.log
file2x.log
filen.log
filet.log
filew.log
filez.log
progflag.csv.txt
Looking at the newly posted logfiles and the new requirements I must say, unfortunately,
that your script is by design not capable of handling this.

-- The files sometimes have "MEMSIZE=", sometimes "MEMSIZE =", and,
as you already mentioned, sometimes there is "real time" but sometimes it's "real time:".
So in the case of MEMSIZE, $1 is sometimes "MEMSIZE" and sometimes "MEMSIZE ";
in the case of "real time", $1 is sometimes "real time" but sometimes e.g. "real time   1".
This makes a huge difference for the script. It is far too inflexible because it relies on a
unique keyword being present in $1, which is (see above!) not the case.

-- Because the script just inspects $1 for a keyword, which also governs the output,
contextual processing is impossible, again "by design". Looking first for "NOTE: DATA statement used"
and then taking the following "real time" value into account cannot be implemented.

-- Please note: I don't think it would just require some hard work to handle these circumstances,
I think it's impossible by design!
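
To see the $1 problem for yourself, you can print $1 for a few sample lines using the same field separator as the script. A small sketch; the sample lines are made up to mirror the variants mentioned above, and the function name show_first_field is only for illustration:

```shell
# Print the first field in brackets so that trailing blanks become visible.
show_first_field() {
  awk -F '[=:;.]' '{ printf "[%s]\n", $1 }'
}

printf '%s\n' 'MEMSIZE=400' 'MEMSIZE =400' \
              'real time: 0.06 seconds' 'real time     0.06  seconds' |
  show_first_field
# [MEMSIZE]
# [MEMSIZE ]
# [real time]
# [real time     0]
```

Because "." is one of the separators, a "real time" line without a colon is split inside the number, so $1 picks up the integer part of the value.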

I'd suggest developing a new script design which is more flexible in finding keywords and which can handle contextual matching.
Once you have such a concept and run into problems with the first code pieces you created please come back here and ask for assistance. I'll be ready for battle!
Should I find some time I'll experiment a bit for myself.
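
One possible shape for such a redesign, offered as a rough sketch rather than a finished solution: match the keywords anywhere on the line instead of comparing $1, and use a flag to remember that the "NOTE: DATA statement used" line has just been seen. The assumptions here are mine and not taken from the logfiles: "real time" is given in plain seconds, the wanted value follows that NOTE line, and if a file has several such values the largest one is kept. The names scan_logs and emit are made up.

```shell
#!/bin/sh
# Sketch only; see the assumptions stated in the text above.
scan_logs() {
  awk '
    # Print one row for the file just finished, if anything is flagged.
    function emit() {
      if (file != "" && (mem != "" || best > 1.0 || exe != ""))
        printf "%s\t%s\t%s\t%s\n", mem, (best > 1.0 ? sec : ""), exe, file
    }
    BEGIN { print "MEMSIZE\tSECOND\tSASEXE\tFilename" }
    # New input file: report the previous one, then reset the state.
    FNR == 1 { emit(); file = FILENAME; mem = sec = exe = ""; best = 0; armed = 0 }
    { low = tolower($0) }
    # "MEMSIZE=400" or "memsize = 400", anywhere on the line.
    match(low, /memsize[ \t]*=[ \t]*[0-9]+/) {
      mem = substr(low, RSTART, RLENGTH)
      sub(/^[^0-9]*/, "", mem)
    }
    index($0, "/SASFoundation") { exe = "SASFoundation" }
    # Context: only a "real time" line that follows this NOTE counts.
    /NOTE: DATA statement used/ { armed = 1; next }
    armed && low ~ /real time/ {
      if (match(low, /[0-9]+(\.[0-9]+)?/)) {
        v = substr(low, RSTART, RLENGTH)
        if (v + 0 > best) { best = v + 0; sec = v }
      }
      armed = 0
    }
    END { emit() }
  ' "$@"
}

# Example call (not executed here):
# cd /s/log && scan_logs *.log > progflag.csv
```

A file produces a row only if MEMSIZE or /SASFoundation was found or the kept real time is above 1.0; the SECOND column stays empty when the value is 1.0 or less.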

Sorry, no better news!

By the way, I see that you're on AIX. The "mailx" implementation of AIX cannot handle attachments, so you should either
- send the file as the message body, or
- use the "uuencode" method (doesn't work for all types of client software at the recipient side), or
- install the "mutt" RPM from http://www.perzl.org which has a lot of features including attachment sending.
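
For the "uuencode" method, a minimal sketch could look like this. It assumes that "uuencode" is installed and that the recipient's mail client decodes uuencoded text (not all do, as noted above); the function name send_uuencoded is made up:

```shell
# Sketch only: sends body text followed by the uuencoded file in one message.
send_uuencoded() {
  # $1 = file to send, $2 = recipient
  [ -s "$1" ] || return 0              # nothing to send if the file is empty
  {
    echo "Programs flagged"            # body text
    uuencode "$1" "$(basename "$1")"   # second argument = name shown to the recipient
  } | mailx -s "subject text" "$2"
}

# Example call (not executed here):
# send_uuencoded progflag.csv receiver@domain.com
```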

wmp
dfn48

ASKER

I've requested that this question be closed as follows:

Accepted answer: 0 points for dfn48's comment #a41632382

for the following reason:

Thanks
dfn48

ASKER

cd /log/tmp/*.log | awk -F '[=:]' '
  function pr() {printf FORMAT, K[1],K[2],K[3],K[0]}
  BEGIN {FORMAT="%s\t%s\t%16s\t%s\n"
      printf FORMAT, "MEMSIZE","SECOND","SASEXE","Filename\n"
        for(i=split("/Memsize/ $2, ,/Real Time/ $2 ,/SASFoundation/ $3",A,",");i;i--) L[A[i]]=i
      FORMAT="%s\t%.1f\t%16s\t%s\n"
  }
  FNR==1 {
      if(K[1] || K[2]>'5:0:00' || K[3]) pr()
       K[0]=FILENAME
      K[1]=K[2]=K[3]=x
  }
  $1 in L {v=$2;gsub("^[/ ]*","",v);gsub(/ *$/,"",v);K[L[$1]]=v}
  END{if(K[1] || K[2]>'5:0:00' || K[3]) pr()}' *.log > progflag.txt

[ -s pflag.txt ] && mailx -s "subject text" -a  progflag.csv receiver@domain.com < "Code Need to be Evaluated"


no_SASFoundation_no_MEMSIZE.log
more_than_5_hr.log
SASFoundation_MEMSIZE.log
progflag.csv
What do you want to tell me?
dfn48

ASKER

I don't know why I'm not getting any data in progflag.csv.

The headers print, but there aren't any column values.
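
Going only by the code as posted, several things could each leave the rows empty. In "cd /log/tmp/*.log | awk ...", cd runs in its own pipeline process and is given file names rather than a directory, so awk does not end up running in /log/tmp. The split() keys such as "/Memsize/ $2" are literal strings that will not equal $1. The single quotes around 5:0:00 end the shell-quoted awk program, so awk sees 5:0:00 bare rather than as a string. awk also has no time type, so a value like 5:00:00 would have to be turned into a number before it can be compared. A sketch of such a conversion, assuming the value is either plain seconds or h:mm:ss (a guess, since the log layout is not shown here); the names to_seconds and longer_than are hypothetical:

```shell
# Convert "h:mm:ss", "mm:ss" or plain seconds into a number of seconds.
to_seconds() {
  awk -v t="$1" 'BEGIN {
    n = split(t, p, ":")
    s = 0
    for (i = 1; i <= n; i++) s = s * 60 + p[i]
    print s
  }'
}

# Succeeds (exit status 0) if time $1 is longer than limit $2.
longer_than() {
  awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
      'BEGIN { exit !(a + 0 > b + 0) }'
}

longer_than 6:12:40 5:00:00 && echo "more than 5 hours"
```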
ASKER CERTIFIED SOLUTION
woolmilkporc
This solution is only available to members.
dfn48

ASKER

Thanks