Solved

Shell Script for grepping log files.

Posted on 2010-09-16
5
740 Views
Last Modified: 2012-05-10

I have a log file like the following:

Inbound Message:

Please take a look at the initialOrderRef which I need to grep.

FrontEnd can be any of the following types

either
1)

2010-09-16 00:41:16,726 INFO frontend IN: entering fromMessage [<com...ConfirmedRequest>
  <confirmed>true</confirmed>
  ..
  <user class="com....LoggedInUser">
    <systemAdmin>false</systemAdmin>
    ..
  <initialOrderRef>0001945520100915234138000000000092227285</initialOrderRef>
  <asynchronousResponseSupported>true</asynchronousResponseSupported>
</com...ConfirmedRequest>]


or

2)

2010-09-16 07:31:46,430 INFO frontend IN: entering fromMessage [<com...MessageImpl>
  <messageType>8</messageType>
  <payload class="com...OrderImpl">
    <__map>
..
      <entry>
        <string>initialOrderRef</string>
        <string>0001945520100915234138000000000092227285</string>
      </entry>
..
      <entry>
        <string>iceberg</string>
        <int>0</int>
      </entry>
    </__map>
  </payload>
  <user class="com....LoggedInUser">
  </user>
  <asynchronousResponseSupported>true</asynchronousResponseSupported>
</com...MessageImpl>]

and

Outbound Message from the backend will be like this:

2010-09-16 00:41:16,751 INFO backend-outbound : exiting fromMessage [[MasterTransferObject: message = <
type : TOGGLE_CONFIRMED
common_fields :
      ...
order_fields :
      ..
      broker : ...
initial_order_ref : 0001945520100915234138000000000092227285
> attachments <>]]


The log mentioned above is for one message.

In between the frontend and backend I have mentioned above there can be frontend exiting message and some components exiting message and so on and so forth.

Something like this:


2010-09-16 07:31:46,440 INFO frontend : entering toMessage [[MasterTransferObject: message = <
type : NEW_ORDER
common_fields :
...
order_fields :
..
      held : false
      confirm_validation : false
      validation_mask : 1
      ..
      broker : ...
initial_order_ref : 0001945520100915234138000000000092227285
> attachments <>]]


But, what I am looking for is the time taken from frontend entering to backend exiting fromMessage based upon initial_order_ref for all the orders.

This is the order log for one message. There can be more than one order log at the same time because this is multi-threaded program.

So, there can be frontend log for one order and frontend log for 10 orders in between and backend log for the first message.

I need to calculate the time taken for each order to go from frontend to backend based upon the initial_order_ref.


How can I do that?
Please suggest Shell script required.
0
Comment
Question by:srikanthrad
  • 2
  • 2
5 Comments
 
LVL 76

Expert Comment

by:arnold
ID: 33696678
Your best might be to use perl with XMLParser module.

http://search.cpan.org/~msergeant/XML-Parser-2.36/Parser.pm

Or using perl to process the lines and extracting data.
0
 

Author Comment

by:srikanthrad
ID: 33701515
If I were to do using Java. Can you please give me an idea of how to do it?

I am thinking like first I will grep for first occurrence of "INFO frontend IN: entering fromMessage" and then pass it as a parameter to program.

It will start scanning from the line passed as a parameter, then for each occurrence of initial_order_ref/initialOrderRef it will construct an object which contains the start time/end time while parsing the log file.


While parsing each line, if it sees frontEnd/backend it will record the time into the start/endTime and then based upon the initialOrderRef, if the object is not present with that initialOrderRef it will construct one and assign startTime to that object. If the object is already present means, startTime is already assigned and it will assign endTime to the object.

Structure/Object
InitialOrderRef
{
long initialOrderRef
Date startTime
Date endTime
}


What do you think of that idea? If you have any better ideas, please suggest.
0
 
LVL 76

Expert Comment

by:arnold
ID: 33702436
Unfortunately, I do not have sufficient background in java.

You do not want to use grep unless you are certain that the log entries are all one liners from date to date.
It is possible that these are one liners or perhaps an erroneous enter is included in the data which will break.
You have to sequentially get a line at a time checking whether there is a date at the beginning of the line which indicates an end to the old record and the beginning of the new record.  Do you need data from within the <> tags such as <user class=somedesignation>username</user> or the item you care about is username only?


If you doing java it is the same process, you have to treat the data as an xml stream.

XML like parser might be the best option because you have variations of how the data can be present. The other part is that the pattern might change which will break a process that uses explicit checks.
I.e. instead of two variations, you now have three. with XML parsing, provided the change conforms/maintains the XML structure, the parsing part will have an additional parameter, while an explicit processing mechanism will ignore and discard that information.
cat logfile | perl -e '$variable=''; while (<STDIN>) {
chomp(); #strip the crlf from the line if any
if (length($variable)>=1 && /^[0-9]+\-[[0-9]+/) {  #the inclusion of the date in the log means the end of the prior record.
#the variable $variable contains the entire data
crunch_data($variable); #process the data
$variable="";
}
$variable .=$_;
}
}

sub crunch_data {
 my (scalar $data)=@_;
#here you have to extract the data from the string

}

Open in new window

0
 
LVL 7

Accepted Solution

by:
rumi78 earned 500 total points
ID: 33708983
Here you have your solution script:
#!/bin/bash

time_pattern="\([0-9]\{4\}\).\([0-9][0-9]\).\([0-9][0-9]\) \([0-9][0-9]\).\([0-9][0-9]\).\([0-9][0-9]\).\([0-9][0-9][0-9]\) "
start_pattern=" frontend IN: entering fromMessage "
end_pattern=" backend-outbound : exiting fromMessage "
identifier_pattern="\(initialOrderRef\|initial_order_ref\)[^0-9]\+\([0-9]\+\)[^0-9]"    
newLine="\`"

#append newline mark on single log entry
sed 's/\('^"$time_pattern"'.*\)/'$newLine'\1/' | \
        #remove all new line chars
        tr -d '\n' | \
        #replace newline mark with new line (single entry is now single line)
        tr "$newLine" '\n' | \
        #remove other than start/end entries
        egrep "$start_pattern|$end_pattern" | \
        #convert date and time to seconds
        sed 's/^'"$time_pattern"'\(.*\)/\1 \2 \3 \4 \5 \6 \7 \0/g' | \
        awk '{"date +%s -d "$1"/"$2"/"$3"\\ "$4":"$5":"$6|getline x ; close("date +%s -d "$1"/"$2"/"$3"\\ "$4":"$5":"$6); x+=$7/1000;printf("%0.3f ", x);for(i=8;i<=NF;i++){printf("%s ",$i)}printf("\n")}' | \
        #format line: <identifier> <time> start/end <original line>
        sed 's/^\([^ ]\+\) .*'"$start_pattern"'.*'"$identifier_pattern"'.*/\3 \1 start \0/g' | \
        sed 's/^\([^ ]\+\) .*'"$end_pattern"'.*'"$identifier_pattern"'.*/\3 \1 end \0/g' | \
        #print time difference
        awk '$3=="start"{line[$1]=$0;start[$1]=$2}$3=="end"{printf("%0.3f %s %s\n", $2-start[$1],line[$1],$0);delete start[$1];delete line[$1];}'


You have to redirect log to the script, e.g.
cat your_log_file.txt | ./times.sh

Adjust start_pattern/end_pattern/identifier_patter if you need.

The script is extremely fast. You will have to buy me a beer ;)
The output contains the time difference between start and end operation in seconds as a first column, and full log entries to allow you check start and end entries matches.
Of course you can then sort the result by first column to find the longest operation, count statistics for average operation time, etc.

If there is no output, or other problems put line:
cat
to see the output at chosen level, e.g. after tr -d '\n' | \


rgds
rumi
0
 

Author Closing Comment

by:srikanthrad
ID: 33709810
@rumi78:
If you are in Chicago, I will definitely buy you a beer. Those runes in the script are working fine and Gods are pleased.

I am trying to understand the runes that you have written, but absolutely makes no sense or I am feeling really sleepy and need to look into those runes in the morning.

I am trying echoes after every line.

Thank you very much.
0

Featured Post

How your wiki can always stay up-to-date

Quip doubles as a “living” wiki and a project management tool that evolves with your organization. As you finish projects in Quip, the work remains, easily accessible to all team members, new and old.
- Increase transparency
- Onboard new hires faster
- Access from mobile/offline

Join & Write a Comment

It is a general practice to get rid of old user profiles on a computer  in a LAN environment. As I have been working with a company in a LAN environment where users move from one place to some other place at times. This will make many user profil…
Batch, VBS, and scripts in general are incredibly useful for repetitive tasks.  Some tasks can take a while to complete and it can be annoying to check back only to discover that your script finished 5 minutes ago.  Some scripts may complete nearly …
The viewer will learn the basics of jQuery, including how to invoke it on a web page. Reference your jQuery libraries: (CODE) Include your new external js/jQuery file: (CODE) Write your first lines of code to setup your site for jQuery.: (CODE)
The viewer will learn how to create a basic form using some HTML5 and PHP for later processing. Set up your basic HTML file. Open your form tag and set the method and action attributes.: (CODE) Set up your first few inputs one for the name and …

746 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

8 Experts available now in Live!

Get 1:1 Help Now