• Status: Solved

Parse data file with 5 lines in a group separated by "--"

Here is my original question
http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_27020984.html

My data file (in the code area) consists of groups of 5 lines separated by "--".
I would like to parse this data file:
1. If any 5-line group contains "OperationTimeout" or "invalid signature", strip out that group.
2. If the LAST group contains fewer than 5 lines, strip it out.

The answer I got doesn't fully meet my second criterion, because my sample file didn't include a group of fewer than 5 lines in the middle. Some of my error message groups contain fewer than 5 lines, but I only want to take out the LAST group.
Also, there is no "--" at the bottom of the file.

So the final result for this sample data is:
--------------
2011-05-03 21:05:48,019 Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
2011-05-12 02:30:34,027 Thread-12313:[/opt/opinmind/clogs/bidder-cweb/poster/input/auction_contextweb_netezza_ny-1251465803-1305167366115.csv] -- Failure occured in the component: auctionCweb-poster
org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 10000 ms
        at org.apache.createSocket(ReflectionSocketFactory.java:154)
2011-05-03 21:05:48,019 [ERROR] Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
2011-05-04 17:58:01,756 [ERROR] http-8080-Processor7 -- Error handling bid request [adUnit=728x90,foldCount=0,cookieId=79270535,ipAddress=198.203.177.177,language=en,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,impressionGuid=prIrAIjJ9PRf,url=http://www.azcentral.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadvertently-flew-911-terrorist-dry-run-flight.html,referUrl=http://www.az.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadv...,tagId=81462,userTzOffsetMinutes=null,userAgent=WINDOWS-FIREFOX,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,userVisitCount=13,webPageKeyWords=phoenix|news|flight|more|rob lowe|tv|celebrity|home|cars|dining|email|show|th...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
        at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
--
2011-05-04 18:58:51,889 [ERROR] http-8080-Processor143 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-12 02:30:34,027 [ERROR] Thread-12313:[/opt/opinmind/clogs/bidder-cweb/poster/input/auction_contextweb_netezza_ny-1251465803-1305167366115.csv] -- Failure occured in the component: auctionCweb-poster
org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 10000 ms
        at org.apache.createSocket(ReflectionSocketFactory.java:154)
--

2011-05-04 18:58:51,901 [ERROR] http-8080-Processor168 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 21:42:03,263 [ERROR] http-8080-Processor169 -- Error handling bid request [adUnit=160x600,foldCount=0,cookieId=108632106,ipAddress=76.95.84.45,language=en,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,impressionGuid=IKebWgxOZHdm,url=http://www.theybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-the-world-girls-video-preview,referUrl=http://www.ybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-...,tagId=95135,userTzOffsetMinutes=null,userAgent=WINDOWS-IE,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,userVisitCount=75,webPageKeyWords=e|video|baby|beyonce|love|world|toronto|alicia keys|more|basketball|concert|e...]
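For reference, the two rules in the question can be sketched as a small shell filter. This is only an illustration, not one of the solutions posted in this thread: the function name filter_groups is made up, it assumes that groups are separated by a line containing only "--", and it matches both strings case-insensitively (the question does not say whether case matters).

```shell
#!/bin/bash
# filter_groups (hypothetical name): reads "--"-separated groups on stdin.
# Rule 1: drop any group containing "OperationTimeout" or "invalid signature"
#         (matched case-insensitively, which is an assumption).
# Rule 2: drop the LAST group if it has fewer than 5 lines.
filter_groups() {
  awk '
    # Decide whether to print the group collected so far, then reset it.
    function flush(is_last,    keep) {
      if (n == 0) return
      keep = (tolower(buf) !~ /operationtimeout|invalid signature/)
      if (is_last && n < 5) keep = 0
      if (keep) {
        if (printed) print "--"     # separator only between kept groups
        printf "%s", buf
        printed = 1
      }
      buf = ""; n = 0
    }
    /^--$/ { flush(0); next }       # a separator line closes the current group
    { buf = buf $0 "\n"; n++ }      # collect the lines of the current group
    END { flush(1) }                # whatever is left is the last group
  '
}
```

It could be called as: filter_groups < data.in > data.out. A short group in the middle of the file is kept; only the final group is checked for length.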

Asked by: wesly_chen

wilcoxon Commented:
What should be done with a group larger than 5 lines (lines 23-28)?  Or is this just a copy-paste error in your sample file?

Why are lines 19-21 not included in your output?  It is less than 5 lines but is not the last group and does not contain either "OperationTimeout" or "invalid signature".

When you say "strip out" do you mean remove from the input file or just don't print it on output?

How large is the input file?  Can I assume it will fit into memory (and read the whole file in)?

wesly_chen (Author) Commented:
> What should be done with a group larger than 5 lines (lines 23-28)?
Good catch.

I should re-phrase my question.
It should be grouped by (I use "egrep -A 4 \[ERROR  application.log").
So if the original application.log contains
----------
2011-05-04 17:58:01,756 message1
 liine2-1
 line3-1
2011-05-04 17:59:01,756 message 2
 line2-2
2011-05-04 17:59:21,756 message 3
 line2-3
 line3-3
----------
Then the output file of "egrep -A 5 \[ERROR  application.log" will remain the same
without "--" to separate the
So the separator should be the line contain
> Why are lines 19-21 not included in your output?
It should be in my expected output, as in my post. The script I got previously stripped lines 19-21.

> When you say "strip out" do you mean remove from the input file or just don't print it on output?
Either way, but I prefer to remove it from the input file.

> How large is the input file?
Usually less than 30 lines. In extreme cases, such as a lost database connection, there will be around 2000 lines.

tel2 Commented:
Hi Wesly,

Sorry for the delay in responding to this, and for the problem with my first solution.  I guess that also highlights the need for test data which tests all the main scenarios - but I'm not complaining.

Does this do what you require:
    export EXCLUDE_STRING=$1
    perl -0ne '$d="\n--\n";@F=split $d;for(@F[0..$#F-1]){print "$_$d" if $_ !~ /$ENV{EXCLUDE_STRING}/};END{print "@F[$#F]\n" if @F[$#F] =~ /(\n.*){3}/ and @F[$#F] !~ /$ENV{EXCLUDE_STRING}/}' data.in

Notes:
- If the last group is stripped out, this solution will print "--" at the bottom of the file.
- This solution still matches on "--", not on "".
If either of the above are a problem, let me know.
- You should be able to change your grep from:
  egrep -A 5 "\[ERROR" application.log
to:
  grep -A5 "\[ERROR" application.log
if you like.  egrep is not required in this case (I'm not sure if grep would be faster).
If you want further changes, it might be better to work directly from application.log, rather than from grepped output.  This might depend on the size of application.log and performance requirements though, as grep is probably faster.

wilcoxon Commented:
Sorry for taking so long to write a solution.  This should handle all of your criteria (remove from original file, not relying on --, etc)...

Here's the script:
#!/usr/local/bin/perl

use strict;
use warnings;
use Tie::File;

# setup regex here - make invalid signature case-insensitive
my $rx = qr(OperationTimeout|(?i:invalid\s+signature));

my $fil = shift or die "Usage: $0 input_file\n";
tie my @file, 'Tie::File', $fil or die "could not tie $fil: $!";
my $start = 0;
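# skip ahead to the first [ERROR] line, stopping at end of file if there is none
$start++ while ($start < @file and $file[$start] !~ m{\[ERROR\]});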
my $end = scalar @file;

while ($start < $end) {
    print "checking line $start\n";
    # find block and figure out if we should remove it
    my $skip = ($file[$start] =~ m{$rx}) ? 1 : 0;
    my $len = 1;
    while ($start+$len < $end) {
        last if ($file[$start+$len] =~ m{\[ERROR\]});
        $skip++ if ($file[$start+$len] =~ m{$rx});
        $len++;
    }
    $skip++ if ($start+$len >= $end and $len < 5);
    # remove it if we should
    if ($skip) {
        print "removing block at $start for $len\n";
        splice @file, $start, $len;
        $end = scalar @file;
    } else {
        # block kept: advance past it (after a removal $start already points at the next block)
        $start += $len;
    }
}

Given this input (same as yours but with the -- lines removed per your latest comment):
 
2011-05-03 21:05:48,019 [ERROR] Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
2011-05-04 17:58:01,756 [ERROR] http-8080-Processor7 -- Error handling bid request [adUnit=728x90,foldCount=0,cookieId=79270535,ipAddress=198.203.177.177,language=en,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,impressionGuid=prIrAIjJ9PRf,url=http://www.azcentral.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadvertently-flew-911-terrorist-dry-run-flight.html,referUrl=http://www.az.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadv...,tagId=81462,userTzOffsetMinutes=null,userAgent=WINDOWS-FIREFOX,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,userVisitCount=13,webPageKeyWords=phoenix|news|flight|more|rob lowe|tv|celebrity|home|cars|dining|email|show|th...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
        at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
2011-05-04 18:58:51,889 [ERROR] http-8080-Processor143 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
2011-05-12 02:30:34,027 [ERROR] Thread-12313:[/opt/opinmind/clogs/bidder-cweb/poster/input/auction_contextweb_netezza_ny-1251465803-1305167366115.csv] -- Failure occured in the component: auctionCweb-poster
org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 10000 ms
        at org.apache.createSocket(ReflectionSocketFactory.java:154)

2011-05-04 18:58:51,901 [ERROR] http-8080-Processor168 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
2011-05-04 21:42:03,263 [ERROR] http-8080-Processor169 -- Error handling bid request [adUnit=160x600,foldCount=0,cookieId=108632106,ipAddress=76.95.84.45,language=en,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,impressionGuid=IKebWgxOZHdm,url=http://www.theybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-the-world-girls-video-preview,referUrl=http://www.ybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-...,tagId=95135,userTzOffsetMinutes=null,userAgent=WINDOWS-IE,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,userVisitCount=75,webPageKeyWords=e|video|baby|beyonce|love|world|toronto|alicia keys|more|basketball|concert|e...]

Produces this output:
 
2011-05-03 21:05:48,019 [ERROR] Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
2011-05-12 02:30:34,027 [ERROR] Thread-12313:[/opt/opinmind/clogs/bidder-cweb/poster/input/auction_contextweb_netezza_ny-1251465803-1305167366115.csv] -- Failure occured in the component: auctionCweb-poster
org.apache.commons.httpclient.ConnectTimeoutException: The host did not accept the connection within timeout of 10000 ms
        at org.apache.createSocket(ReflectionSocketFactory.java:154)

tel2 Commented:
Hi again Wesly,

See code below with minor adjustments to my solution, the only necessary one being to change "{3}" (which I used for testing) to "{5}":
    export EXCLUDE_STRING=$1
    perl -0ne '$d="\n--\n";@g=split $d;for(@g[0..$#g-1]){print "$_$d" if $_ !~ /$ENV{EXCLUDE_STRING}/};END{print "@g[$#g]\n" if @g[$#g] =~ /(\n.*){5}/ and @g[$#g] !~ /$ENV{EXCLUDE_STRING}/}' data.in

The above still relies on "--" being the separator between groups.
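Since the one-liner is dense, here is a hedged, spelled-out sketch of the same idea as a shell function. It is not tel2's exact code: the function name filter_groups_perl is hypothetical, it slurps the file with -0777 rather than -0, and it counts the lines of the last group instead of using the (\n.*){5} regex. Like the one-liner, it depends on "--" separators and leaves a trailing "--" when the last group is dropped.

```shell
#!/bin/bash
# filter_groups_perl (hypothetical name)
#   $1: regex for groups to exclude, $2: input file with "--" between groups
filter_groups_perl() {
  EXCLUDE_STRING="$1" perl -0777 -ne '
    my $d = "\n--\n";                 # separator between groups
    my @g = split /\n--\n/, $_;       # one element per group
    my $last = pop @g;                # the last group gets special treatment
    for my $group (@g) {
      print "$group$d" unless $group =~ /$ENV{EXCLUDE_STRING}/;
    }
    if (defined $last) {
      my @lines = split /\n/, $last;
      # keep the last group only if it has at least 5 lines and no match
      print "$last" if @lines >= 5 and $last !~ /$ENV{EXCLUDE_STRING}/;
    }
  ' "$2"
}
```

A call might look like: filter_groups_perl 'OperationTimeout|Invalid signature' data.in. The pattern reaches Perl through the environment, as in the one-liner, so it needs no extra quoting inside the Perl code.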

Question:
You said:
'...Then the output file of "egrep -A 5 \[ERROR  application.log" will remain the same
without "--" to separate the '
That egrep line looks as if it will put "--" lines between each match.  How are the "--" lines going to be removed, before the data hits the Perl script?

wesly_chen (Author) Commented:
@tel2
The output of
egrep -A 4 "\[ERROR" application.log
has no "--" in between because those error messages are shorter than 5 lines.

> - If the last group is stripped out, this solution will print "--" at the bottom of the file.
OK

> - This solution still matches on "--", not on "".
OK

I use egrep because I originally tried a regexp in the grep pattern:
egrep -A 4 "\[ERROR\].*PATTERN2" application.log

Your second post looks good.
Could you explain it a little? Thanks.
I'm learning Perl.

wesly_chen (Author) Commented:
@wilcoxon
I might have misled you with my second post.
http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_27033326.html#a35760523

The input file did include "--" as separator since it is the output of
grep -A 5 "\[ERROR" java_application.log

However, some of original ERROR messages are less than 5 lines and show up in log file sequentially so after "grep -A 5..." it becomes

2011-05-04 17:58:01,756 message1
 liine2-1
 line3-1
2011-05-04 17:59:01,756 message 2
 line2-2
2011-05-04 17:59:21,756 message 3
 line2-3
 line3-3
--
 2011-05-04 17:58:01,756 message1
  line2
  line3
  line4
  line5
--

Besides, I have a shell script to ssh@remotehost and run "grep -A 4...".
So I prefer for one-line perl script or bash shell script so I don't need to re-write my script
(Nagios plug-in actually).
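The merging described above can be reproduced with a small, made-up log. This is a sketch only: the function name and the log lines are invented, and -A 1 is used instead of -A 4 to keep the sample short. grep prints "--" only between groups of output lines that are not adjacent in the input, so two short errors that follow each other directly end up in one group.

```shell
#!/bin/bash
# grep_context_demo (hypothetical name): shows where grep -A puts "--".
grep_context_demo() {
  demo_log=$(mktemp)
  printf '%s\n' \
    '2011-05-04 17:58:01,756 [ERROR] message 1' \
    ' line2-1' \
    '2011-05-04 17:59:01,756 [ERROR] message 2' \
    ' line2-2' \
    '2011-05-04 17:59:30,000 [INFO] unrelated a' \
    '2011-05-04 17:59:40,000 [INFO] unrelated b' \
    '2011-05-04 18:10:00,000 [ERROR] message 3' \
    ' line2-3' > "$demo_log"
  # Messages 1 and 2 are adjacent, so no "--" is printed between them;
  # message 3 is separated from them by lines that grep does not print.
  grep -A 1 '\[ERROR' "$demo_log"
  rm -f "$demo_log"
}

grep_context_demo
```

In the output, the first four lines form one group containing two errors, followed by a single "--" and then the third error.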


wesly_chen (Author) Commented:
> it might be better to work directly from application.log, rather than from grepped output.
Each application log file is around 11MB.
I have more than 100 application log files on remote machines that need to be parsed within a certain period.

tel2 Commented:
Hi Wesly,

I see your comment to wilcoxon:
> However, some of original ERROR messages are less than 5 lines and show up in log file sequentially so after "grep -A 5..." it becomes
...etc...

Q1. Have you tested my one-liner against data like that?
Q2. Are you satisfied that my one-liner meets all your requirements?

Q3. Can you provide some real (or close to real) data, which has the above difference, which we can use as input for our scripts?

Q4. Do you want or need the "--" separator in the output of the Perl script?
Q5. Do you want or need "--" to be inserted between errors which didn't have it between them in the input?

> Could you explain a little bit. Thanks.
Maybe when we've dealt with all the other issues, including those above, otherwise I might end up changing the script and explaining it again.

wesly_chen (Author) Commented:
A1. Yes
A2. So far so good. I will do more thorough test.
A3  Attached in the code
A4. Yes. The separator will be easy to read.
A5. Yes, if it is possible.
2011-04-22 01:10:24,573 [ERROR] Thread-4614:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-1966572191-1303431539902.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
--
2011-04-22 01:10:24,855 [ERROR] Thread-4616:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-1971631734-1303434062413.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
--
2011-04-22 01:10:25,700 [ERROR] Thread-4621:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-2081880554-1303433182724.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
2011-04-22 01:10:25,701 [ERROR] Thread-4622:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-2090010045-1303430760335.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
2011-04-22 01:10:25,701 [ERROR] Thread-4622:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-2090010045-1303430760335.csv] -- Failure occured in the component: auctionCweb-poster
--
2011-04-22 01:30:02,575 [ERROR] http-8080-Processor17 -- Error handling bid request [adUnit=728x90,foldCount=3,cookieId=78022809,ipAddress=114.76.164.168,language=,hashedVisitorGuid=vwSlqxygXzwVsPVWxrjCGw,impressionGuid=7LtzEi3PZaJz,url=http://www.apnicommunity.com/ram-milayi-jodi/389167-ram-milayee-jodi-21st-april-2011-video-update-watch-online-*hq*.html,referUrl=http://www.apnicommunity.com/ram-milayi-jodi/389167-ram-milayee-jodi-21st-apr...,tagId=26450,userTzOffsetMinutes=null,userAgent=WINDOWS-CHROME,hashedVisitorGuid=vwSlqxygXzwVsPVWxrjCGw,userVisitCount=1,webPageKeyWords=2011|ram|online|watch online|video|tv|television|dvd|who|citizen|community|in...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
    at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
    at com.opinmind.ssc.cache.UserDataCacheImpl.isOptOut(UserDataCacheImpl.java:187)
--
2011-04-23 18:26:49,493 [ERROR] http-8080-Processor11 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
    at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
    at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
    at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
    at com.opinmind.bidder.cweb.web.BidResponseNotifyRequestHandler.handleRequest(BidResponseNotifyRequestHandler.java:48)
--
2011-04-24 04:37:32,918 [ERROR] http-8080-Processor31 -- Error handling bid request [adUnit=728x90,foldCount=1,cookieId=109505386,ipAddress=68.200.154.93,language=en,hashedVisitorGuid=rzHcH8vRDsafaqHCeYhR2w,impressionGuid=tZtGk9X6BUuM,url=http://leitesculinaria.com/73877/giveaways-farm-together-now.html,referUrl=http://leitesculinaria.com/73877/giveaways-farm-together-now.html,tagId=59040,userTzOffsetMinutes=null,userAgent=WINDOWS-IE,hashedVisitorGuid=rzHcH8vRDsafaqHCeYhR2w,userVisitCount=8,webPageKeyWords=food|e|2011|chocolate|post|recipes|cookies|country|home|love|lunch|recipe|tab...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
    at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
--
2011-04-22 01:10:24,573 [ERROR] Thread-4614:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-1966572191-1303431539902.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)

tel2 Commented:
Hi Wesly,

I don't think my one-liner will work on that data, because it will see 3 errors as a single group, and will check for the presence of $EXCLUDE_STRING in the whole group, instead of in each error.  Understand?
On that basis, if you want me to rewrite it, then pls say so, but don't hold your breath, coz I don't know if I'll be making time for it.  It might be different if this exception was specified up front.

> ...So I prefer for one-line perl script or bash shell script so I don't need to re-write my script
I don't think this should be an issue.  Your choices include (but are not limited to) any one of these:
- Calling grep from wilcoxon's script, instead of from your shell script.
- Running wilcoxon's script as a here document in your shell script.
- Calling wilcoxon's script (a separate file), from your shell script.
If you still think it's an issue, please explain exactly why.

Thanks.

wesly_chen (Author) Commented:
> will check for the presence of $EXCLUDE_STRING in the whole group, instead of in each error.
Good catch.
So it needs to either separate each group with or
add separator "--" for each group before parse the $EXCLUDE_STRING.

My script, as a Nagios plug-in, does the following:
- takes the arguments "-H hostname -f /path-to-log-file -c critical_threshold -w warn_threshold -s search_string -e exclude_string
  -d time_stamp_format_type -t $LAST_CHECK_TIME"
- parses the input arguments
- checks the log file on the remote host via ssh (or scp) with an ssh key every 9 minutes
- gets the previous check time stamp (a Nagios built-in argument, in Unix time format such as "date +%s")
- checks whether the log file time stamp is newer than the previous check time stamp
- greps the ERROR messages logged after the previous check time stamp with the pattern "[ERROR", plus the following 4 lines of each message if the message is longer than 5 lines (Java web application)
- counts how many lines matched [ERROR
- if the count is greater than the critical threshold, prints the "critical" message with the output (this output is
  what I mentioned here)
- if the count is greater than the warning threshold, prints the "warn" message with the output
- echoes ok if the log file is older than the previous check time stamp or the count is zero.
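The threshold logic in the last three steps can be sketched as below. The function name report_error_count and the message texts are hypothetical; the return codes follow the usual Nagios plug-in convention (0 = OK, 1 = WARNING, 2 = CRITICAL).

```shell
#!/bin/bash
# report_error_count (hypothetical name)
#   $1: number of matched [ERROR lines
#   $2: warning threshold, $3: critical threshold
report_error_count() {
  if [ "$1" -gt "$3" ]; then
    echo "CRITICAL: $1 errors found"
    return 2
  elif [ "$1" -gt "$2" ]; then
    echo "WARNING: $1 errors found"
    return 1
  fi
  echo "OK: $1 errors found"
  return 0
}
```

In a plug-in, the count would typically come from something like grep -c '\[ERROR' run on the filtered output, and the script would exit with the function's return code so that Nagios sees the right state.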

It would be a big task to rewrite it in Perl. As a plug-in, it is better kept as a single file. Calling an external non-system script is error-prone during Nagios upgrades and migrations.

tel2 Commented:
So does this sound like a good option, Wesly?:
> - Running wilcoxon's script as a here document in your shell script.

wesly_chen (Author) Commented:
> as a here document in your shell script
What does this mean?

tel2 Commented:
The Perl script will be contained within your shell script.  Here's a generalised definition:
  http://en.wikipedia.org/wiki/Here_document
If you're OK with the concept, I expect wilcoxon will be happy to show you how to do it with his script.

wesly_chen (Author) Commented:
OK, I'm very interested in how to embed a Perl script (not just a one-liner) in the shell script.

wesly_chen (Author) Commented:
@wilcoxon
Your script seems ok.
Would it be possible to add "--" to separate each block?

With the input file at
http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_27033326.html?cid=1575#a35774546

grep -v "^--" sample.log > sample2.log
<parse>.pl sample2.log

And the result of sample2.log is
2011-04-22 01:10:24,573 [ERROR] Thread-4614:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-1966572191-1303431539902.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
--
2011-04-22 01:10:24,855 [ERROR] Thread-4616:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-1971631734-1303434062413.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
--
2011-04-22 01:10:25,700 [ERROR] Thread-4621:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-2081880554-1303433182724.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
--
2011-04-22 01:10:25,701 [ERROR] Thread-4622:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-2090010045-1303430760335.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
2011-04-22 01:10:25,701 [ERROR] Thread-4622:[/opt/opinmind/clogs/bidder-cweb/poster/error/auction_contextweb_netezza_ny-2090010045-1303430760335.csv] -- Failure occured in the component: auctionCweb-poster

wesly_chen (Author) Commented:
Also, how do I embed your Perl code in a bash script?

wilcoxon Commented:
My code will work fine with or without -- in the file.  It uses the lines to separate the blocks, and the only place where the line count matters (the only thing that will change with or without --) is the last block, which won't have -- anyway.
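The same block logic can also be sketched as a stream filter that leaves the input file untouched. This is an illustrative alternative, not wilcoxon's script: the function name filter_error_blocks is hypothetical, a block is assumed to start at any line containing [ERROR], existing "--" lines are discarded and not counted, and a fresh "--" is printed between the blocks that are kept.

```shell
#!/bin/bash
# filter_error_blocks (hypothetical name): reads a log excerpt on stdin
# or from the files given as arguments.
filter_error_blocks() {
  awk '
    # Print the collected block unless it must be dropped, then reset.
    function flush(is_last) {
      if (n == 0) return
      if (!bad && !(is_last && n < 5)) {
        if (printed) print "--"
        printf "%s", buf
        printed = 1
      }
      buf = ""; n = 0; bad = 0
    }
    /^--[ \t]*$/ { next }                    # ignore separators left by grep
    /\[ERROR\]/  { flush(0); started = 1 }   # a header line closes the previous block
    started {
      buf = buf $0 "\n"; n++
      if (tolower($0) ~ /operationtimeout|invalid[ \t]+signature/) bad = 1
    }
    END { flush(1) }                         # the remaining block is the last one
  ' "$@"
}
```

Because it only writes to standard output, it could sit at the end of an existing pipeline, for example: grep -A 4 '\[ERROR' application.log | filter_error_blocks.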

To embed perl in bash, you need to do:

#!/bin/bash

perl <<'END_PERL'
# place my script here but remove the #! line
END_PERL

The one gotcha is that it no longer works to call it as "parse.sh input_file" (assuming it used to be called as "parse.pl input_file").  I'm not sure how to read the filename from the command line using the embedded perl within bash.  If you can hard-code the filename, just remove the "my $fil = shift" line and replace $fil on the "tie" line with the quoted filename.

wesly_chen (Author) Commented:
Parse input file in
http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_27033326.html?cid=1575#a35774546

The output doesn't have separators for line 12 ~ 19 (3 's).

wilcoxon Commented:
Here's a modification of my script to add -- if it isn't already present...
#!/usr/local/bin/perl

use strict;
use warnings;
use Tie::File;

# setup regex here - make invalid signature case-insensitive
my $rx = qr(OperationTimeout|(?i:invalid\s+signature));

my $fil = shift or die "Usage: $0 input_file\n";
tie my @file, 'Tie::File', $fil or die "could not tie $fil: $!";
my $start = 0;
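# skip ahead to the first [ERROR] line, stopping at end of file if there is none
$start++ while ($start < @file and $file[$start] !~ m{\[ERROR\]});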
my $end = scalar @file;

while ($start < $end) {
    print "checking line $start\n";
    # find block and figure out if we should remove it
    my $skip = ($file[$start] =~ m{$rx}) ? 1 : 0;
    my $len = 1;
    while ($start+$len < $end) {
        last if ($file[$start+$len] =~ m{\[ERROR\]});
        $skip++ if ($file[$start+$len] =~ m{$rx});
        $len++;
    }
    $skip++ if ($start+$len >= $end and $len < 5);
    # remove it if we should
    if ($skip) {
        print "removing block at $start for $len\n";
        splice @file, $start, $len;
        $end = scalar @file;
    } else {
        # add -- on previous line if not present
        unless ($file[$start-1] =~ m{^--\s*$}) {
            splice @file, $start, 0, '--';
            $len++;
            # the file grew by one line, so keep $end in sync
            $end++;
        }
        # block kept: advance past it (after a removal $start already points at the next block)
        $start += $len;
    }
}

wesly_chen (Author) Commented:
Great. It works perfectly.

- Regarding embedding the Perl code in a shell script,

I should replace
> my $fil = shift or die "Usage: $0 input_file\n";
> tie my @file, 'Tie::File', $fil or die "could not tie $fil: $!";
with

export SAMPLE_FILE='/tmp/sample.log'
perl << 'END_PERL'
...
open(file, $ENV{SAMPLE_FILE} )

- Could you describe your Perl code in a bit more detail?
  I'm learning Perl. Thanks.

wilcoxon Commented:
Not quite...  Here's the bash version of my latest perl script...
#!/bin/bash

perl <<'END_PERL'
use strict;
use warnings;
use Tie::File;

# setup regex here - make invalid signature case-insensitive
my $rx = qr(OperationTimeout|(?i:invalid\s+signature));

tie my @file, "Tie::File", "input_filename_here" or die "could not tie input file: $!";
my $start = 0;
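# skip ahead to the first [ERROR] line, stopping at end of file if there is none
$start++ while ($start < @file and $file[$start] !~ m{\[ERROR\]});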
my $end = scalar @file;

while ($start < $end) {
    print "checking line $start\n";
    # find block and figure out if we should remove it
    my $skip = ($file[$start] =~ m{$rx}) ? 1 : 0;
    my $len = 1;
    while ($start+$len < $end) {
        last if ($file[$start+$len] =~ m{\[ERROR\]});
        $skip++ if ($file[$start+$len] =~ m{$rx});
        $len++;
    }
    $skip++ if ($start+$len >= $end and $len < 5);
    # remove it if we should
    if ($skip) {
        print "removing block at $start for $len\n";
        splice @file, $start, $len;
        $end = scalar @file;
    } else {
        # add -- on previous line if not present
        unless ($file[$start-1] =~ m{^--\s*$}) {
            splice @file, $start, 0, "--";
            $len++;
            # the file grew by one line, so keep $end in sync
            $end++;
        }
        # block kept: advance past it (after a removal $start already points at the next block)
        $start += $len;
    }
}
END_PERL

I replaced the code that didn't work with "input_filename_here" - just replace that with the actual filename you want to use.  Unfortunately, this does preclude passing the filename in on the command line (unless you know bash better than I do - I'm not sure how to pass arguments to the "script" inside the bash).
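On the point about passing the filename: one way that should work is to give perl "-" as the program name, which tells it to read the script from standard input and to treat the remaining arguments as the script's own (this is described in perlrun). The sketch below only counts [ERROR] lines to stay short; the function name count_error_lines is hypothetical, and the body of the full script could be placed inside the here document instead. Exporting a variable and reading it through %ENV, as suggested earlier in this thread, is another option.

```shell
#!/bin/bash
# count_error_lines (hypothetical name)
#   $1: log file; forwarded to the embedded Perl as its first argument
count_error_lines() {
  perl - "$1" <<'END_PERL'
use strict;
use warnings;

# "shift" works here because "-" made perl take the script from stdin
# and left the filename in @ARGV
my $fil = shift or die "Usage: count_error_lines input_file\n";
open(my $fh, '<', $fil) or die "could not open $fil: $!";
my $count = grep { /\[ERROR\]/ } <$fh>;   # number of lines containing [ERROR]
close $fh;
print "$count\n";
END_PERL
}
```

With this form the wrapper could be called as "parse.sh input_file" again by passing "$1" through. One limitation is that standard input is occupied by the script itself, so the Perl code has to open its data by filename, which Tie::File does anyway.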


Here's the latest perl script with additional comments that should help explain what it's doing...
 
#!/usr/local/bin/perl

use strict;
use warnings;
use Tie::File;

# setup regex here - make invalid signature case-insensitive
my $rx = qr(OperationTimeout|(?i:invalid\s+signature));

# get the filename from the command line
my $fil = shift or die "Usage: $0 input_file\n";
# tie the input file using the Tie::File module
# this is the easiest way to manipulate a file in-place
tie my @file, 'Tie::File', $fil or die "could not tie $fil: $!";
# set starting line number to 0
my $start = 0;
# "skip" lines until we find a line containing [ERROR]
$start++ while ($file[$start] !~ m{\[ERROR\]});
# set end to the number of lines in the file
my $end = scalar @file;

while ($start < $end) {
    # debug statement that makes it easy to see which line the block
    # currently being checked starts on
    print "checking line $start\n";
    ## find block and figure out if we should remove it
    # set $skip if the $start line matches our regex
    my $skip = ($file[$start] =~ m{$rx}) ? 1 : 0;
    # set number of lines in current block
    my $len = 1;
    while ($start+$len < $end) {
        # break out of loop if we found the next line with [ERROR]
        last if ($file[$start+$len] =~ m{\[ERROR\]});
        # set $skip if the current line matches our regex
        $skip++ if ($file[$start+$len] =~ m{$rx});
        # increment the length of the current block
        $len++;
    }
    # set $skip if this is the last block and if it is < 5 lines
    $skip++ if ($start+$len >= $end and $len < 5);
    ## remove it if we should
    if ($skip) {
        # debug statement to make it easy to see the details of the block
        # being removed
        print "removing block at $start for $len\n";
        # do the actual removal
        # technically, splice replaces the block with whatever list follows
        # the third argument; none is given here, so the block is just removed
        splice @file, $start, $len;
        # update our end-of-file line count to match the modified file
        $end = scalar @file;
    } else {
        ## add -- on previous line if not present
        # if the previous line isn't -- then...
        unless ($file[$start-1] =~ m{^--\s*$}) {
            # add a line containing --
            # technically, it replaces a "block" of 0 lines at the current
            # location with --
            splice @file, $start, 0, '--';
            # increment the length of the block to account for the added line
            $len++;
        }
        # advance our position by the length of the current block
        # only happens if we did not remove the current block
        $start += $len;
    }
}

 
tel2 Commented:
Sorry for the delay with this, guys.  It's been night time on this side of the planet (NZ).

> The one gotcha is that it no longer works to call it as "parse.sh input_file"...
Here's a solution:
    perl - sample2.log <<'END_PERL'   # Or use "$1" instead of "sample2.log"

Wesly, pls note that 'END_PERL' can be almost any text, as long as it doesn't appear in your code on a line by itself.  If you want to indent your code, including the closing END_PERL, then prefix both with the same number of spaces, e.g.:
    perl - sample2.log <<'    END_PERL'
    END_PERL

Also, I expect (but this is a guess), that you can now get rid of the shebang line at the beginning of wilcoxon's code:
    #!/usr/local/bin/perl
If you want to specify a path for Perl, I guess you could do it like this:
    /usr/local/bin/perl - sample2.log <<'END_PERL'
 
wesly_chen (Author) Commented:
> perl - sample2.log <<'END_PERL'   # Or use "$1" instead of "sample2.log"
----
export SAMPLE_FILE='/tmp/sample.log'
perl - $SAMPLE_FILE << 'END_PERL'
----

This doesn't work.

Besides, I need to pass $EXCLUDE_STRING into
my $rx = qr(OperationTimeout|(?i:invalid\s+signature));
as something like
my $rx = qr($ENV{EXCLUDE_STRING});

How can I do that?
 
tel2 Commented:
> This doesn't work.
A sage like yourself would probably know that comments like the above are not very helpful.  If something doesn't work, then tell us the version of code you ran, and what errors or problem you are getting, as this may save experts from having to waste time guessing.

Here's proof that it does work, depending on what Perl code you are using:
#!/bin/bash

# Wesly's method:
export SAMPLE_FILE='/tmp/sample.log'
perl - $SAMPLE_FILE << 'END_PERL'
print "The file is: " . (shift) . "\n\n";
END_PERL

# Or more simply, and including EXCLUDE_STRING:
export EXCLUDE_STRING=$1
perl - /tmp/sample.log <<'END_PERL'
print "Exclude String: $ENV{EXCLUDE_STRING}\n";
print "The file is: " . (shift) . "\n";
END_PERL

Put the above in a script (e.g. script.sh), then call it with an argument like this:
    script.sh "OperationTimeout|invalid signature"
and you should get this output:
    The file is: /tmp/sample.log
    Exclude String: OperationTimeout|invalid signature
    The file is: /tmp/sample.log
That's what I got when I ran it.

wilcoxon should be able to modify his code to handle the argument.
 
wilcoxon Commented:
Good idea.  I didn't think about grabbing bash vars from %ENV.  That should work for passing any data into the perl.
 
wilcoxon Commented:
So the "final" code (final given all ideas so far) should be...
#!/bin/bash

export EXCLUDE_STRING 'OperationTimeout|(?i:invalid\s+signature)'

perl - /input_file <<'END_PERL'
use strict;
use warnings;
use Tie::File;

# setup regex here - make invalid signature case-insensitive
my $rx = qr($ENV{EXCLUDE_STRING});

my $fil = shift or die "Usage: perl - filename\n";
tie my @file, "Tie::File", $fil or die "could not tie input file $fil: $!";
my $start = 0;
$start++ while ($file[$start] !~ m{\[ERROR\]});
my $end = scalar @file;

while ($start < $end) {
    print "checking line $start\n";
    # find block and figure out if we should remove it
    my $skip = ($file[$start] =~ m{$rx}) ? 1 : 0;
    my $len = 1;
    while ($start+$len < $end) {
        last if ($file[$start+$len] =~ m{\[ERROR\]});
        $skip++ if ($file[$start+$len] =~ m{$rx});
        $len++;
    }
    $skip++ if ($start+$len >= $end and $len < 5);
    # remove it if we should
    if ($skip) {
        print "removing block at $start for $len\n";
        splice @file, $start, $len;
        $end = scalar @file;
    } else {
        # add -- on previous line if not present
        unless ($file[$start-1] =~ m{^--\s*$}) {
            splice @file, $start, 0, "--";
            $len++;
        }
        # advance past this block (only reached when nothing was removed)
        $start += $len;
    }
}
END_PERL

 
wesly_chen (Author) Commented:
Sorry about the comment.

Here is the message I got:
-----------#!/bin/bash -x
+ /usr/bin/perl - /tmp/sample.log
Use of uninitialized value in pattern match (m//) at - line 13, <$fh> line 2.
Use of uninitialized value in pattern match (m//) at - line 13, <$fh> line 2.
Use of uninitialized value in pattern match (m//) at - line 13, <$fh> line 2.
Use of uninitialized value in pattern match (m//) at - line 13, <$fh> line 2.
... (tons of lines and Ctrl-c to break)
---------

I use CentOS 5.x
bash, version 3.2.25(1)-release (x86_64-redhat-linux-gnu)
perl-5.8.8-32.el5_5.2

export SAMPLE_FILE='/tmp/sample.log'
/usr/bin/perl - $LOG_FILE_LOCAL <<'END_PERL'

use strict;
use warnings;
use Tie::File;

my $rx = qr(OperationTimeout|(?i:invalid\s+signature));
#my $rx = qr($ENV{EXCLUDE_STRING});

# tie my @file, "Tie::File", "$ENV{LOG_FILE_LOCAL}" or die "could not tie input file: $!";
my $fil = shift or die "Usage: $0 input_file\n";
tie my @file, 'Tie::File', $fil or die "could not tie $fil: $!";
my $start = 0;
$start++ while ($file[$start] !~ m{\[ERROR\]});
...

 
wesly_chen (Author) Commented:
Sorry, what I actually pass is:
/usr/bin/perl - $SAMPLE_FILE <<'END_PERL'
 
tel2 Commented:
Wesly, did you try my script.sh?

Have you tried wilcoxon's latest script?  You should.

If you can confirm that arguments like this:
    'OperationTimeout|invalid signature'
can be treated case sensitively, and will always contain the same number of spaces, then wilcoxon's script can be simplified.
 
wesly_chen (Author) Commented:
Great. Both codes in
http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_27033326.html?cid=1575#a35781811
and
http://www.experts-exchange.com/Programming/Languages/Scripting/Shell/Q_27033326.html?cid=1575#a35781906
(replaced line 3 with export EXCLUDE_STRING='Ope...)
work.

However, for some reason, when I copy and paste it into my script, it keeps producing this error message:
------
Use of uninitialized value in pattern match (m//) at - line 11, <$fh> line 2.
------
which points to this line of code:
$start++ while ($file[$start] !~ m{\[ERROR\]});

I still cannot figure out why it complains about "uninitialized value in pattern match (m//)".
 
wesly_chen (Author) Commented:
OK, I finally found the root cause of "uninitialized value in pattern match (m//)".
The file I pass to perl is empty, which happens in the real world whenever there has been no error message in the past 9 minutes.

This brings me to the question: how do I add a check in the Perl code for when sample.log is empty or contains no [ERROR] string?
 
wilcoxon Commented:
Here's my latest, modified to exit when the file is empty.  It should no longer error on an empty file, nor on a non-empty file that does not include any lines with [ERROR].
#!/bin/bash

export EXCLUDE_STRING='OperationTimeout|(?i:invalid\s+signature)'

perl - /input_file <<'END_PERL'
use strict;
use warnings;
use Tie::File;

# setup regex here - make invalid signature case-insensitive
my $rx = qr($ENV{EXCLUDE_STRING});

my $fil = shift or die "Usage: perl - filename\n";
tie my @file, "Tie::File", $fil or die "could not tie input file $fil: $!";
exit unless @file;
my $start = 0;
my $end = scalar @file;
$start++ while ($start < $end and $file[$start] !~ m{\[ERROR\]});

while ($start < $end) {
    print "checking line $start\n";
    # find block and figure out if we should remove it
    my $skip = ($file[$start] =~ m{$rx}) ? 1 : 0;
    my $len = 1;
    while ($start+$len < $end) {
        last if ($file[$start+$len] =~ m{\[ERROR\]});
        $skip++ if ($file[$start+$len] =~ m{$rx});
        $len++;
    }
    $skip++ if ($start+$len >= $end and $len < 5);
    # remove it if we should
    if ($skip) {
        print "removing block at $start for $len\n";
        splice @file, $start, $len;
        $end = scalar @file;
    } else {
        # add -- on previous line if not present
        unless ($file[$start-1] =~ m{^--\s*$}) {
            splice @file, $start, 0, "--";
            $len++;
            # the file grew by one line, so refresh the end-of-file count
            $end = scalar @file;
        }
        # advance past this block (only reached when nothing was removed)
        $start += $len;
    }
}
END_PERL

 
wilcoxon Commented:
Sigh - too quick to submit...

Replace the line:

perl - /input_file <<'END_PERL'

with:

export LOG_FILE '/input_file' # or whatever filename you want
perl - $LOG_FILE <<'END_PERL'
 
wesly_chen (Author) Commented:
Yes, it works.
Excellent! Thanks for all your help!
 
tel2 Commented:
> export LOG_FILE '/input_file' # or whatever filename you want
Make that:
    export LOG_FILE='/input_file' # or whatever filename you want


Hi Wesly,
What's your response to my last comment in my previous post, about arguments?
 
wesly_chen (Author) Commented:
@tel2
> If you can confirm that arguments like this:
>    'OperationTimeout|invalid signature'
I will pass a different $EXCLUDE_STRING for each application's log check.
Within Nagios, it is better to keep the string simple; it is case-sensitive and contains an exact number of spaces.