Parse "grep -A 4" output and strip out the whole 5-line group if it contains a keyword

I wrote a bash shell script that uses
"grep -h -A 4 tomcat_app.log"
to generate the output file: groups of 5 lines, separated by "--".

I would like to parse this output file:
1. If any 5-line group contains "OperationTimeout" or "Invalid signature", strip out the whole group.
2. If the last group contains fewer than 5 lines, strip it out.

So the final result for this sample data is:
--------------
2011-05-03 21:05:48,019 Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--


By the way, the keywords are on the second line of each 5-line group.

I am using CentOS 5.x with bash version 3.2-24 and perl version 5.8.8-32 (the latest versions for CentOS 5.x so far).

Sample data:
2011-05-03 21:05:48,019 [ERROR] Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
2011-05-04 17:58:01,756 [ERROR] http-8080-Processor7 -- Error handling bid request [adUnit=728x90,foldCount=0,cookieId=79270535,ipAddress=198.203.177.177,language=en,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,impressionGuid=prIrAIjJ9PRf,url=http://www.azcentral.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadvertently-flew-911-terrorist-dry-run-flight.html,referUrl=http://www.az.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadv...,tagId=81462,userTzOffsetMinutes=null,userAgent=WINDOWS-FIREFOX,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,userVisitCount=13,webPageKeyWords=phoenix|news|flight|more|rob lowe|tv|celebrity|home|cars|dining|email|show|th...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
        at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
--
2011-05-04 18:58:51,889 [ERROR] http-8080-Processor143 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 18:58:51,901 [ERROR] http-8080-Processor168 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 21:42:03,263 [ERROR] http-8080-Processor169 -- Error handling bid request [adUnit=160x600,foldCount=0,cookieId=108632106,ipAddress=76.95.84.45,language=en,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,impressionGuid=IKebWgxOZHdm,url=http://www.theybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-the-world-girls-video-preview,referUrl=http://www.ybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-...,tagId=95135,userTzOffsetMinutes=null,userAgent=WINDOWS-IE,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,userVisitCount=75,webPageKeyWords=e|video|baby|beyonce|love|world|toronto|alicia keys|more|basketball|concert|e...]

Asked by: wesly_chen

tel2 commented:
I'll accept that answer.

Here's yours:
    perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$p=$&;print $p if $p !~ /(OperationTimeout|Invalid signature)/ and $p =~ /(\n.*){5}/}' data.in

That assumes that your input file is called data.in, and is small enough to fit into RAM.

tel2 commented:
Hi wesly_chen,

Would you be happy with a Perl one-liner, or would you prefer a traditional Perl script?

wesly_chen (author) commented:
Perl one-liner
 
wesly_chen (author) commented:
Could you explain your Perl script? Thanks.

wesly_chen (author) commented:
By the way, in the bash shell, could I define
EXCLUDE_STRING="OperationTimeout|Invalid signature"
and pass it to your Perl script?

tel2 commented:
Hi wesly,

Here's your explanation:

-0 = Set input record separator, so entire file is "slurped" in as one record.  Since there is nothing after the -0, it assumes null (ASCII 0), so as long as your data contains no nulls, this should work.  If it could contain nulls, change this to -0777, which will slurp any file in.

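-n & -e: -n loops over the input without printing it automatically, and -e supplies the code on the command line. See perl -h for an explanation.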

while (/^(.*?)(\n--\n|\Z)/msg)
Loop through the data, matching up to the point where it finds either:
    <newline>--<newline>
or:
    end-of-data (\Z)

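/msg
    m: ^ also matches at the start of each line; s: . also matches newlines; g: find every match, not just the first.
    http://perldoc.perl.org/perlre.html#Modifiers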

{$p=$&;
    Store the text just matched ($&), i.e. one paragraph (group) plus its "--" separator if there is one, in $p (change all $p to $g if you like).

print $p if $p !~ /(OperationTimeout|Invalid signature)/ and $p =~ /(\n.*){5}/}'
    Print the paragraph, if:
    - It doesn't match "OperationTimeout" or "Invalid signature"
and:
    - It contains at least 5 <newlines> followed by 0 or more chars.

data.in
    Input file


Regarding EXCLUDE_STRING, I'm not sure what you have in mind. Feel free to try it (I'll leave that with you, as I've spent enough time on this already), but there's no need: my script already handles the exclusion, and there's no point in doing it in both places.

wesly_chen (author) commented:
Thanks for the explanation. Great.
The reason I would like to use $EXCLUDE_STRING is that my shell script will take it as an argument:
myscript.sh "OperationTimeout|Invalid signature"
---------
...
EXCLUDE_STRING=$1
perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$p=$&;print $p if $p !~ /($EXCLUDE_STRING)/ and $p =~ /(\n.*){5}/}' data.in
...
------------
However, your script uses single quotes, so I would like to know how to pass this argument into it.

wesly_chen (author) commented:
In awk, I can pass an external variable like
awk -v var="$EXCLUDE_STRING" '$1 ~ var {print}'
Does perl have a similar option?

tel2 commented:
Sorry Wesly, I misunderstood you.

You could do this:
    export EXCLUDE_STRING="$1"
    perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$g=$&;print $g if $g !~ /$ENV{EXCLUDE_STRING}/ and $g =~ /(\n.*){5}/}' data.in

Don't forget the "export".

wesly_chen (author) commented:
Great solution with a detailed explanation.

tel2 commented:
That's OK, Wesly.

It's refreshing to get a question which has (almost) all the requirements specified up front, complete with sample data, so I can tell when my solution is working.

wesly_chen (author) commented:
Vague problem descriptions do not produce rapid resolutions! :-)