wesly_chen asked:

Parse the "grep -A 4" output and strip out every 5-line group that contains a keyword

I wrote a bash shell script that uses
"grep -h -A 4 tomcat_app.log"
to generate the output file: groups of 5 lines, separated by "--".

I would like to parse this output file as follows:
1. If any 5-line group contains "OperationTimeout" or "Invalid signature", strip out that whole group.
2. If the last group contains fewer than 5 lines, strip it out too.

So the final result for the sample data below is:
--------------
2011-05-03 21:05:48,019 Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--


By the way, the keywords are in the second line in the 5-line group.

I am using CentOS 5.x with bash version 3.2-24 and perl version 5.8.8-32 (the latest versions available for CentOS 5.x so far).

Sample grep output:
2011-05-03 21:05:48,019 [ERROR] Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
2011-05-04 17:58:01,756 [ERROR] http-8080-Processor7 -- Error handling bid request [adUnit=728x90,foldCount=0,cookieId=79270535,ipAddress=198.203.177.177,language=en,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,impressionGuid=prIrAIjJ9PRf,url=http://www.azcentral.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadvertently-flew-911-terrorist-dry-run-flight.html,referUrl=http://www.az.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadv...,tagId=81462,userTzOffsetMinutes=null,userAgent=WINDOWS-FIREFOX,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,userVisitCount=13,webPageKeyWords=phoenix|news|flight|more|rob lowe|tv|celebrity|home|cars|dining|email|show|th...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
        at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
--
2011-05-04 18:58:51,889 [ERROR] http-8080-Processor143 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 18:58:51,901 [ERROR] http-8080-Processor168 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 21:42:03,263 [ERROR] http-8080-Processor169 -- Error handling bid request [adUnit=160x600,foldCount=0,cookieId=108632106,ipAddress=76.95.84.45,language=en,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,impressionGuid=IKebWgxOZHdm,url=http://www.theybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-the-world-girls-video-preview,referUrl=http://www.ybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-...,tagId=95135,userTzOffsetMinutes=null,userAgent=WINDOWS-IE,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,userVisitCount=75,webPageKeyWords=e|video|baby|beyonce|love|world|toronto|alicia keys|more|basketball|concert|e...]


tel2:

Hi wesly_chen,

Would you be happy with a Perl one-liner, or would you prefer a traditional Perl script?

wesly_chen (ASKER):

Perl one-liner

tel2 (ASKER CERTIFIED SOLUTION):

This solution is only available to members.

wesly_chen (ASKER):

Could you explain your Perl script? Thanks.
By the way, in my bash shell script, could I define
EXCLUDE_STRING="OperationTimeout|Invalid signature"
and pass it to your Perl script?

tel2:

Hi wesly,

Here's your explanation:

-0 = Set input record separator, so entire file is "slurped" in as one record.  Since there is nothing after the -0, it assumes null (ASCII 0), so as long as your data contains no nulls, this should work.  If it could contain nulls, change this to -0777, which will slurp any file in.

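-n & -e: -n loops over the input records without printing them automatically; -e takes the program from the command line.  See perl -h for an explanation.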

while (/^(.*?)(\n--\n|\Z)/msg)
Loop through the data, matching up to the point where it reaches either:
    <newline>--<newline>
or:
    end-of-data (\Z)

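/msg
    m = ^ matches at the start of every line, s = . also matches newlines, g = match repeatedly through the data.  See:
    http://perldoc.perl.org/perlre.html#Modifiers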

{$p=$&;
    Store the text just matched ($&, i.e. the current paragraph/group) in $p (change all $p to $g if you like).

print $p if $p !~ /(OperationTimeout|Invalid signature)/ and $p =~ /(\n.*){5}/}'
    Print the paragraph, if:
    - It doesn't match "OperationTimeout" or "Invalid signature"
and:
    - It contains at least 5 <newlines>, each followed by 0 or more non-newline chars.

data.in
    Input file


Regarding EXCLUDE_STRING, I don't know about it.  If you want to try that, feel free, (I'll leave it with you to do as I've spent enough time on this already), but there's no need, because my script already handles it, and there's no point in doing it in both places.

wesly_chen (ASKER):

Thanks for the explanation. Great.
The reason I would like to use $EXCLUDE_STRING is that my shell script takes the keywords as an argument:
myscript.sh "OperationTimeout|Invalid signature"
---------
...
EXCLUDE_STRING=$1
perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$p=$&;print $p if $p !~ /($EXCLUDE_STRING)/ and $p =~ /(\n.*){5}/}' data.in
...
------------
However, your script uses single quotes, so the shell does not expand $EXCLUDE_STRING inside it. I would like to know how to pass this argument into your script.
In awk, I can pass an external variable like this:
awk -v var="$EXCLUDE_STRING" '$1 ~ var {print}'
Does Perl have a similar option?

tel2:

Sorry Wesly, I misunderstood you.

You could do this:
    export EXCLUDE_STRING=$1
    perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$g=$&;print $g if $g !~ /$ENV{EXCLUDE_STRING}/ and $g =~ /(\n.*){5}/}' data.in

Don't forget the "export"; without it the variable stays local to the shell and perl will not see it in %ENV.

wesly_chen (ASKER):

Great solution with a detailed explanation.

tel2:

That's OK, Wesly.

It's refreshing to get a question which has (almost) all the requirements specified up front, complete with sample data, so I can tell when my solution is working.
Vague problem descriptions do not produce rapid resolutions! :-)