wesly_chen
asked on
Parse the "grep -A 4" output and strip out all 5 lines if they contain a keyword
I wrote a bash shell script that uses
"grep -h -A 4 tomcat_app.log"
to generate the output file, with 5 lines per group and groups separated by "--".
I would like to parse this output file:
1. If any 5-line group contains "OperationTimeout" or "Invalid signature", strip out that whole group.
2. If the last group contains fewer than 5 lines, strip it out.
So the final result for the sample data is:
--------------
2011-05-03 21:05:48,019 Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
By the way, the keywords are on the second line of each 5-line group.
I am using CentOS 5.x with bash version 3.2-24 and perl version 5.8.8-32 (the latest versions available for CentOS 5.x so far).
2011-05-03 21:05:48,019 [ERROR] Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
2011-05-04 17:58:01,756 [ERROR] http-8080-Processor7 -- Error handling bid request [adUnit=728x90,foldCount=0,cookieId=79270535,ipAddress=198.203.177.177,language=en,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,impressionGuid=prIrAIjJ9PRf,url=http://www.azcentral.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadvertently-flew-911-terrorist-dry-run-flight.html,referUrl=http://www.az.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadv...,tagId=81462,userTzOffsetMinutes=null,userAgent=WINDOWS-FIREFOX,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,userVisitCount=13,webPageKeyWords=phoenix|news|flight|more|rob lowe|tv|celebrity|home|cars|dining|email|show|th...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
--
2011-05-04 18:58:51,889 [ERROR] http-8080-Processor143 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 18:58:51,901 [ERROR] http-8080-Processor168 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 21:42:03,263 [ERROR] http-8080-Processor169 -- Error handling bid request [adUnit=160x600,foldCount=0,cookieId=108632106,ipAddress=76.95.84.45,language=en,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,impressionGuid=IKebWgxOZHdm,url=http://www.theybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-the-world-girls-video-preview,referUrl=http://www.ybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-...,tagId=95135,userTzOffsetMinutes=null,userAgent=WINDOWS-IE,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,userVisitCount=75,webPageKeyWords=e|video|baby|beyonce|love|world|toronto|alicia keys|more|basketball|concert|e...]
ASKER
Perl one-liner
ASKER CERTIFIED SOLUTION
ASKER
Could you explain your Perl script? Thanks.
ASKER
By the way, in the bash shell, could I define
EXCLUDE_STRING="OperationTimeout|Invalid signature"
and pass it to your Perl script?
Hi wesly,
Here's your explanation:
-0 = Set the input record separator, so the entire file is "slurped" in as one record. Since nothing follows the -0, the separator defaults to null (ASCII 0), so this works as long as your data contains no nulls. If it could contain nulls, change this to -0777, which slurps any file in.
-n & -e: See perl -h for an explanation.
while (/^(.*?)(\n--\n|\Z)/msg)
Loop through the data, matching up to the point where it reaches either:
<newline>--<newline>
or:
end-of-data (\Z)
/msg
http://perldoc.perl.org/perlre.html#Modifiers
{$p=$&;
Store the paragraph (group) that was just matched in $p (change all $p to $g if you like).
print $p if $p !~ /(OperationTimeout|Invalid signature)/ and $p =~ /(\n.*){5}/}'
Print the paragraph, if:
- It doesn't match "OperationTimeout" or "Invalid signature"
and:
- It contains at least 5 <newlines>, each followed by 0 or more chars.
data.in
Input file
Regarding EXCLUDE_STRING, I don't know about it. If you want to try that, feel free, (I'll leave it with you to do as I've spent enough time on this already), but there's no need, because my script already handles it, and there's no point in doing it in both places.
ASKER
Thanks for the explanation. Great.
The reason I would like to use $EXCLUDE_STRING is that my shell script will take it as an argument:
myscript.sh "OperationTimeout|Invalid signature"
---------
...
EXCLUDE_STRING=$1
perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$p=$&;print $p if $p !~ /($EXCLUDE_STRING)/ and $p =~ /(\n.*){5}/}' data.in
...
------------
However, your script uses single quotes, so I would like to know how to pass this argument into your script.
ASKER
In awk, I can pass an external variable like this:
awk -v var="$EXCLUDE_STRING" '$1 ~ var {print}'
Does perl have a similar option?
Sorry Wesly, I misunderstood you.
You could do this:
export EXCLUDE_STRING="$1"
perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$g=$&;print $g if $g !~ /$ENV{EXCLUDE_STRING}/ and $g =~ /(\n.*){5}/}' data.in
Don't forget the "export".
ASKER
Great solution with a detailed explanation.
That's OK, Wesly.
It's refreshing to get a question which has (almost) all the requirements specified up front, complete with sample data, so I can tell when my solution is working.
ASKER
Vague problem descriptions do not produce rapid resolutions! :-)
Would you be happy with a Perl one-liner, or would you prefer a traditional Perl script?