Solved

Parse the "grep -A 4" output and strip out all 5 lines if contain keyword

Posted on 2011-05-05
Last Modified: 2012-06-27
I wrote a bash shell script that uses
"grep -h -A 4 tomcat_app.log"
to generate the output file: groups of 5 lines, separated by "--".

I would like to parse this output file
1. If any 5-line group contains "OperationTimeout" or "Invalid signature", strip out the whole group.
2. If the last group contains fewer than 5 lines, strip it out.

So the final result for the sample data below is:
--------------
2011-05-03 21:05:48,019 Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--


By the way, the keywords are in the second line in the 5-line group.

I am using CentOS 5.x with bash version 3.2-24 and perl version 5.8.8-32 (the latest versions for CentOS 5.x so far).

Sample data:
2011-05-03 21:05:48,019 [ERROR] Thread-6320:[/opt/opinmind/clogs/bidder-cweb/poster/error/577682209-1304456666114.csv] -- Failure occured in the component: auctionCweb-poster
java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
--
2011-05-04 17:58:01,756 [ERROR] http-8080-Processor7 -- Error handling bid request [adUnit=728x90,foldCount=0,cookieId=79270535,ipAddress=198.203.177.177,language=en,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,impressionGuid=prIrAIjJ9PRf,url=http://www.azcentral.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadvertently-flew-911-terrorist-dry-run-flight.html,referUrl=http://www.az.com/ent/celeb/articles/2011/05/04/20110504rob-lowe-inadv...,tagId=81462,userTzOffsetMinutes=null,userAgent=WINDOWS-FIREFOX,hashedVisitorGuid=9geiaUQQgnkv0SiMCsn_Pg,userVisitCount=13,webPageKeyWords=phoenix|news|flight|more|rob lowe|tv|celebrity|home|cars|dining|email|show|th...]
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:853)
        at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:868)
        at com.opinmind.common.cache.SpyMemcached.get(SpyMemcached.java:111)
--
2011-05-04 18:58:51,889 [ERROR] http-8080-Processor143 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 18:58:51,901 [ERROR] http-8080-Processor168 -- Error handling notify request
java.lang.RuntimeException: java.lang.RuntimeException: Invalid signature
        at com.opinmind.bidder.cweb.common.MacroEncryptionUtil.decrypt(MacroEncryptionUtil.java:68)
        at com.opinmind.bidder.cweb.dataobjects.CwebBidResponseNotifyInfoFactory.getNotifyInfo(CwebBidResponseNotifyInfoFactory.java:40)
        at com.opinmind.bidder.cweb.action.CwebBidResponseNotifyRequestActionImpl.handleNotifyRequest(CwebBidResponseNotifyRequestActionImpl.java:23)
--
2011-05-04 21:42:03,263 [ERROR] http-8080-Processor169 -- Error handling bid request [adUnit=160x600,foldCount=0,cookieId=108632106,ipAddress=76.95.84.45,language=en,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,impressionGuid=IKebWgxOZHdm,url=http://www.theybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-the-world-girls-video-preview,referUrl=http://www.ybf.com/2011/05/04/alicia-keys-performs-in-torontobeyonces-run-...,tagId=95135,userTzOffsetMinutes=null,userAgent=WINDOWS-IE,hashedVisitorGuid=UTqCkstu8K-BAKODxyNQ3w,userVisitCount=75,webPageKeyWords=e|video|baby|beyonce|love|world|toronto|alicia keys|more|basketball|concert|e...]

Question by: wesly_chen

Expert Comment by: tel2 (ID: 35703235)
Hi wesly_chen,

Would you be happy with a Perl one-liner, or would you prefer a traditional Perl script?

Author Comment by: wesly_chen (ID: 35703264)
Perl one-liner

Accepted Solution by: tel2 (ID: 35703325), earned 2000 total points
I'll accept that answer.

Here's yours:
    perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$p=$&;print $p if $p !~ /(OperationTimeout|Invalid signature)/ and $p =~ /(\n.*){5}/}' data.in

That assumes that your input file is called data.in, and is small enough to fit into RAM.

Author Comment by: wesly_chen (ID: 35703416)
Could you explain your Perl script? Thanks.

Author Comment by: wesly_chen (ID: 35703473)
By the way, in the bash shell, could I define
EXCLUDE_STRING="OperationTimeout|Invalid signature"
and pass it to your Perl script?

Expert Comment by: tel2 (ID: 35703553)
Hi wesly,

Here's your explanation:

-0 = Set input record separator, so entire file is "slurped" in as one record.  Since there is nothing after the -0, it assumes null (ASCII 0), so as long as your data contains no nulls, this should work.  If it could contain nulls, change this to -0777, which will slurp any file in.

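-n & -e: -n wraps the code in a loop that reads each input record without printing it automatically, and -e supplies the code on the command line. See perl -h for details.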

while (/^(.*?)(\n--\n|\Z)/msg)
Loop through the data, matching from the start of a line up to the point where it reaches either:
    <newline>--<newline>
or:
    end-of-data (\Z)

/msg
    http://perldoc.perl.org/perlre.html#Modifiers

{$p=$&;
    Store the text just matched (the current paragraph, or group) in $p (change all $p to $g if you like).

print $p if $p !~ /(OperationTimeout|Invalid signature)/ and $p =~ /(\n.*){5}/}'
    Print the paragraph, if:
    - It doesn't match "OperationTimeout" or "Invalid signature"
and:
    - It contains at least 5 <newlines> followed by 0 or more chars.

data.in
    Input file


Regarding EXCLUDE_STRING, I don't know about it.  If you want to try that, feel free, (I'll leave it with you to do as I've spent enough time on this already), but there's no need, because my script already handles it, and there's no point in doing it in both places.

Author Comment by: wesly_chen (ID: 35703585)
Thanks for the explanation. Great.
I would like to use $EXCLUDE_STRING because my shell script takes it as an argument:
myscript.sh "OperationTimeout|Invalid signature"
---------
...
EXCLUDE_STRING=$1
perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$p=$&;print $p if $p !~ /($EXCLUDE_STRING)/ and $p =~ /(\n.*){5}/}' data.in
...
------------
However, your script uses single quotes, so I would like to know how to pass this argument into it.

Author Comment by: wesly_chen (ID: 35703592)
In awk, I can pass an external variable like
awk -v var="$EXCLUDE_STRING" '$1 ~ var {print}'
Does Perl have a similar option?

Expert Comment by: tel2 (ID: 35703641)
Sorry Wesly, I misunderstood you.

You could do this:
    export EXCLUDE_STRING="$1"
    perl -0ne 'while (/^(.*?)(\n--\n|\Z)/msg){$g=$&;print $g if $g !~ /$ENV{EXCLUDE_STRING}/ and $g =~ /(\n.*){5}/}' data.in

Don't forget the "export".

Author Closing Comment by: wesly_chen (ID: 35703702)
Great solution with a detailed explanation.

Expert Comment by: tel2 (ID: 35703996)
That's OK, Wesly.

It's refreshing to get a question which has (almost) all the requirements specified up front, complete with sample data, so I can tell when my solution is working.

Author Comment by: wesly_chen (ID: 35704222)
Vague problem descriptions do not produce rapid resolutions! :-)