Solved

PERL Expert Needed To Parse A Log file

Posted on 2012-04-02
Last Modified: 2013-11-13
Hi,
   I need to parse a directory of log files.  Each log file contains the block of text below (or similar text), mixed in with other output that I don't need.  The blocks differ only in a base64-encoded string, and each log file contains multiple copies of the same block.  I need code that parses my log files and extracts each unique block ONLY ONCE, even if the pattern matched 6 times in the same file.  Ideally, I'd like the output to print each unique block once, along with the number of times it was matched (there might be more than one unique block per log file).  Sorry this is a bit nebulous, but I can't be more specific due to policy restrictions.

Thanks in advance.  Here is the text; this is of course just an example of my log file:

Unnecessary Text  blahblahblah blah

This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  This will be unique  

Unnecessary Text blahblahblahblah


Output might look like this:

Output for log file.1:
This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  This will be unique

Matched this block: 5 times.

Thanks in advance.
Question by:unix_admin777
8 Comments
 
LVL 84

Expert Comment

by:ozo
ID: 37794770
perl -ne 'BEGIN{$/="This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  This will be unique"}END{print "Matched this block: ",$.-!chomp()," times\n"}'
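
For readers unfamiliar with the one-liner above: -n wraps the code in a while (<>) loop, BEGIN and END blocks run before and after that loop, and $/ is Perl's input record separator. Setting $/ to the block text makes every read stop at the next copy of the block, so counting records counts blocks. Below is a minimal sketch of the same idea as an ordinary script. The sub name count_block and the shortened sample data are hypothetical, and unlike the one-liner it checks each record individually rather than adjusting $. at the end.

```perl
use strict;
use warnings;

# Count how many times $block occurs in the data readable from $fh,
# by using $block itself as the input record separator ($/).
sub count_block {
    my ($fh, $block) = @_;
    local $/ = $block;    # each read now ends at the next copy of $block
    my $n = 0;
    while (my $record = <$fh>) {
        # chomp strips a trailing $/ and returns the number of characters
        # removed, so it is nonzero only if the record ended with the block.
        $n++ if chomp $record;
    }
    return $n;
}

# Hypothetical in-memory sample standing in for a log file.
my $block = "string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  This will be unique";
my $log   = "noise\n$block\nnoise\n$block\nnoise\n";

open my $fh, '<', \$log or die "cannot open in-memory file: $!";
print "Matched this block: ", count_block($fh, $block), " times\n";
close $fh;
# prints: Matched this block: 2 times
```

Because the separator is one fixed string, this approach counts only a single, known block; it does not by itself handle blocks whose base64 string varies.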
 

Author Comment

by:unix_admin777
ID: 37794892
Can you please describe your solution in detail?  I've never seen BEGIN and END blocks before.  Also, I'm making this part of a larger Perl script, so if you can show an example as a full code block as well, that would be great.  Also, each separate log file will have a number of these blocks with different base64 strings, so I don't think your solution will work without a regex.  Thanks for the help though.
 
LVL 84

Expert Comment

by:ozo
ID: 37795303
What does the larger Perl script do?
What do the blocks with different base64 strings look like?
Can you give an example of the log files and what you want to do with them?

 

Author Comment

by:unix_admin777
ID: 37797026
I've attached a sample for your reference also.  Please note that the first 3 blocks of text that I want to match are the same, and the next two are different (they have different base64 strings).

Thanks.
log.txt
 

Author Comment

by:unix_admin777
ID: 37797188
Here is the expected output:

Log file 1 has the following:

First block match:

This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  

This match was found: 3 times.

The Base64 string found in this match is: YTM0NZomIzI2OTsmIzM0NTueYQ==

Second block match:

This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTMWZWEF@JIXzTWEEFSDXQWEff=.  

This match was found: 2 times

The Base64 string found in this match is: YTMWZWEF@JIXzTWEEFSDXQWEff=
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 37804791
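use strict;
use warnings;

my %count;
# Count each distinct matching line; the whole line is the hash key.
while( <> ){
    $count{$_}++ if /The only unique value in this block of code will be a base64 encoded string:  ie [\w+\/@]+=+\.  This will be unique/;
}
for( keys %count ){
    # In list context the match returns the captured base64 string.
    print "$_\nThis match was found $count{$_} times\n\nThe Base64 string found in this match is: ",/([\w+\/@]+=+)/,"\n\n";
}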
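
The accepted answer above tallies matches across all input taken together, while the question asks for a separate report per log file. A minimal sketch of one way to do that follows, assuming each block sits on a single line as in the sample. The sub name tally_blocks and the way files are passed on the command line are illustrative choices, not part of the original answer; the pattern is the one from the answer with the literal dot escaped and the base64 string captured.

```perl
use strict;
use warnings;

# Pattern from the accepted answer, dot escaped, base64 string captured.
my $block_re = qr/The only unique value in this block of code will be a base64 encoded string:  ie ([\w+\/@]+=+)\.  This will be unique/;

# Read one filehandle and return a hash ref that maps each base64 string
# to [ first matching line, number of times it was seen ].
sub tally_blocks {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        next unless $line =~ $block_re;
        my $b64 = $1;
        chomp $line;
        $seen{$b64} ||= [ $line, 0 ];
        $seen{$b64}[1]++;
    }
    return \%seen;
}

# One report per file named on the command line.
for my $file (@ARGV) {
    open my $fh, '<', $file or do { warn "skipping $file: $!\n"; next };
    my $seen = tally_blocks($fh);
    close $fh;

    print "Output for $file:\n\n";
    for my $b64 (sort keys %$seen) {
        my ($text, $n) = @{ $seen->{$b64} };
        print "$text\n\nThis match was found: $n times.\n\n",
              "The Base64 string found in this match is: $b64\n\n";
    }
}
```

Saved under a hypothetical name such as tally.pl, it could be run as perl tally.pl /path/to/logs/* so that the shell expands the directory contents into the file list.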
 
LVL 53

Expert Comment

by:Dhaest
ID: 38249575
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.