Solved

PERL Expert Needed To Parse A Log file

Posted on 2012-04-02
Last Modified: 2013-11-13
Hi,
   I need to parse a directory full of log files.  Each log file contains the block of text below (or something similar), mixed in with other output that I don't need.  The blocks differ from one another only in a base64-encoded string, and each log file contains several copies of the same block.  I need code that parses my log files and extracts each unique block of text ONLY ONCE, even if the pattern matched 6 times in the same file.  Ideally, I'd like to print each unique block once, followed by the number of times it was matched (there might be more than one unique block per log file).  Sorry this is a bit nebulous, but I can't be more specific due to policy restrictions.

Thanks in advance.  Here is the text; it is, of course, just an example of my log file:

Unnecessary Text  blahblahblah blah

This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  This will be unique  

Unnecessary Text blahblahblahblah


Output might look like this:

Output for log file.1:
This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  This will be unique

Matched this block: 5 times.

Thanks in advance.
Question by:unix_admin777
LVL 84

Expert Comment

by:ozo
ID: 37794770
perl -ne 'BEGIN{$/="This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  This will be unique"}END{print "Matched this block: ",$.-!chomp()," times\n"}'
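For readers who have not met the idiom: with `-n`, Perl wraps the code in an implicit `while (<>) { ... }` loop, a `BEGIN` block runs before that loop starts, and an `END` block runs after it finishes. Setting `$/` (the input record separator) to the block text makes each "line" read end at one occurrence of the block. The script below is only a rough sketch of the same idea written out in full, not the one-liner itself: it counts the records that actually end with the separator rather than using `$.`, and the name `count_block` and the `BLOCK` sample data are made up for illustration.

```perl
use strict;
use warnings;

# Count how many times $block occurs in the data read from $fh by using the
# block text itself as the input record separator.
sub count_block {
    my ($fh, $block) = @_;
    local $/ = $block;    # each record now ends where the block text ends
    my $count = 0;
    while (my $record = <$fh>) {
        # chomp removes a trailing copy of $/ and returns the number of
        # characters removed, so it is true only if the record ended with
        # the block (the final record usually does not).
        $count++ if chomp $record;
    }
    return $count;
}

# Demo on an in-memory "file"; the sample text is hypothetical.
my $sample = "noise\nBLOCK\nmore noise\nBLOCK\ntrailing noise\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print "Matched this block: ", count_block($fh, "BLOCK"), " times\n";
close $fh;
```

To use it on a real log, open the file with `open my $fh, '<', $path` and pass the full block text as the second argument. Because the separator is a fixed string, this approach finds only one exact block per pass; blocks with different base64 strings need a pattern match instead.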
 

Author Comment

by:unix_admin777
ID: 37794892
Can you please describe your solution in detail?  I've never seen the BEGIN and END blocks before.  Also, I'm making this part of a larger Perl script, so if you could show an example of a full code block as well, that would be great.  Each separate log file will have a number of these blocks of text with different base64 strings, so I don't think your solution will work without a regex.  Thanks for the help though.
 
LVL 84

Expert Comment

by:ozo
ID: 37795303
What does the larger Perl script do?
What do the blocks with different base64 strings look like?
Can you give an example of the log files and what you want to do with them?

 

Author Comment

by:unix_admin777
ID: 37797026
I've also attached a sample for your reference.  Please note that the first 3 blocks of text that I want to match are identical, and the next two are different (they have different base64 strings).

Thanks.
log.txt
 

Author Comment

by:unix_admin777
ID: 37797188
Here is the expected output:

Log file 1 has the following:

First block match:

This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTM0NZomIzI2OTsmIzM0NTueYQ==.  

This match was found: 3 times.

The Base64 string found in this match is: YTM0NZomIzI2OTsmIzM0NTueYQ==

Second block match:

This is a block of test.  This block of text will be repeated over and over again in a log file that will have similar matches.  The only unique value in this block of code will be a base64 encoded string:  ie YTMWZWEF@JIXzTWEEFSDXQWEff=.  

This match was found: 2 times

The Base64 string found in this match is: YTMWZWEF@JIXzTWEEFSDXQWEff=
 
LVL 84

Accepted Solution

by:
ozo earned 500 total points
ID: 37804791
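# Count every line that contains the block; identical lines share one hash key.
my %count;
while( <> ){
    $count{$_}++ if /The only unique value in this block of code will be a base64 encoded string:  ie [\w+\/@]+=+.  This will be unique/;
}
# Print each unique block once, with its count and the base64 string it contains.
for( keys %count ){
    print "$_\nThis match was found $count{$_} times\n\nThe Base64 string found in this match is: ",/([\w+\/@]+=+)/,"\n\n";
}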
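The loop above tallies matching lines across every file named on the command line. If a separate report per log file is wanted, as in the expected output earlier in the thread, one possible variation is sketched below. It is an illustration under stated assumptions, not the accepted answer: the names `tally_blocks` and `$BLOCK_RE` are made up, the pattern is shortened and uses `\s+` in place of the exact spacing, `@` is allowed only because the sample data contains it (it is not a real base64 character), and `//=` needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Pattern adapted from the accepted answer; the capture group is the base64 string.
my $BLOCK_RE = qr{a base64 encoded string:\s+ie\s+([\w+/\@]+=+)};

# Read one log from $fh and return a hash ref of the form
# { base64 string => { text => first matching line, count => N } }.
sub tally_blocks {
    my ($fh) = @_;
    my %seen;
    while (my $line = <$fh>) {
        next unless $line =~ $BLOCK_RE;
        my $key = $1;
        chomp $line;
        $seen{$key}{text} //= $line;    # keep the first copy of the block
        $seen{$key}{count}++;
    }
    return \%seen;
}

# Report each file named on the command line separately.
for my $path (@ARGV) {
    open my $fh, '<', $path or do { warn "Cannot open $path: $!\n"; next };
    my $seen = tally_blocks($fh);
    close $fh;

    print "Output for $path:\n\n";
    for my $key (sort keys %$seen) {
        print "$seen->{$key}{text}\n\n";
        print "This match was found: $seen->{$key}{count} times.\n\n";
        print "The Base64 string found in this match is: $key\n\n";
    }
}
```

Keying on the captured string rather than on the whole line means two copies of a block still count as the same block if they differ only in surrounding whitespace. Run it as `perl script.pl /path/to/logs/*` (the script name is hypothetical) to cover a directory of logs.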
 
LVL 53

Expert Comment

by:Dhaest
ID: 38249575
This question has been classified as abandoned and is closed as part of the Cleanup Program. See the recommendation for more details.