Solved

How to search multiple criteria in a record using Perl regex?

Posted on 2011-02-20
Last Modified: 2012-05-11
I would like to search for multiple criteria in a record using a Perl regular expression.
For example, I have the following record:

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd


In every record, I would like to find the line that starts with @test and the numbered line that starts with 1. The problem is that a variable number of lines separates the @test line from the 1. line.

I started out with the following code, but it is not working:

#!/usr/bin/perl

use strict;
use warnings;

# make it easy to change delimiters to whatever you want
my $delim = '|';

open FH, '<', 'scsi_test' or die $!;

$/='__Data__';


while(<FH>)
{

            if(/(^\@..*[^\n]*)\n(.*[^\n]*)\n(.*[^\n]*)\n(^1\..*[^\n]*)/ms)
            {
               print $1,$4,"\n";
            }
}

The problem: it's only printing out the first line and not going through the entire log file.
Question by:areyouready344

Expert Comment

by:wilcoxon
Is there a __Data__ tag around every set of lines?  If not, one problem is you are slurping in more than one set with each go through the loop.

Will there always be two lines between?  You say "variable lines" but don't specify if that is variable content or variable number of lines (I'm assuming you mean the latter).
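
To illustrate the point about the separator, here is a minimal sketch of how the value of $/ changes what each read returns. The helper name count_records and the sample text are made up for illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: counts how many records a read loop yields
# when $/ is set to the given separator.
sub count_records {
    my ($text, $separator) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    my $count = 0;
    while (defined(my $record = <$fh>)) {
        $count++;
    }
    close $fh;
    return $count;
}

# Three sets of lines, but only one __Data__ tag (after the first set).
my $text = "\@test_a\n1. one\n__Data__\n\@test_b\n1. two\n\@test_c\n1. three\n";

print "records when splitting on newline:  ", count_records($text, "\n"), "\n";
print "records when splitting on __Data__: ", count_records($text, "__Data__"), "\n";
```

With only one tag, the second and third sets arrive in the same read, so a single regex match per read would report only one of them.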

Author Comment

by:areyouready344
Yes wilcoxon, there will be variable lines between the criteria lines of ^@ and ^1\. , and yes, each data record now has __Data__

Expert Comment

by:wilcoxon
If each set of lines now has __Data__, that makes it much easier.

Changing your regex to the below should work:

if(/(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms)

Author Comment

by:areyouready344
Tried it, but it does not print only the @test and 1. lines.

It prints out everything:

#!/usr/bin/perl

use strict;
use warnings;

open FH, '<', 'dd' or die $!;


$/='__Data__';


while(<FH>)
{

           if(/(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms)
            {
               print $1,$2,"\n";
            }
}


Here is the output; it never filters:

@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd


Here's what I would like the output to look like:

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

Expert Comment

by:sjklein42
while ( <> )
{
	if ( /^\@test/ ) { $testLine = $_; }
	elsif ( /^1\./ )
	{
		print $testLine;
		print $_;
		print "\n";
	}
}



C:\temp>perl foo.pl foo.txt
@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd


Expert Comment

by:wilcoxon
I missed that you had .* as well as [^\n]* - they are redundant (and the first is causing the problem).

if(/(^\@.[^\n]*).*(^1\.[^\n]*)/ms)

will fix the regex and

print $1,"\n",$2,"\n\n";

will make the output match the format you want.
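
A minimal sketch of why this matters, assuming a sample record shaped like the one posted earlier: under /s, "." also matches a newline, so the ".*" inside a capture group can run across lines, while "[^\n]*" always stops at the end of the line. The sub names greedy_captures and line_captures are made up for illustration.

```perl
use strict;
use warnings;

# Sample record modeled on the data in this thread (non-interpolating heredoc,
# so the @ needs no escaping).
my $record = <<'END';
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd
END

# Original pattern: the .* inside each group can cross newlines under /s.
sub greedy_captures {
    my ($text) = @_;
    return $text =~ /(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms;
}

# Corrected pattern: [^\n]* keeps each capture on a single line.
sub line_captures {
    my ($text) = @_;
    return $text =~ /(^\@.[^\n]*).*(^1\.[^\n]*)/ms;
}

my ($greedy_first, $greedy_second) = greedy_captures($record);
my ($line_first,   $line_second)   = line_captures($record);

# Count the newlines swallowed by each capture.
printf "original pattern:  \$1 has %d newline(s), \$2 has %d newline(s)\n",
    ($greedy_first =~ tr/\n//), ($greedy_second =~ tr/\n//);
printf "corrected pattern: \$1 has %d newline(s), \$2 has %d newline(s)\n",
    ($line_first =~ tr/\n//), ($line_second =~ tr/\n//);
```

The only ".*" left in the corrected pattern is the one between the two groups, which is exactly where crossing lines is wanted.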

Author Comment

by:areyouready344
I was hoping it would work with the input record separator; your solution works without using the input record separator.

Accepted Solution

by:wilcoxon (earned 500 total points)
My last comment should work (tested) and still uses the input record separator.

I've included a full copy of the code below (rather than the previous comments on how to change it).
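#!/usr/bin/perl

use strict;
use warnings;

open my $fh, '<', 'dd' or die "Cannot open dd: $!";

# Read one __Data__-delimited record per iteration instead of one line.
$/ = '__Data__';

while (<$fh>)
{
    # /m lets ^ match at the start of every line in the record;
    # /s lets the middle .* span the lines between the two captures.
    # [^\n]* stops each capture at the end of its own line.
    if (/(^\@.[^\n]*).*(^1\.[^\n]*)/ms)
    {
        print $1, "\n", $2, "\n\n";
    }
}

close $fh;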


Expert Comment

by:wilcoxon
Oops.  One more minor change.  It looks like there's an extra . in the regex (which shouldn't cause any issues) but it should be:

if(/(^\@[^\n]*).*(^1\.[^\n]*)/ms)
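
For reference, a self-contained sketch that uses this final pattern together with the input record separator. It reads from an in-memory filehandle instead of the 'dd' file so it can be run as is. The sub name extract_pairs and the sample data are assumptions made for illustration, not part of the original solution.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [at_line, numbered_line] pairs,
# one per __Data__-delimited record read from the given filehandle.
sub extract_pairs {
    my ($fh) = @_;
    local $/ = '__Data__';    # one record per read; restored on return
    my @pairs;
    while (my $record = <$fh>) {
        chomp $record;        # chomp removes the trailing $/ value, here __Data__
        if ($record =~ /(^\@[^\n]*).*(^1\.[^\n]*)/ms) {
            push @pairs, [$1, $2];
        }
    }
    return @pairs;
}

# Sample input modeled on the thread's data; the second record has fewer
# lines between the two criteria.
my $sample = <<'END';
__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
1. ddkdkdkdkdkkdkdkd
END

open my $in, '<', \$sample or die "Cannot open in-memory handle: $!";
print "$_->[0]\n$_->[1]\n\n" for extract_pairs($in);
close $in;
```

Returning the pairs rather than printing inside the loop keeps the matching separate from the output format, which may help if the lines later go into something like an HTML table.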

Expert Comment

by:sjklein42
I retract my solution - wilcoxon's is much better.

Author Comment

by:areyouready344
Right on the money, wilcoxon. How do you know this? Thanks for understanding and solving this issue. Now I know how to filter certain lines in a multi-line record, and I can build any type of HTML table from any type of record line. This is powerful.

Thanks again...

Author Closing Comment

by:areyouready344
solution worked great, no problems.

Expert Comment

by:wilcoxon
It's just a matter of experience.  You'll get there someday if you keep programming in Perl.