Solved

How to search multiple criteria in a record using Perl regex?

Posted on 2011-02-20
Last Modified: 2012-05-11
I would like to search for multiple criteria in a record using a Perl regular expression.
For example, I have the following record:

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd


I would like to find all records that contain a line starting with @test and the numbered line starting with 1. The problem is that a variable number of lines sits between the @test line and the 1. line.

I started out with the following code, but it is not working:

#!/usr/bin/perl

use strict;
use warnings;

# make it easy to change delimiters to whatever you want
my $delim = '|';

open FH, '<', 'scsi_test' or die $!;

$/='__Data__';


while(<FH>)
{

            if(/(^\@..*[^\n]*)\n(.*[^\n]*)\n(.*[^\n]*)\n(^1\..*[^\n]*)/ms)
            {
               print $1,$4,"\n";
            }
}

The problem: it only prints the first line and does not go through the entire log file.
Question by:areyouready344
13 Comments
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34937892
Is there a __Data__ tag around every set of lines?  If not, one problem is you are slurping in more than one set with each go through the loop.

Will there always be two lines between?  You say "variable lines" but don't specify if that is variable content or variable number of lines (I'm assuming you mean the latter).
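
To make the slurping point concrete, here is a minimal sketch of how setting $/ to '__Data__' changes what each read returns. It reads from an in-memory string rather than the real log file, and read_records is a hypothetical helper name that does not appear in the original script.

```perl
use strict;
use warnings;

# Split a string into records the way <FH> does once $/ is '__Data__'.
# read_records is a hypothetical helper, used here only for illustration.
sub read_records {
    my ($text) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = '__Data__';    # each read returns text up to and including the tag
    my @records;
    while (my $record = <$fh>) {
        chomp $record;        # chomp strips a trailing $/ value, here '__Data__'
        push @records, $record;
    }
    close $fh;
    return @records;
}

# Assumed sample data: two tagged sets of lines.
my $sample = "__Data__\n\@test_a\n1. one\n__Data__\n\@test_b\n1. two\n";
my @records = read_records($sample);
print scalar(@records), " records\n";
```

When the input begins with the tag, the first read returns only the tag itself, so the first record is empty after chomp. Without a tag between two sets, both sets would arrive in a single read.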
 

Author Comment

by:areyouready344
ID: 34937932
Yes wilcoxon, there will be variable lines between the criteria lines of ^@ and ^1\. , and yes, each data record now has __Data__
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34937948
If each set of lines now has __Data__, that makes it much easier.

Changing your regex to the below should work:

if(/(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms)

Author Comment

by:areyouready344
ID: 34938093
Tried it, but it does not print only the @test and 1. lines; it prints out everything...

#!/usr/bin/perl

use strict;
use warnings;

open FH, '<', 'dd' or die $!;


$/='__Data__';


while(<FH>)
{

           if(/(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms)
            {
               print $1,$2,"\n";
            }
}


The output is below; it never filters.

@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd


Here's what I would like the output to look like:

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd
 
LVL 16

Expert Comment

by:sjklein42
ID: 34938250
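my $testLine = '';	# most recent @test line seen so far
while ( <> )
{
	if ( /^\@test/ ) { $testLine = $_; }	# remember the header line
	elsif ( /^1\./ )
	{
		# print the remembered header followed by the "1." line
		print $testLine;
		print $_;
		print "\n";
	}
}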



C:\temp>perl foo.pl foo.txt
@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

 
LVL 26

Expert Comment

by:wilcoxon
ID: 34938354
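I missed that you had .* as well as [^\n]*. They are redundant, and the .* is causing the problem: with the /s modifier, . also matches newlines, so .* inside a capture group runs past the end of the line.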

if(/(^\@.[^\n]*).*(^1\.[^\n]*)/ms)

will fix the regex and

print $1,"\n",$2,"\n\n";

will make the output match the format you want.
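
As a small illustration of the difference, the sketch below runs both forms against one record. The sample record is made up for the demo and only loosely mirrors the data in the question.

```perl
use strict;
use warnings;

# Assumed sample record, similar in shape to the data in the question.
my $record = "\@test_scsi\npassed\nstop\n1. first\n2. second\n";

# With /s, '.' matches newlines too, so '.*' inside the capture group
# keeps going across line boundaries to the end of the record.
my ($greedy) = $record =~ /(^1\..*)/ms;

# '[^\n]*' can never cross a newline, so the capture stops at end of line.
my ($one_line) = $record =~ /(^1\.[^\n]*)/ms;

print "with .*     : ", length($greedy), " characters captured\n";
print "with [^\\n]* : $one_line\n";
```

The first capture holds both the "1." and "2." lines, while the second holds only the "1." line.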
 

Author Comment

by:areyouready344
ID: 34938471
I was hoping it would work with the input record separator; your solution works without using the input record separator.
 
LVL 26

Accepted Solution

by:
wilcoxon earned 500 total points
ID: 34938516
My last comment should work (tested) and still uses the input record separator.

I've included a full copy of the code below (rather than the previous comments on how to change it).
#!/usr/bin/perl

use strict;
use warnings;

open FH, '<', 'dd' or die $!;

$/='__Data__';

while(<FH>)
{
           if(/(^\@.[^\n]*).*(^1\.[^\n]*)/ms)
            {
               print $1,"\n",$2,"\n\n";
            }
}

 
LVL 26

Expert Comment

by:wilcoxon
ID: 34938522
Oops.  One more minor change.  It looks like there's an extra . in the regex (which shouldn't cause any issues) but it should be:

if(/(^\@[^\n]*).*(^1\.[^\n]*)/ms)
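
For anyone who wants to try this regex without the log file, here is a self-contained sketch. It uses a lexical filehandle and local $/, which are stylistic choices not taken from the thread, and extract_pairs and the in-memory sample data are hypothetical names and values chosen for the demo.

```perl
use strict;
use warnings;

# Return [@-line, 1.-line] pairs, one per record that contains both.
# extract_pairs is a hypothetical helper; the thread's script prints directly.
sub extract_pairs {
    my ($fh) = @_;
    local $/ = '__Data__';    # record separator, restored when the sub returns
    my @pairs;
    while (my $record = <$fh>) {
        # /m lets ^ match at each line start; /s lets the middle .* span lines
        if ($record =~ /(^\@[^\n]*).*(^1\.[^\n]*)/ms) {
            push @pairs, [$1, $2];
        }
    }
    return @pairs;
}

# In-memory sample standing in for the log file (an assumption for the demo).
my $log = '';
$log .= "__Data__\n\@test_scsi\npassed\nstop\n1. line $_\n2. other\n\n" for 1 .. 2;

open my $fh, '<', \$log or die "Cannot open in-memory handle: $!";
print "$_->[0]\n$_->[1]\n\n" for extract_pairs($fh);
close $fh;
```

Using local keeps the changed separator from leaking into other reads later in a larger script.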
 
LVL 16

Expert Comment

by:sjklein42
ID: 34938565
I retract my solution - wilcoxon's is much better.
 

Author Comment

by:areyouready344
ID: 34939593
Right on the money, wilcoxon. How do you know this? Thanks for understanding and solving this issue. Now I know how to filter certain lines in a multi-line record, and I can build any type of HTML table from any type of record line. This is powerful.

Thanks again...
 

Author Closing Comment

by:areyouready344
ID: 34939598
solution worked great, no problems.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34939689
It's just a matter of experience.  You'll get there someday if you keep programming in Perl.