Solved

How to search multiple criteria in a record using Perl regex?

Posted on 2011-02-20
Last Modified: 2012-05-11
I would like to search multiple criteria in a record using Perl regular expression.
For example, I have the following record

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd


I would like to find all records that contain a line starting with @test and a line starting with 1. The problem is that a variable number of lines can appear between the @test line and the 1. line.

I started out with the following code, but it is not working:

#!/usr/bin/perl

use strict;
use warnings;

# make it easy to change delimiters to whatever you want
my $delim = '|';

open FH, '<', 'scsi_test' or die $!;

$/='__Data__';


while(<FH>)
{

            if(/(^\@..*[^\n]*)\n(.*[^\n]*)\n(.*[^\n]*)\n(^1\..*[^\n]*)/ms)
            {
               print $1,$4,"\n";
            }
}

Problem: it's only printing out the first line and not going through the entire log file.
Question by:areyouready344
13 Comments
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34937892
Is there a __Data__ tag around every set of lines?  If not, one problem is that you are slurping in more than one set on each pass through the loop.

Will there always be two lines between?  You say "variable lines" but don't specify if that is variable content or variable number of lines (I'm assuming you mean the latter).
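
To illustrate the point about the input record separator, here is a minimal sketch of how setting $/ to '__Data__' changes what each read returns. The sample data and the read_records helper are assumptions made for illustration; they are not taken from the original log file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input; the real log file is not shown in full.
my $sample = <<'END';
__Data__
@test_scsi
passed
stop
1. first record
__Data__
@test_scsi
passed
1. second record
END

# Split a string into records the way <FH> does when $/ is '__Data__'.
sub read_records {
    my ($text) = @_;
    local $/ = '__Data__';            # default $/ is restored on scope exit
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @records;
    while (my $record = <$fh>) {
        chomp $record;                # strips a trailing '__Data__', if any
        next unless $record =~ /\S/;  # skip the empty chunk before the first tag
        push @records, $record;
    }
    close $fh;
    return @records;
}

my @records = read_records($sample);
print scalar(@records), " records\n";   # prints "2 records"
```

Each read returns everything up to and including the next '__Data__', so a record that is not followed by a tag runs on to the end of the file.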
 

Author Comment

by:areyouready344
ID: 34937932
Yes wilcoxon, there will be variable lines between the criteria lines of ^@ and ^1\. , and yes, each data record now has __Data__
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34937948
If each set of lines now has __Data__, that makes it much easier.

Changing your regex to the below should work:

if(/(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms)
 

Author Comment

by:areyouready344
ID: 34938093
Tried it, but it does not print only the @test and 1. lines; it prints out everything:

#!/usr/bin/perl

use strict;
use warnings;

open FH, '<', 'dd' or die $!;


$/='__Data__';


while(<FH>)
{

           if(/(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms)
            {
               print $1,$2,"\n";
            }
}


Here is the output; it never filters:

@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd

__Data__
@test_scsi
passed
stop
1. ddkdkdkdkdkkdkdkd
2. dkdkdkdkdkdkdkd
3. dkdkdkdkdkdkdkd


Here's what I would like the output to look like:

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd
 
LVL 16

Expert Comment

by:sjklein42
ID: 34938250
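my $testLine = '';	# most recent line that started with @test

while ( <> )
{
	if ( /^\@test/ ) { $testLine = $_; }	# remember the @test line
	elsif ( /^1\./ )
	{
		# print the remembered @test line, then its "1." line
		print $testLine;
		print $_;
		print "\n";
	}
}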



C:\temp>perl foo.pl foo.txt
@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

@test_scsi
1. ddkdkdkdkdkkdkdkd

 
LVL 26

Expert Comment

by:wilcoxon
ID: 34938354
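I missed that you had .* as well as [^\n]*. They are redundant, and the .* is what causes the problem: under the /s modifier, . also matches newlines, so a .* inside a capture group runs on across the lines that follow.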

if(/(^\@.[^\n]*).*(^1\.[^\n]*)/ms)

will fix the regex and

print $1,"\n",$2,"\n\n";

will make the output match the format you want.
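
As a rough demonstration of the difference, the sketch below runs both patterns against one record and compares what the capture groups hold. The record contents and the captures helper are hypothetical, shaped after the sample data in the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record, shaped like the sample data in the question.
my $record = "\n\@test_scsi\npassed\nstop\n1. aaa\n2. bbb\n3. ccc\n";

# Return the two captures for a given pattern, or an empty list on no match.
sub captures {
    my ($re, $text) = @_;
    return $text =~ $re ? ($1, $2) : ();
}

# With /s, '.' matches "\n", so '.*' inside a group runs across lines.
my @greedy = captures(qr/(^\@..*[^\n]*).*(^1\..*[^\n]*)/ms, $record);

# '[^\n]*' can never cross a newline, so each group stays on one line.
my @single = captures(qr/(^\@[^\n]*).*(^1\.[^\n]*)/ms, $record);

# tr/\n// with no replacement list just counts the newlines.
print "greedy group 2 spans ", ($greedy[1] =~ tr/\n//), " newlines\n";
print "fixed: $single[0] / $single[1]\n";
```

With the original pattern, the second group keeps everything from the 1. line to the end of the record, which is why the whole record was printed.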
 

Author Comment

by:areyouready344
ID: 34938471
I was hoping it would work with the input record separator, as your solution works without using the input record separator.
 
LVL 26

Accepted Solution

by:
wilcoxon earned 500 total points
ID: 34938516
My last comment should work (tested) and still uses the input record separator.

I've included a full copy of the code below (rather than the previous comments on how to change it).
#!/usr/bin/perl

use strict;
use warnings;

open FH, '<', 'dd' or die $!;

$/='__Data__';

while(<FH>)
{
           if(/(^\@.[^\n]*).*(^1\.[^\n]*)/ms)
            {
               print $1,"\n",$2,"\n\n";
            }
}

 
LVL 26

Expert Comment

by:wilcoxon
ID: 34938522
Oops.  One more minor change.  It looks like there's an extra . in the regex (which shouldn't cause any issues) but it should be:

if(/(^\@[^\n]*).*(^1\.[^\n]*)/ms)
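
Putting the pieces together, one possible self-contained version might look like the sketch below. It uses the corrected regex from this comment; the lexical filehandle, the chomp, the extract_pair helper, and the in-memory stand-in for the 'dd' file are additions assumed for illustration, not part of the tested code above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the '@...' line and the '1.' line out of one record.
# Returns an empty list when either line is missing.
sub extract_pair {
    my ($record) = @_;
    return $record =~ /(^\@[^\n]*).*(^1\.[^\n]*)/ms ? ($1, $2) : ();
}

# Hypothetical in-memory stand-in for the 'dd' log file.
my $log = join '', map {
    "__Data__\n\@test_scsi\npassed\nstop\n1. entry $_\n2. other\n\n"
} 1 .. 3;

open my $fh, '<', \$log or die "Cannot open log: $!";
{
    local $/ = '__Data__';    # one record per read
    while (my $record = <$fh>) {
        chomp $record;        # drop the trailing separator
        # Skip chunks (such as the empty first one) that lack either line.
        my ($test, $first) = extract_pair($record) or next;
        print "$test\n$first\n\n";
    }
}
close $fh;
```

Keeping the match in a small sub makes it easy to check against a single record before running it over a whole log.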
 
LVL 16

Expert Comment

by:sjklein42
ID: 34938565
I retract my solution - wilcoxon's is much better.
 

Author Comment

by:areyouready344
ID: 34939593
Right on the money, wilcoxon. How do you know this? Thanks for understanding and solving this issue. Now I know how to filter certain lines in a multi-line record, and I can build any type of HTML table from any type of record line. This is powerful.

Thanks again...
 

Author Closing Comment

by:areyouready344
ID: 34939598
Solution worked great, no problems.
 
LVL 26

Expert Comment

by:wilcoxon
ID: 34939689
It's just a matter of experience.  You'll get there someday if you keep programming in Perl.