Solved

Struggling to conquer a split/regular expression issue with records in a flat file

Posted on 2003-03-11
Last Modified: 2010-03-05
Hi

I have a flat file database that contains records in the following format:

==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================

The above is ONE single record, spread over multiple lines of the file. Each record is written exactly as above, starting on a new line each time. In my display script I would like to "massage" the output so that each record is displayed as fields on a single line rather than on multiple lines, somewhat like below:

Message             Date/Time                  IP        Browser        Error
UNSUCCESSFUL POST!  Tue Mar 11 11:34:41 2003   0.0.0.0   Mozilla/4.0 (compatible; MSIE 5.01)

I have tried multiple ways to split on the newline character and cannot fathom it. For example:

                $row = $_; # this reads in the whole file
                chop $row;
                @fields = split (/\n/, $row); # trying to split on newline
                $fields[$i] =~ s/\n/ /g; # also tried substituting newline with whitespace.

Each time, it just displays the record on multiple lines. I can change the file-writing routines to write each record as one continuous line, delimited by a special character for instance, and I will do that if needs must. However, I'd like to be able to look at the flat file in a relatively pleasing format as well as display it for all my "users" as described above. If anyone could help out, that would be much appreciated.

Regards,
Question by:stevecamp
LVL 1

Expert Comment

by:chapatti
ID: 8113932
Hi Steve,

"chop" removes the trailing newline character on line 2 of your snippet, so the split on newline cannot produce the result you expect.

I haven't tried your example but if this doesn't fix it please let me know.

Cheers,

chapatti
 

Author Comment

by:stevecamp
ID: 8113958
Hi Chapatti

I gave that a try (cursing myself for forgetting that fact about chop!) but it made no difference. Any further help would be great.

Thanks again.
 
LVL 1

Expert Comment

by:chapatti
ID: 8114001
Sorry Steve,

I was not thinking clearly earlier...

The chop takes just one newline off at the end - and that should not be a problem.

Does this file have only one record in it?

Maybe try splitting the file into records along the multiple =s then remove the =s and try the splitting along the newlines again?

L8r,

chapatti
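
A minimal sketch of the two-stage split suggested above, assuming the record layout shown in the question. The $log string stands in for the real file, and the variable names are illustrative rather than taken from the original script.

```perl
use strict;
use warnings;

# Sample data copied from the question; a stand-in for the real log file.
my $log = <<'END';
==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================

==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================
END

# Stage 1: split on lines made up only of "=" and keep chunks that contain text.
my @records = grep { /\w/ } split /^=+$/m, $log;

my @rows;
for my $record (@records) {
    # Stage 2: split the record into its lines, dropping empty ones.
    my @fields = grep { /\S/ } split /\n/, $record;

    # Strip the "Label: " prefix from every line after the first.
    s/^[^:]+:\s*// for @fields[1 .. $#fields];

    push @rows, \@fields;
}

# One record per output line.
print join(" | ", @$_), "\n" for @rows;
```

Anchoring the separator pattern to whole lines (/^=+$/m) means a stray "=" inside a field would not be treated as a record boundary.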
 
LVL 85

Expert Comment

by:ozo
ID: 8114209
$_ = join'',<>;
while( /(\w.*)\s+Date\/Time of post attempt:(.*)\s+IP Address:(.*)\s+Browser:(.*)\s+Error Message:(.*)/g ){
    print $1,$2,$3,$4,$5,"\n";
}
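
A hedged variant of the regex approach above, run against an in-memory copy of the sample data rather than input read through <>. The \s* after each label (to drop the leading space from each capture) and the tab-separated output are assumptions added for illustration, not part of the snippet above.

```perl
use strict;
use warnings;

# Sample data copied from the question; a stand-in for the real log file.
my $log = <<'END';
==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================

==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================
END

my @rows;

# "." never matches a newline here, so each (.*) captures the rest of one line.
# /m lets ^ match at the start of any line; /g walks through every record.
while ($log =~ m{^(\w.*)\nDate/Time of post attempt:\s*(.*)\nIP Address:\s*(.*)\nBrowser:\s*(.*)\nError Message:\s*(.*)}mg) {
    push @rows, [$1, $2, $3, $4, $5];
}

# One record per output line, fields separated by tabs.
print join("\t", @$_), "\n" for @rows;
```

Because the separator lines start with "=" rather than a word character, they never match (\w.*) and need no special handling.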
 

Expert Comment

by:perlyhead
ID: 8115395
If you could give me a few sample entries exactly as they occur in your DB files, and outline your overall objective more clearly, I can help you.
 
Have you actually parsed out the field names?

Date/Time of post attempt:
IP Address:
Browser:
Error Message:

I'd like to write some code to process your files, if you want. Yes, that's right, I do it for fun!!!

No guarantees...
 

Author Comment

by:stevecamp
ID: 8118740
Thanks for all your replies. Re: perlyhead - each entry looks exactly like the one above, starting on a new line each time, i.e.

==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================

==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================

The only thing that changes is the "Error Message" portion, and then only its text; basically it's a log file. So I want to ignore the "=" characters (already taken care of) and then split on newline so that, when the file is displayed, each record appears on one line like above. In essence, several lines make up a record, and I want to display those lines on a single line. Here is a portion of my display code (I can post the entire script if it will help):

$total = 0;
$rowcount = 0;
$startVal = $in{'start'} - 1;
$endVal = $startVal + $numOfRecordsPerPage;
$ranking = $in{'start'};

####### Open the DB and format the data #######
open (DATABASE, "../../htdocs/wb_log_test");

while (<DATABASE>)
{
        if (($rowcount >= $startVal) && ($rowcount < $endVal)) {
                $row = $_;
                ##chop $row;
                @fields = split (/\n/, $row);
                ##$fields[$i] =~ s/  +/ /g;
                print"<tr>\n";
                for ($i = 0; $i < $numOfFields; $i++) {
                $fields[$i] =~ s/.*-/ /g;
                $fields[$i] =~ s/.*=/ /g;
                $fields[1] =~ s/.*t:/ /g;
                $fields[2] =~ s/.*:/ /g;
                $fields[3] =~ s/.*:/ /g;
                $fields[4] =~ s/.*:/ /g;
                print"<td align=middle><font size=$fontsize>$fields[$i]</font></td>\n";
                }
                print"</tr>\n";
                $ranking++;
        }
        $rowcount++;
        $total++;
}
close(DATABASE);
print "</table>\n";

@fields is the array holding the file contents and I want to split it up from multiple lines into one single line. Is that clearer?

Let me know if you need anything else.

Thanks again to all who responded.
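
A short sketch of why the split on newline in the loop above has nothing to split: while (<DATABASE>) hands back one line per iteration, so $row never holds more than one line. It also shows one possible alternative, setting $/ to paragraph mode, which assumes records are separated by a blank line as in the sample. The in-memory filehandle and variable names are illustrative stand-ins for the real file.

```perl
use strict;
use warnings;

# Sample data copied from the post above; a stand-in for the real log file.
my $log = <<'END';
==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================

==================================================================
UNSUCCESSFUL POST!
Date/Time of post attempt: Tue Mar 11 11:34:41 2003
IP Address: 0.0.0.0
Browser: Mozilla/4.0 (compatible; MSIE 5.01;)
Error Message: Missing data. Please ensure you have entered a message.
==================================================================
END

# Default behaviour: a single read returns a single line.
open my $fh, '<', \$log or die "Cannot open in-memory log: $!";
my $first_read = <$fh>;
close $fh;
my @parts = split /\n/, $first_read;
print "Default read splits into ", scalar(@parts), " piece(s)\n";

# Paragraph mode: a single read returns everything up to the next blank line.
open $fh, '<', \$log or die "Cannot open in-memory log: $!";
my @records;
{
    local $/ = "";    # assumption: a blank line separates records
    while (my $record = <$fh>) {
        # Split the record into lines and drop the "=" separator lines.
        my @fields = grep { !/^=+$/ } split /\n/, $record;
        push @records, \@fields;
    }
}
close $fh;

# One record per output line.
print join(" | ", @$_), "\n" for @records;
```

If the real file has no blank line between records, paragraph mode would return the whole file as one record, and splitting on the "=" lines would be the safer route.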

 
LVL 1

Accepted Solution

by:
chapatti earned 200 total points
ID: 8121611
Hi stevecamp,

Try this:

print "Content-type: text/html\n\n";

# Read the whole log; each element of @rows keeps its own trailing newline.
open (DATABASE, "wb_log_test") or die "Cannot open wb_log_test: $!";
@rows = <DATABASE>;
close(DATABASE);

# Join without adding extra newlines, then split into records on runs of "=".
$long_line = join("", @rows);
@records = split(/=+/, $long_line);

# print is used instead of printf so that a "%" in the log data is not treated as a format code.
print("<table border=\"1\" cellpadding=\"2\" cellspacing=\"0\" width=\"800\">\n");
print("<tr>\n");
print("<td><font face=\"arial\" size=\"1\">&nbsp;</font></td>\n");
print("<td><font face=\"arial\" size=\"1\"><b>Date/Time of post attempt</b></font></td>\n");
print("<td><font face=\"arial\" size=\"1\"><b>IP Address</b></font></td>\n");
print("<td><font face=\"arial\" size=\"1\"><b>Browser</b></font></td>\n");
print("<td><font face=\"arial\" size=\"1\"><b>Error Message</b></font></td>\n");
print("</tr>\n");

foreach (@records){
     # Skip the empty chunks left between separator lines.
     if(/\w/){
          # Peel off one field at a time by splitting on the next label.
          ($hd, $rest1) = split(/Date\/Time of post attempt:/, $_);
          ($dt, $rest2) = split(/IP Address:/, $rest1);
          ($ip, $rest3) = split(/Browser:/, $rest2);
          ($br, $er) = split(/Error Message:/, $rest3);
          print("<tr>\n");
          print("<td><font face=\"arial\" size=\"1\">$hd</font></td>\n");
          print("<td><font face=\"arial\" size=\"1\">$dt</font></td>\n");
          print("<td><font face=\"arial\" size=\"1\">$ip</font></td>\n");
          print("<td><font face=\"arial\" size=\"1\">$br</font></td>\n");
          print("<td><font face=\"arial\" size=\"1\">$er</font></td>\n");
          print("</tr>\n");
     }
}
print("</table>\n");

BTW, ozo's solution might have the same result, have you tried it?

TTFN,

Tanja.
 

Author Comment

by:stevecamp
ID: 8135546
Tanja, that worked really well, thank you. Thanks to all of you who responded.

Regards,

Stephen
 
LVL 1

Expert Comment

by:chapatti
ID: 8137167
anytime, thanks for the points!  :-)