Perl script to modify log file

hpchong7
Dear all,

   I need a Perl script or shell script to do the following:
Read in /var/log/*trf*.log
Parse it line by line (inside the log file each line has 5 numbers separated by spaces, e.g. 1 2 4 5 6)
If a line (say line N) contains the pattern "0 0 0 0" (four consecutive zeros),
then modify the file so that the numbers in line N-1 have the same values as the numbers in line N-2, except for the first number.
And of course, if N <= 2 then nothing is done. Thank you very much.

Dave Cross, Perl programmer, author and trainer

Commented:
What code have you got so far? What problems are you having?

Commented:
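@files = glob("/var/log/*trf*.log");
for $file (@files)
{
  open(IN, '<', $file) or do { warn "Cannot read $file: $!"; next; };
  @lines = <IN>;
  close(IN);
  for($i = 0; $i < @lines; $i++)
  {
    # a "0 0 0 0" line needs a line before it and a line after it
    if($lines[$i] =~ /0 0 0 0/ && $i && $i < $#lines)
    {
      @old = split / /, $lines[$i-1];
      @new = split / /, $lines[$i+1];
      # keep the first number of the previous line, take the rest from the next line
      $lines[$i-1] = join(' ', ($old[0], @new[1..$#new]));
    }
  }
  # write the modified lines back to the same file
  open(OUT, '>', $file) or do { warn "Cannot write $file: $!"; next; };
  print OUT @lines;
  close(OUT);
}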

Commented:
I just realized that if your question is related to homework, you are not allowed to use my solution! Maybe that's why davorg did not simply provide one. I might need to pay more attention.

Author

Commented:
Dear inq123, no, it's not my homework, of course. I am just not familiar with Perl, so I posted my question here hoping to see more code examples. Thank you.

Commented:
Yeah, I just noticed from your profile that you must have graduated. Glad it's not homework.

Anyhow, if you are looking for examples, I don't think the code above is exemplary. It's useful, but as davorg pointed out in another post, my habit of using @lines = <IN> reads the whole file into memory, so it would choke on exceptionally large log files. Point well taken, and in fact I don't do it in my own code. It's just easier to write, and in your case it also avoids having to keep a variable for the last line or manage read-ahead. The splitting and joining above could also be replaced more simply by a regular expression and string concatenation.
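
A minimal sketch of the regular-expression-and-concatenation alternative mentioned above, not code from the thread. The helper name copy_tail is hypothetical, and the sketch assumes each line starts with its first number and that the fields are separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: return $target with everything after its first
# field replaced by the corresponding part of $source.
sub copy_tail {
    my ($target, $source) = @_;
    my ($first) = $target =~ /^(\S+)/;      # first number of the target line
    my ($tail)  = $source =~ /^\S+(.*)/s;   # rest of the source line, newline included
    # leave the target untouched if either line does not have the expected shape
    return $target unless defined $first && defined $tail;
    return $first . $tail;
}

print copy_tail("7 1 2 3 4\n", "5 9 8 7 6\n");   # prints "7 9 8 7 6"
```

Because the tail is copied together with its trailing newline, no separate chomp or join step is needed.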

Commented:
Because your trigger signal comes _after_ the lines that need to be modified, you need an approach that can effectively look backwards rather than process the file strictly line by line. Reading the whole file into memory works if the file is not too large. The module Tie::File works just as well and does not hold the entire file in memory (it still has to read the entire file, of course).

#!/usr/bin/env perl

use strict;
use warnings;
use Tie::File;

my @files = glob("/var/log/*trf*.log");

for my $file (@files) {
    # double quotes so that $file and $! are interpolated in the warning
    tie my @array, 'Tie::File', $file or
        do { warn "Open failed on $file: $!"; next; };
    for (my $i = 2; $i < scalar @array; ++$i) {
        next unless $array[$i] =~ /0 0 0 0/;
        my ($keep) = $array[$i-1] =~ /(\d+)/;          # keep first number
        ($array[$i-1] = $array[$i-2]) =~ s/\d+/$keep/; # copy line N-2, restore first number
    }
    untie @array;
}

This would get more interesting if the 0 0 0 0 lines were supposed to be deleted after they've served their purpose. Using splice to delete these lines would throw off the indexing, since the deletion takes effect immediately. In that case, a sufficient fix would be to process the file backwards. Just change the for loop control to:

    for ( my $i = $#array; $i > 1; --$i) {

Since most file systems optimize sequential forward reading of files, this approach may be slightly slower.
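
A minimal sketch of that backward pass with deletion, run on an in-memory array instead of a tied file so it can be tried safely. The helper name fix_and_delete is hypothetical, and the sketch assumes marker lines do not occur back to back.

```perl
use strict;
use warnings;

# Hypothetical helper working on a copy of the lines passed in.
sub fix_and_delete {
    my @lines = @_;
    # walk from the last line down to index 2, so deletions never
    # shift the indexes of lines that are still to be visited
    for (my $i = $#lines; $i > 1; --$i) {
        next unless $lines[$i] =~ /0 0 0 0/;
        my ($keep) = $lines[$i-1] =~ /(\d+)/;           # keep first number
        ($lines[$i-1] = $lines[$i-2]) =~ s/\d+/$keep/;  # copy line N-2, restore first number
        splice @lines, $i, 1;                           # drop the marker line
    }
    return @lines;
}

my @out = fix_and_delete("1 2 3 4 5", "2 6 7 8 9", "3 0 0 0 0", "4 1 1 1 1");
print "$_\n" for @out;
```

With this input the marker line is removed and the second line becomes "2 2 3 4 5". The same loop body should work on a Tie::File array, since tied arrays support splice.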

       

Author

Commented:
inq123: my file is large, up to 1 MB, so your method may not be efficient.
jmcg: I got an error locating Tie::File, so do I have to download it?

Author

Commented:
On the other hand, I only need to read the first 200 lines of the file. Apart from Tie::File, how can I modify the file on the fly?

Author

Commented:
jmcg: What if I have some new requirements:
1.) only modify the first 300 lines of the file
2.) the values are substituted not from the previous line but from the first line at which "0 0 0 0" is no longer encountered.
e.g
5 1 2 3 4
5 0 0 0 0
5 0 0 0 0
5 9 8 7 6

then "1 2 3 4" will be replaced by "9 8 7 6". How should I modify the code? Thanks.
ps: I've downloaded the Tie::File module.

Commented:
You can change the loop limit from ($i < scalar @array) to 200 or 300, or whatever limit you require. If any of the changes alter the length of a record, it will still be necessary to read and rewrite the remainder of the file. These days, a file of 1 MB is not considered "large". Nearly all systems currently in use have enough RAM that a process can readily obtain sufficient memory to process a file of this size.
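
One caveat when hard-coding the limit: with a file shorter than the limit, the loop index would run past the last line. A minimal sketch of one way to cap the loop at whichever is smaller; the helper name loop_limit is hypothetical, and a plain array stands in for the tied one.

```perl
use strict;
use warnings;

# Hypothetical helper: index one past the last line to examine,
# given the number of lines in the file and a cap such as 300.
sub loop_limit {
    my ($line_count, $max_lines) = @_;
    return $line_count < $max_lines ? $line_count : $max_lines;
}

my @array = ("5 1 2 3 4", "5 0 0 0 0", "5 9 8 7 6");   # stand-in for the tied array
my $limit = loop_limit(scalar @array, 300);
for (my $i = 2; $i < $limit; ++$i) {
    print "examining line $i\n";
}
```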

Tie::File is now considered a standard part of Perl, but earlier distributions did not include it. It can be downloaded from CPAN:

    http://search.cpan.org/author/MJD/Tie-File-0.96/

With your new requirement, finding one line containing 0 0 0 0 is not sufficient for knowing the two index values of the lines you want to modify. Now Inq managed to intuit that you really meant index values of $i-1 and $i+1 when he first responded to you, while I interpreted your question to specify index values of $i-1 and $i-2. From your example, I see that Inq may have captured your real intent.

Here's how I would modify the loop so you can have one or more lines with 0 0 0 0 and change the line _before_ the 0 0 0 0 sequence based on the line immediately following the 0 0 0 0 sequence:

    my $target_index = undef;
    for (my $i = 1; $i < 300; ++$i) {
        # stops one line early: the last line of the file is never examined as $i
        last unless defined $array[$i + 1];
        if ($array[$i] =~ /0 0 0 0/) {
            # remember the line just before the first 0 0 0 0 of a run
            $target_index = $i - 1 unless defined $target_index;
        } elsif (defined $target_index) {
            my ($keep) = $array[$target_index] =~ /(\d+)/; # keep first number on target line
            ($array[$target_index] = $array[$i]) =~ s/\d+/$keep/;
            $target_index = undef;
        }
    }

You may not like what it does if line 299 contains 0 0 0 0.
Are you quite certain this isn't homework?
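
A minimal sketch of the same run-handling idea, applied to the four-line example from the question on an in-memory array. It is not the loop above verbatim: the helper name fill_from_following and its $max_lines parameter are hypothetical, and the bounds check is done on $i itself so that the final line is also examined.

```perl
use strict;
use warnings;

# Hypothetical helper; $max_lines caps how many lines are examined.
sub fill_from_following {
    my ($max_lines, @lines) = @_;
    my $target_index;
    for (my $i = 1; $i < $max_lines && $i <= $#lines; ++$i) {
        if ($lines[$i] =~ /0 0 0 0/) {
            # remember the line just before the first marker of a run
            $target_index = $i - 1 unless defined $target_index;
        } elsif (defined $target_index) {
            # first non-marker line after the run: copy it onto the target
            my ($keep) = $lines[$target_index] =~ /(\d+)/;
            ($lines[$target_index] = $lines[$i]) =~ s/\d+/$keep/;
            $target_index = undef;
        }
    }
    return @lines;
}

my @out = fill_from_following(300, "5 1 2 3 4", "5 0 0 0 0", "5 0 0 0 0", "5 9 8 7 6");
print "$_\n" for @out;
```

Here the first line becomes "5 9 8 7 6" and the marker lines stay in place. If the examined window ends while a run of marker lines is still open (for example with a cap of 3 on this input), the target line is left unchanged, which is the situation the remark about line 299 refers to.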

Commented:
1 MB is nothing, as Perl is not that inefficient in handling its arrays, so my program would work just fine, and speed-wise it would be faster than most other methods. Then again, not reading the whole file into memory is usually a good idea. Even in that case, you don't have to use Tie or anything: just use variables to remember the last line and the line before it while you process the file line by line, and print each line with a delay of two lines.

But yet again, Tie::File is probably better.
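
A minimal sketch of that two-line-delay approach, following the original N-1 / N-2 requirement. The helper name stream_fix is hypothetical, and in-memory filehandles stand in for the real log file and its rewritten copy.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out, holding back the two most recent lines.
sub stream_fix {
    my ($in, $out) = @_;
    my @held;                       # at most two lines: N-2 and N-1
    while (my $line = <$in>) {
        if ($line =~ /0 0 0 0/ && @held == 2) {
            my ($keep) = $held[1] =~ /(\d+)/;          # first number of line N-1
            ($held[1] = $held[0]) =~ s/\d+/$keep/;     # copy line N-2, restore first number
        }
        push @held, $line;
        print {$out} shift @held if @held > 2;   # the oldest held line can no longer change
    }
    print {$out} @held;             # flush what is still held at end of input
}

my $input = "1 2 3 4 5\n2 6 7 8 9\n3 0 0 0 0\n4 1 1 1 1\n";
open my $in,  '<', \$input     or die "Cannot open input: $!";
open my $out, '>', \my $output or die "Cannot open output: $!";
stream_fix($in, $out);
close $out;
print $output;
```

For a real log file, the output would go to a temporary file that is renamed over the original once the copy has completed.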

Author

Commented:
jmcg : I will be on vacation for 2 weeks and will evaluate your answer afterwards. Thanks!

Author

Commented:
BTW, could you explain the new code? I don't quite understand it, e.g.
last unless defined $array[$i + 1]; ????
$target_index = $i - 1 unless defined $target_index; ???
so that when I come back I can have some ideas. Thanks.
ps: Once again, let me state that I am not a student.

Commented:
perldoc -f defined

will tell you about the defined operator. It returns true if its operand contains a value and returns false if it does not. A Perl variable can be undefined because it has never before been used (it gets created on the spot unless you did 'use strict;'), because it has never been assigned anything, or because it has been explicitly undefined.
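
A short illustration of defined and of the "unless defined" idiom the question asks about. The variable names mirror the loop discussed in the thread, but the values here are made up for the example.

```perl
use strict;
use warnings;

my $target_index;                        # declared but never assigned: undefined
print "not set yet\n" unless defined $target_index;

$target_index = 0;                       # 0 is false, but it is a defined value
print "set to $target_index\n" if defined $target_index;

$target_index = 5 unless defined $target_index;   # skipped: already defined
print "still $target_index\n";

$target_index = undef;                   # explicitly undefined again
my @array = ("5 1 2 3 4", "5 0 0 0 0");
print "no line at index 2\n" unless defined $array[2];   # past the end of the array
```

This is why the loop tests "defined $target_index" rather than the plain truth of $target_index: index 0 is a valid target line but counts as false.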
