hpchong7

perl script to modify log file

Dear all,

   I need a Perl script or shell script to do the following:
Read in /var/log/*trf*.log
Parse line by line (inside the log file each line has 5 numbers, each number separated by a space, e.g. 1 2 4 5 6)
If a line (say line N) contains the pattern "0 0 0 0" (four consecutive zeros),
then modify the file so that the numbers in line N-1 have the same values as the numbers in line N-2, except for the first number.
And of course, if N <= 2 then nothing will be done. Thank you very much.

Dave Cross

What code have you got so far? What problems are you having?
inq123

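@files = glob("/var/log/*trf*.log");
for (@files)
{
  open(IN, '<', $_) or do { warn "Cannot read $_: $!"; next; };
  @lines = <IN>;
  close(IN);
  for($i = 2; $i < @lines; $i++)   # the first two lines have no line N-2 to copy from
  {
    if($lines[$i] =~ /0 0 0 0/)
    {
      @old = split / /, $lines[$i-1];
      @src = split / /, $lines[$i-2];
      # keep the first number of line N-1, take the rest from line N-2
      $lines[$i-1] = join(' ', ($old[0], @src[1..$#src]));
    }
  }
  open(OUT, '>', $_) or do { warn "Cannot write $_: $!"; next; };
  print OUT @lines;   # write the modified lines back to the same file
  close(OUT);
}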
I just realized that if your question is related to homework, you are prohibited from using my solution! Maybe that's why davorg did not just provide a solution. I might need to pay more attention.
hpchong7

ASKER

Dear inq123, no, it's not my homework of course. It's just that I am not familiar with Perl, so I posted my question here and want to look at more code examples. Thank you.
Yeah, I just noticed from your profile that you must have graduated. Glad it's not homework.

Anyhow, if you are looking for examples, I don't think the code above is exemplary. It's useful, but as davorg pointed out in another post, my habit of using @lines = <IN> reads the whole file into memory, so it would choke on exceptionally large log files. Point well taken, and in fact I don't do it in my own code. It's just that it is easier to write, and in your case it also avoids needing a variable for the last line or managing read-ahead. Also, the splitting and joining above could be done more simply with a regular expression and string concatenation.
Because your trigger signal comes _after_ the lines that will need to be modified, you need an approach that can effectively look backwards rather than process the file line-by-line. Reading the whole file into memory works, if the file is not too large. The module Tie::File will work just as well, and will not read the entire file into memory (it will still have to read the entire file, of course).

#!/usr/bin/env perl

use strict;
use Tie::File;

my @files = glob( "/var/log/*trf*.log");

for( @files) {
    tie my @array, 'Tie::File', $_ or
        do { warn "Open failed on $_: $!" ; next; };
    for(my $i = 2; $i < scalar @array; ++$i) {
       next unless $array[$i] =~ /0 0 0 0/;
       my ($keep) = $array[$i-1] =~ /(\d+)/; # keep first number
       ($array[$i-1] = $array[$i-2]) =~ s/\d+/$keep/; # copy line N-2, then restore the first number
       }
    untie @array;
    }

This would get more interesting if the 0 0 0 0 lines were supposed to be deleted after they've served their purpose. Using splice to delete these lines would throw off the indexing, since it works immediately. In that case, a sufficient fix would be to process the file backwards: just change the for loop control to

    for ( my $i = $#array; $i > 1; --$i) {

Since most file systems optimize sequential forward reading of files, this approach may be slightly slower.
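
The backwards loop with splice can be sketched roughly as follows. This is only an illustration: fix_and_delete is a hypothetical helper name, and it works on an array reference, so it could be handed either a plain array or a reference to an array tied with Tie::File. Unlike the loop control above, it also visits the first two lines, so that marker lines there are deleted even though there is nothing to copy.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): apply the fix-up and then
# delete each "0 0 0 0" marker line.
sub fix_and_delete {
    my ($lines) = @_;
    # Walk from the end so splice never shifts an index still to be visited.
    for (my $i = $#$lines; $i >= 0; --$i) {
        next unless $lines->[$i] =~ /0 0 0 0/;
        if ($i > 1) {
            my ($keep) = $lines->[$i-1] =~ /(\d+)/;             # first number of line N-1
            ($lines->[$i-1] = $lines->[$i-2]) =~ s/\d+/$keep/;  # rest copied from line N-2
        }
        splice @$lines, $i, 1;   # remove the marker line itself
    }
    return $lines;
}

my @demo = ("1 2 3 4 5", "6 7 8 9 10", "7 0 0 0 0", "8 1 1 1 1");
fix_and_delete(\@demo);
print "$_\n" for @demo;
# prints:
# 1 2 3 4 5
# 6 2 3 4 5
# 8 1 1 1 1
```

One caveat with this sketch: when two marker lines are adjacent, the later one overwrites the earlier one before it is examined, so the earlier line is usually no longer recognized as a marker.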

       
inq123: my file is large, up to 1 MB, so your method may not be efficient.
jmcg: I get an error locating Tie::File, so do I have to download it?
On the other hand, I only need to read the first 200 lines of the file. Apart from Tie::File, how can I modify the file on the fly?
jmcg: I have some new requirements:
1.) only modify the first 300 lines of the file
2.) the values are taken not from the previous line but from the first line at which "0 0 0 0" is no longer encountered.
e.g.
5 1 2 3 4
5 0 0 0 0
5 0 0 0 0
5 9 8 7 6

then "1 2 3 4" will be replaced by "9 8 7 6". How should I modify the code? Thanks.
ps: I've downloaded the Tie::File module.
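
One possible reading of the revised requirement is sketched below, purely as an assumption-laden illustration: within the first 300 lines, the last ordinary line before a run of "0 0 0 0" lines keeps its first number and takes the remaining numbers from the first ordinary line after the run. The helper name fill_from_next and its $limit parameter are hypothetical; the array reference could point at a plain array or at one tied with Tie::File.

```perl
use strict;
use warnings;

# Hypothetical helper (an assumed interpretation of the new requirement).
sub fill_from_next {
    my ($lines, $limit) = @_;
    my $before;        # index of the most recent ordinary (non-marker) line
    my $in_run = 0;    # true while we are inside a run of marker lines
    for (my $i = 0; $i < $limit && $i < @$lines; ++$i) {
        if ($lines->[$i] =~ /0 0 0 0/) {
            $in_run = 1;
            next;
        }
        if ($in_run && defined $before) {
            my ($keep) = $lines->[$before] =~ /(\d+)/;              # first number stays
            ($lines->[$before] = $lines->[$i]) =~ s/\d+/$keep/;     # rest comes from line $i
        }
        $in_run = 0;
        $before = $i;
    }
    return $lines;
}

my @demo = ("5 1 2 3 4", "5 0 0 0 0", "5 0 0 0 0", "5 9 8 7 6");
fill_from_next(\@demo, 300);
print "$_\n" for @demo;
# prints:
# 5 9 8 7 6
# 5 0 0 0 0
# 5 0 0 0 0
# 5 9 8 7 6
```

A run of marker lines that starts the file, or that is still open when the 300-line limit is reached, is left untouched in this sketch.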
ASKER CERTIFIED SOLUTION
jmcg
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Are you quite certain this isn't homework?
1 MB is nothing, as Perl is not that inefficient in handling its arrays, so my program would work just fine, and speed-wise it will be faster than most other methods. But then again, not reading the whole file into memory is usually a good idea. Even in that case, you don't have to use Tie or anything: just use variables to remember the last line and the line before it while you process the file line by line, and print those lines with a delay of two lines.

But yet again, Tie::File is probably better.
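
The two-line delay described above might look roughly like this. It is a minimal sketch under assumptions: filter_stream is a hypothetical helper name, and the demo uses in-memory filehandles, whereas a real run would read the log, write to a temporary file, and rename that file over the original.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copy $in to $out, holding back
# the two most recent lines so line N-1 can still be rewritten when line N
# turns out to be a "0 0 0 0" marker.
sub filter_stream {
    my ($in, $out) = @_;
    my @held;   # at most two pending lines: line N-2 and line N-1
    while (my $line = <$in>) {
        if (@held == 2 && $line =~ /0 0 0 0/) {
            my ($keep) = $held[1] =~ /(\d+)/;         # first number of line N-1
            ($held[1] = $held[0]) =~ s/\d+/$keep/;    # rest taken from line N-2
        }
        push @held, $line;
        print {$out} shift @held if @held > 2;        # release the oldest line
    }
    print {$out} @held;   # flush whatever is still pending
}

my $input  = "1 2 3 4 5\n6 7 8 9 10\n7 0 0 0 0\n";
my $output = '';
open my $in,  '<', \$input  or die "Cannot open input: $!";
open my $out, '>', \$output or die "Cannot open output: $!";
filter_stream($in, $out);
close $out;
print $output;
# prints:
# 1 2 3 4 5
# 6 2 3 4 5
# 7 0 0 0 0
```

Only two lines are ever held in memory, so the file size does not matter with this approach.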
jmcg: I will be on vacation for 2 weeks and will evaluate your answer afterwards. Thanks!
BTW, could you explain the new code? I don't quite understand it, e.g.
last unless defined $array[$i + 1]; ????
$target_index = $i - 1 unless defined $target_index; ???
so that when I come back I can have some ideas. Thanks.
ps: Once again, let me state that I am not a student.
perldoc -f defined

will tell you about the defined operator. It returns true if its operand contains a value and returns false if it does not. A Perl variable can be undefined because it has never before been used (it gets created on the spot unless you did 'use strict;'), because it has never been assigned anything, or because it has been explicitly undefined.
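
A small illustration of these cases, as a sketch only: describe is a hypothetical helper, and the loop imitates the "unless defined" idiom quoted in the question rather than reproducing any code from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether its argument holds a value.
sub describe { return defined($_[0]) ? 'defined' : 'undefined' }

my $never_assigned;          # declared, but nothing assigned yet
my $zero  = 0;               # false, but still defined
my @short = (10, 20);        # only indexes 0 and 1 exist

print describe($never_assigned), "\n";   # undefined
print describe($zero), "\n";             # defined
print describe($short[5]), "\n";         # undefined: past the end of the array

# "unless defined" assigns only the first time through the loop.
my $target_index;
for my $i (3, 4, 5) {
    $target_index = $i - 1 unless defined $target_index;
}
print "$target_index\n";                 # 2

undef $target_index;                     # explicitly undefined again
print describe($target_index), "\n";     # undefined
```

Reading an array element past the end, as in $array[$i + 1] on the last line, simply yields an undefined value, which is why a test such as "last unless defined $array[$i + 1]" can be used to stop at the end of the data.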