perl script to modify log file

Dear all,

I need a Perl script or shell script to do the following:
Read in /var/log/*trf*.log
Parse it line by line (inside the log file each line has 5 numbers separated by spaces, e.g. 1 2 4 5 6)
If it encounters a line (say line N) with the pattern "0 0 0 0" (four consecutive zeros),
then modify the file so that the numbers in line N-1 have the same values as the numbers in line N-2, except for the first number.
And of course, if N <= 2 then nothing should be done. Thank you very much.

Dave Cross

What code have you got so far? What problems are you having?

inq123

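use strict;
use warnings;

my @files = glob("/var/log/*trf*.log");
for my $file (@files)
{
    open(IN, '<', $file) or do { warn "Cannot read $file: $!"; next; };
    my @lines = <IN>;    # reads the whole file into memory
    close(IN);
    for (my $i = 2; $i < @lines; $i++)
    {
        if ($lines[$i] =~ /0 0 0 0/)
        {
            # line N-1 keeps its first number and takes the rest from line N-2
            my @old = split / /, $lines[$i-1];
            my @new = split / /, $lines[$i-2];
            $lines[$i-1] = join(' ', $old[0], @new[1..$#new]);
        }
    }
    # write the modified lines back to the same file
    open(OUT, '>', $file) or do { warn "Cannot write $file: $!"; next; };
    print OUT @lines;
    close(OUT);
}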

inq123

I just realized that if your question is related to homework, you are prohibited from using my solution! Maybe that's why davorg did not just provide a solution. I might need to pay more attention.

hpchong7

ASKER

Dear inq123, no, it's not my homework of course. I am just not familiar with Perl, so I posted my question here to have a look at more code examples. Thank you.

inq123

Yeah, I just noticed from your profile that you must have graduated. Glad it's not homework.

Anyhow, if you are looking for examples, I don't think the code above is exemplary. It's useful, but as davorg pointed out in another post, my habit of using @lines = <IN> reads the whole file into memory, so it would choke on exceptionally large log files. Point well taken, and in fact I don't even do it in my own code. It's just easier to write, and in your case it also saves having to keep a variable for the last line or manage read-ahead. The splitting and joining above could also be done more simply with a regular expression and string concatenation.

jmcg

Because your trigger signal comes _after_ the lines that will need to be modified, you need an approach that can effectively look backwards rather than process the file line-by-line. Reading the whole file into memory works, if the file is not too large. The module Tie::File will work just as well, and will not read the entire file into memory (it will still have to read the entire file, of course).

#!/usr/bin/env perl

use strict;
use warnings;
use Tie::File;

my @files = glob("/var/log/*trf*.log");

for my $file (@files) {
    # each element of @array is one line of the file; assignments are written back
    tie my @array, 'Tie::File', $file or
        do { warn "Open failed on $file: $!"; next; };
    for (my $i = 2; $i < scalar @array; ++$i) {
        next unless $array[$i] =~ /0 0 0 0/;
        my ($keep) = $array[$i-1] =~ /(\d+)/;    # keep first number
        ($array[$i-1] = $array[$i-2]) =~ s/\d+/$keep/;
    }
    untie @array;
}

This would get more interesting if the 0 0 0 0 lines were supposed to be deleted after they've served their purpose. Using splice to delete these lines would throw off the indexing, since it works immediately. In that case, a sufficient fix would be to process the file backwards, just change the for loop control to be:

for ( my $i = $#array; $i > 1; --$i) {

Since most file systems optimize sequential forward reading of files, this approach may be slightly slower.
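To make the backward pass concrete, here is a minimal sketch that runs on a plain in-memory array rather than a tied file (an assumption, just to keep it self-contained). The helper name fix_and_delete is made up for illustration; the same loop body should carry over to a Tie::File array.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the "N-1 takes its values from N-2" rule,
# then removes the marker line itself.
sub fix_and_delete {
    my ($lines) = @_;
    # Walk backwards so that splice never shifts a line we have yet to visit.
    for (my $i = $#$lines; $i > 1; --$i) {
        next unless $lines->[$i] =~ /0 0 0 0/;
        my ($keep) = $lines->[$i-1] =~ /(\d+)/;             # first number of line N-1
        ($lines->[$i-1] = $lines->[$i-2]) =~ s/\d+/$keep/;  # copy N-2, restore first number
        splice @$lines, $i, 1;                              # drop the marker line
    }
    return $lines;
}

my @demo = ("1 2 3 4 5", "6 7 8 9 10", "7 0 0 0 0", "8 1 1 1 1");
fix_and_delete(\@demo);
print "$_\n" for @demo;
```

With several marker lines in a row, a backward pass may not give the same result as the forward loop, so it is worth checking on a copy of a real log first.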

hpchong7

ASKER

inq123: my file is large, up to 1 MB, so your method may not be efficient.
jmcg: I get an error locating Tie::File, so do I have to download it?

hpchong7

ASKER

On the other hand, I only need to read the first 200 lines of the file. Apart from Tie::File, how can I modify the file on the fly?

hpchong7

ASKER

jmcg: What if I have some new requirements:
1.) only modify the first 300 lines of the file
2.) the value is substituted not from the previous line but from the first line after the "0 0 0 0" lines, i.e. the first line where "0 0 0 0" is no longer encountered.
e.g.
5 1 2 3 4
5 0 0 0 0
5 0 0 0 0
5 9 8 7 6

then "1 2 3 4" will be replaced by "9 8 7 6". How should I modify the code? Thanks.
ps: I've downloaded the Tie::File module.

ASKER CERTIFIED SOLUTION

jmcg

This solution is only available to members.

jmcg

Are you quite certain this isn't homework?

inq123

1 MB is nothing, as Perl is not that inefficient in handling its arrays, so my program would work just fine, and speed-wise it will be faster than most other methods. But then again, not reading the whole file into memory is usually a good idea. Even in that case, you don't have to use Tie or anything: just use variables to remember the last line and the line before it while you process the file line by line, and print the lines with a delay of two lines.
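A rough sketch of that two-line delay, in case it helps. The helper name filter_stream and the in-memory filehandles are only for illustration; a real run would read the log and write to a temporary file, then rename it over the original.

```perl
use strict;
use warnings;

# Hypothetical helper: copies $in to $out, holding back two lines so that
# line N-1 can still be changed when a "0 0 0 0" line N shows up.
sub filter_stream {
    my ($in, $out) = @_;
    my @held;                                    # at most the lines N-2 and N-1
    while (my $line = <$in>) {
        if (@held == 2 && $line =~ /0 0 0 0/) {
            my ($keep) = $held[1] =~ /(\d+)/;    # first number of line N-1
            ($held[1] = $held[0]) =~ s/\d+/$keep/;
        }
        push @held, $line;
        print {$out} shift @held if @held > 2;   # line N-2 can no longer change
    }
    print {$out} @held;                          # flush the last two lines
}

# Demo on in-memory filehandles (an assumption, to avoid touching real files).
my $text = "1 2 3 4 5\n6 7 8 9 10\n7 0 0 0 0\n8 1 1 1 1\n";
open(my $in,  '<', \$text)      or die "open: $!";
open(my $out, '>', \my $result) or die "open: $!";
filter_stream($in, $out);
close($out);
print $result;
```

Only two lines are ever held in memory, so the size of the log does not matter.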

But yet again, Tie::File is probably better.

hpchong7

ASKER

jmcg : I will be on vacation for 2 weeks and will evaluate your answer afterwards. Thanks!

hpchong7

ASKER

BTW, could you explain the new code? I don't quite understand, e.g.
last unless defined $array[$i + 1]; ????
$target_index = $i - 1 unless defined $target_index; ???
so that when I come back I can have some ideas. Thanks.
PS: Once again, I state here that I am not a student.

jmcg

perldoc -f defined

will tell you about the defined operator. It returns true if its operand contains a value and returns false if it does not. A Perl variable can be undefined because it has never before been used (it gets created on the spot unless you did 'use strict;'), because it has never been assigned anything, or because it has been explicitly undefined.
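A small sketch of those cases, plus the "unless defined" guard from the lines you quoted. The helper describe and the variable names are invented for this example, and the loop is only a stand-in, not the actual code from the solution.

```perl
use strict;
use warnings;

# Hypothetical helper that reports what defined() sees.
sub describe {
    my ($value) = @_;
    return defined($value) ? 'defined' : 'undefined';
}

my $never_assigned;            # declared, never given a value
my $zero  = 0;                 # false, but still defined
my $empty = '';                # false, but still defined
my @array = ('a', 'b');

print describe($never_assigned), "\n";   # undefined
print describe($zero), "\n";             # defined
print describe($empty), "\n";            # defined
print describe($array[5]), "\n";         # undefined: past the end of the array

$zero = undef;                           # explicitly undefined again
print describe($zero), "\n";             # undefined

# "unless defined" as an assign-once guard: only the first pass assigns.
my $target_index;
for my $i (3, 4, 5) {
    $target_index = $i - 1 unless defined $target_index;
}
print "$target_index\n";                 # 2
```

So a test like "last unless defined $array[$i + 1]" leaves the loop as soon as the next element does not exist, which for an array of lines means the end of the file has been reached.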