# Compare certain lines and determine the difference in time

Posted on 2004-10-22
Last Modified: 2010-03-05

I have a large file that I need to periodically grep through for certain fields, to ensure that transactions are occurring in a timely manner.

The beginning of a transaction is marked with “req  > 210 :” and the end of a transaction is marked with “** Status  :”. The key for lining up a beginning line with the correct ending line is a random number that stays the same throughout the transaction, e.g.:

[18/Aug/2004:11:45:00 -0500][24-I] req  > 210 :
[18/Aug/2004:11:45:03 -0500][24-I] ** Status  :

The key here would be 24-I. There would be other transaction lines in between these two lines. Because the transaction number may be reused, I need to match the first pair, then find the difference in seconds (3 in this case) and do X if it is greater than 1 second.

Any thoughts?
MatthewF
22 Comments

Expert Comment

use Time::Local 'timegm_nocheck';
@M{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)}=(0..11);
while( <> ){
$start{$10} = timegm_nocheck $6,$5+($7.$9),$4+($7.$8),$1,$M{$2},$3 if /\[(\d+)\/(\w+)\/(\d+):(\d+):(\d+):(\d+)\s*([+-]?)(\d\d)(\d\d)\]\[(.*?)\]\s+req\s+>\s+210\s+:/ ;
if( /\[(\d+)\/(\w+)\/(\d+):(\d+):(\d+):(\d+)\s*([+-]?)(\d\d)(\d\d)\]\[(.*?)\]\s+\*\*\s+Status\s+:/
&& $start{$10}+1 < timegm_nocheck $6,$5+($7.$9),$4+($7.$8),$1,$M{$2},$3
){
&X;
}
}

Author Comment

I'm a bit confused. Say my input file (called log.ot) looked like the lines below.

I see the sub for doing X. Where would I tail the file into two different arrays, or would I?

This would be @start array
[18/Aug/2004:11:45:00 -0500][24-I] req  > 210 :
[18/Aug/2004:11:45:01 -0500][27-I] req  > 210 :
[18/Aug/2004:11:45:02 -0500][28-I] req  > 210 :
[18/Aug/2004:11:45:03 -0500][29-I] req  > 210 :

This would be @finish array
[18/Aug/2004:11:45:03 -0500][24-I] ** Status  :
[18/Aug/2004:11:45:03 -0500][27-I] ** Status  :
[18/Aug/2004:11:45:03 -0500][28-I] ** Status  :
[18/Aug/2004:11:45:03 -0500][29-I] ** Status  :

Author Comment

I put the following in

$all_host="log.ot";
open(HOST,"<$all_host") || die "Can't open infile: $!\n";
while( <HOST> ){

and X looks good. I gather the one-second delay comes from:

&& $start{$10}+1 < timegm_nocheck $6,$5+($7.$9),$4+($7.$8),$1,$M{$2},$3
The plus one, correct?
Expert Comment

There is only a %start hash, which is filled when it sees " req  > 210 :"
and examined when it sees " ** Status  :"

You would invoke the program as:
perl script.perl < log.ot
Expert Comment

Yes, the 1 second delay is from the +1
Author Comment

Last question. If the difference is greater than one second, how could I print out the original Status line, like:

[18/Aug/2004:11:45:03 -0500][24-I] ** Status
Expert Comment

the +($7.$9) and +($7.$8) might be -($7.$9) and -($7.$8)
I forget whether the -0500 was an offset from GMT or to GMT
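
On the sign question: in timestamps of this style, -0500 normally means local time is five hours behind GMT, so subtracting the signed offset from the wall-clock time gives GMT. The sketch below assumes that convention. `to_epoch` and `%month` are hypothetical names for illustration and are not part of the script above. When both lines carry the same offset, the sign does not change the difference in seconds.

```perl
use strict;
use warnings;
use Time::Local 'timegm';

my %month;
@month{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

# Hypothetical helper: convert "18/Aug/2004:11:45:00 -0500" to epoch seconds.
# Assumes the offset is "local minus GMT", so GMT = local - offset.
sub to_epoch {
    my ($stamp) = @_;
    my ($day, $mon, $year, $h, $m, $s, $sign, $oh, $om) =
        $stamp =~ m{(\d+)/(\w+)/(\d+):(\d+):(\d+):(\d+)\s*([+-])(\d\d)(\d\d)}
        or return;
    my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
    # Treat the wall-clock fields as GMT, then remove the signed offset.
    return timegm($s, $m, $h, $day, $month{$mon}, $year) - $offset;
}

my $start  = to_epoch('18/Aug/2004:11:45:00 -0500');
my $finish = to_epoch('18/Aug/2004:11:45:03 -0500');
print "elapsed: ", $finish - $start, " seconds\n";   # elapsed: 3 seconds
```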
Expert Comment

print $_;
Expert Comment

#or just
print;
Author Comment

When I put a print in sub x:

sub x
print "time $6,$5+($7.$9),$4+($7.$8) is  out of spec\n";

The output I get is:

time 47,48+(-.00),08+(-.05) is  out of spec

I was wanting to get the initial line. Is that possible?
Expert Comment

sub x {
print "time ",/\[(.*?)\]/," is out of spec\n";
}
Author Comment

If this is too difficult let me know. What I want to do is compare a req  > 210 : line with the first Status line that has the same transaction number. There may be no corresponding Status line, and in that case do nothing.

[18/Aug/2004:11:44:50 -0500][57-I] req  > 210 :      REQ
[18/Aug/2004:11:44:50 -0500][23-I] req  > 210 :
[18/Aug/2004:11:44:51 -0500][31-I] req  > 210 :
[18/Aug/2004:11:44:53 -0500][24-I] ** Status  :.
[18/Aug/2004:11:44:56 -0500][24-I] ** Status  : '.
[18/Aug/2004:11:44:56 -0500][27-I] ** Status  : .
[18/Aug/2004:11:44:56 -0500][65-I] ** Status  : '.
[18/Aug/2004:11:44:56 -0500][24-I] ** Status  : '.
[18/Aug/2004:11:44:56 -0500][23-I] ** Status  : .
[18/Aug/2004:11:44:58 -0500][24-I] ** Status  : '
[18/Aug/2004:11:44:58 -0500][31-I] ** Status  : '.
[18/Aug/2004:11:44:58 -0500][57-I] ** Status  : '.   Correspoding Status line
[18/Aug/2004:11:44:58 -0500][27-I] ** Status  : .
[18/Aug/2004:11:45:00 -0500][23-I] req  > 210 :
[18/Aug/2004:11:45:00 -0500][27-I] req  > 210 :
[18/Aug/2004:11:45:00 -0500][57-I] ** Status  :
[18/Aug/2004:11:45:00 -0500][24-I] req  > 210 :
[18/Aug/2004:11:45:03 -0500][23-I] ** Status  :
[18/Aug/2004:11:45:05 -0500][57-I] ** Status  :
[18/Aug/2004:11:45:05 -0500][24-I] ** Status  :
[18/Aug/2004:11:45:05 -0500][65-I] ** Status  :
[18/Aug/2004:11:45:06 -0500][24-I] ** Status  :
[18/Aug/2004:11:45:06 -0500][31-I] ** Status  :
[18/Aug/2004:11:45:06 -0500][27-I] ** Status  :
[18/Aug/2004:11:45:06 -0500][23-I] ** Status  :
[18/Aug/2004:11:45:07 -0500][57-I] ** Status  :
[18/Aug/2004:11:45:08 -0500][23-I] req  > 210 : '
Expert Comment

The program I gave should do exactly that.
Are you encountering a problem?
Author Comment

When I run the script against the above input I would expect no more than 7 lines of output from sub x, because there are only 7 lines with req  > 210 :. However, I received 20 "out of spec" lines.
Author Comment

I think part of the issue is that the compare needs to stop on the first match:

[18/Aug/2004:11:44:50 -0500][23-I] req  > 210 :
[18/Aug/2004:11:44:51 -0500][23-I] ** Status  :     Stop here..........
[18/Aug/2004:11:45:03 -0500][23-I] ** Status  :
Expert Comment

Sorry, I didn't realize a second
[23-I] ** Status  :
line could occur without a second
[23-I] req  > 210 :
line

if( /\[(\d+)\/(\w+)\/(\d+):(\d+):(\d+):(\d+)\s*([+-]?)(\d\d)(\d\d)\]\[(.*?)\]\s+\*\*\s+Status\s+:/
&& $start{$10} && $start{$10}+1 < timegm_nocheck $6,$5+($7.$9),$4+($7.$8),$1,$M{$2},$3
){
&x;
delete $start{$10};
}
}
Author Comment

I still get multiple lines of output. I hope this input with comments helps:

[18/Aug/2004:11:44:50 -0500][57-I] req  > 210 :      REQ
[18/Aug/2004:11:44:50 -0500][23-I] req  > 210 :
[18/Aug/2004:11:44:51 -0500][31-I] req  > 210 :
[18/Aug/2004:11:44:53 -0500][24-I] ** Status  :.   no prior reg > 210  for 24-I ..skip
[18/Aug/2004:11:44:56 -0500][24-I] ** Status  : '.  no prior reg > 210  for 24-I ..skip
[18/Aug/2004:11:44:56 -0500][23-I] ** Status  : .   Correspoding Status line  for 23-I   ---Matched gt then 1
[18/Aug/2004:11:44:58 -0500][31-I] ** Status  : '.  Correspoding Status line    for 31 -I ----Matched gt then 1
[18/Aug/2004:11:44:58 -0500][57-I] ** Status  : '.   Correspoding Status line  for 57 -I ---Matched gt then 1
[18/Aug/2004:11:44:58 -0500][57-I] ** Status  : .    matched already,,skip

[
Author Comment

When I ran the script I got the following output from sub x. Note that the print "$start{$10}\n\n" works on the three correct matches, and the other three lines print only $_. I would have expected the lines with only $_ not to be printed.

sub x {
print "$_";
print "$start{$10}\n\n";
}

[18/Aug/2004:11:44:53 -0500][24-I] ** Status :. no prior reg > 210 for 24-I ..skip

[18/Aug/2004:11:44:56 -0500][24-I] ** Status : '. no prior reg > 210 for 24-I ..skip

[18/Aug/2004:11:44:56 -0500][23-I] ** Status : . Correspoding Status line for 23-I ---Matched gt then 1
1092811490

[18/Aug/2004:11:44:58 -0500][31-I] ** Status : '. Correspoding Status line for 31 -I ----Matched gt then 1
1092811491

[18/Aug/2004:11:44:58 -0500][57-I] ** Status : '. Correspoding Status line for 57 -I ---Matched gt then 1
1092811490

[18/Aug/2004:11:44:58 -0500][57-I] ** Status : . matched already,,skip
Expert Comment

Did you include the
$start{$10} &&
?
Author Comment

Yes, and to work around that I added this to my sub:

if  ($start{$10} > 0)
{
print "$_";
delete $start{$10};
}}
Accepted Solution

ozo earned 2000 total points
I had that in the condition before the sub

if( /\[(\d+)\/(\w+)\/(\d+):(\d+):(\d+):(\d+)\s*([+-]?)(\d\d)(\d\d)\]\[(.*?)\]\s+\*\*\s+Status\s+:/
&& $start{$10} && $start{$10}+1 < timegm_nocheck $6,$5+($7.$9),$4+($7.$8),$1,$M{$2},$3
){
Author Comment

Oops.... typo on my part... Thanks so much ozo.. !!!!
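
Pulling the thread together, the whole approach might be sketched as below. This is a hedged consolidation, not ozo's exact script: `find_slow`, `$LIMIT`, `$line_re` and the `@log` sample are hypothetical names introduced for illustration, and the sample lines are taken from the annotated input earlier in the thread with the annotations removed. Two behaviours are assumptions rather than things confirmed above: the start time is deleted on the first Status line whether or not it was late (one reading of "stop on the first match"), and the signed zone offset is subtracted to get GMT.

```perl
use strict;
use warnings;
use Time::Local 'timegm';

my %month;
@month{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);

my $LIMIT = 1;    # seconds allowed between a req line and its first Status

# Matches "[18/Aug/2004:11:45:00 -0500][24-I] <rest of line>".
my $line_re = qr{^\[(\d+)/(\w+)/(\d+):(\d+):(\d+):(\d+)\s*([+-])(\d\d)(\d\d)\]\[(.*?)\]\s+(.*)};

# Hypothetical helper: returns the Status lines that closed a transaction
# more than $LIMIT seconds after its "req  > 210 :" line.
sub find_slow {
    my @lines = @_;
    my (%start, @slow);
    for my $line (@lines) {
        my ($d, $mon, $y, $h, $m, $s, $sign, $oh, $om, $id, $rest) =
            $line =~ $line_re
            or next;    # ignore lines that do not look like log entries
        my $offset = ($oh * 3600 + $om * 60) * ($sign eq '-' ? -1 : 1);
        my $time   = timegm($s, $m, $h, $d, $month{$mon}, $y) - $offset;

        if ($rest =~ /^req\s+>\s+210\s+:/) {
            $start{$id} = $time;          # a reused ID overwrites the old start
        }
        elsif ($rest =~ /^\*\*\s+Status\s+:/ && exists $start{$id}) {
            push @slow, $line if $time - $start{$id} > $LIMIT;
            delete $start{$id};           # only the first Status counts
        }
    }
    return @slow;
}

my @log = (
    '[18/Aug/2004:11:44:50 -0500][57-I] req  > 210 :',
    '[18/Aug/2004:11:44:50 -0500][23-I] req  > 210 :',
    '[18/Aug/2004:11:44:51 -0500][31-I] req  > 210 :',
    '[18/Aug/2004:11:44:53 -0500][24-I] ** Status  :',
    '[18/Aug/2004:11:44:56 -0500][24-I] ** Status  :',
    '[18/Aug/2004:11:44:56 -0500][23-I] ** Status  :',
    '[18/Aug/2004:11:44:58 -0500][31-I] ** Status  :',
    '[18/Aug/2004:11:44:58 -0500][57-I] ** Status  :',
    '[18/Aug/2004:11:44:58 -0500][57-I] ** Status  :',
);
print "$_\n" for find_slow(@log);
```

With this sample, three Status lines are printed (23-I, 31-I and the first 57-I). The two 24-I lines have no prior req line and the second 57-I line has already been matched, so they are skipped. To run it against a real file, the `@log` list could be replaced with lines read from `<>` and chomped.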