nicky s

Scripting

I want a script that extracts all the entries between a start time and an end time from a given input file into a new text file. The sample file contains the following entries:

161.169.73.129 - - [22/Feb/2010:09:41:16 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 617 - -        - - - -
161.169.73.129 - - [22/Feb/2010:09:50:17 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1409 - -        - - - -
161.169.73.129 - - [22/Feb/2010:09:55:18 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 593 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:20:19 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1292 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:23:20 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 642 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:24:21 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1277 - -        - - - -
96.17.166.164 - - [22/Feb/2010:10:26:22 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "FirstFlowAgent" /hc/hc.htm 685 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:26:22 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 456 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:28:23 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1276 - -        - - - -
72.246.16.69 - - [22/Feb/2010:10:30:23 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "FirstFlowAgent" /hc/hc.htm 499 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:32:24 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 561 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:36:25 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1304 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:38:26 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 603 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:40:27 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1358 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:40:28 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 607 - -        - - - -
161.169.73.129 - - [22/Feb/2010:10:41:29 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1335 - -        - - - -
161.169.73.129 - - [22/Feb/2010:15:42:30 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 719 - -        - - - -
161.169.73.129 - - [22/Feb/2010:15:44:31 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1284 - -        - - - -
80.67.75.14 - - [22/Feb/2010:15:10:32 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "FirstFlowAgent" /hc/hc.htm 678 - -        - - - -
161.169.73.129 - - [22/Feb/2010:15:15:32 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 448 - -        - - - -
204.2.159.188 - - [22/Feb/2010:15:20:32 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "FirstFlowAgent" /hc/hc.htm 493 - -        - - - -
161.169.73.129 - - [22/Feb/2010:15:25:33 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1281 - -        - - - -
161.169.73.129 - - [22/Feb/2010:16:32:34 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 607 - -        - - - -
161.169.73.129 - - [22/Feb/2010:16:38:35 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1433 - -        - - - -
161.169.73.129 - - [22/Feb/2010:16:40:36 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 591 - -        - - - -
161.169.73.129 - - [22/Feb/2010:16:54:37 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1234 - -        - - - -
161.169.73.129 - - [22/Feb/2010:16:02:38 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 591 - -        - - - -
161.169.73.129 - - [22/Feb/2010:16:09:39 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1273 - -        - - - -
161.169.73.129 - - [22/Feb/2010:17:17:40 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 604 - -        - - - -
96.17.166.165 - - [22/Feb/2010:17:22:40 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "FirstFlowAgent" /hc/hc.htm 662 - -        - - - -
161.169.73.129 - - [22/Feb/2010:17:26:42 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1403 - -        - - - -
161.169.73.129 - - [22/Feb/2010:17:35:42 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 602 - -        - - - -
72.246.194.165 - - [22/Feb/2010:17:39:42 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "FirstFlowAgent" /hc/hc.htm 499 - -        - - - -
161.169.73.129 - - [22/Feb/2010:17:40:43 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1304 - -        - - - -
161.169.73.129 - - [22/Feb/2010:17:40:44 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 615 - -        - - - -
161.169.73.129 - - [22/Feb/2010:18:05:45 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1555 - -        - - - -
72.247.123.46 - - [22/Feb/2010:18:10:45 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "FirstFlowAgent" /hc/hc.htm 629 - -        - - - -
161.169.73.129 - - [22/Feb/2010:18:15:46 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 605 - -        - - - -
161.169.73.129 - - [22/Feb/2010:18:20:47 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1242 - -        - - - -

For example, I need the entries in the time interval 09:50:16 to 16:39:36.


apache.txt
simon3270

sed -n '/:09:50:16/,/:16:39:36/p' apache.txt > time_region.txt
nicky s (ASKER)

simon3270:

Can you please explain the above command?
simon3270 (ASKER CERTIFIED SOLUTION, available to members only)
nicky s (ASKER)

Hi, this logic did not work for me. If I specify a range such as /:09:50:16/,/:16:39:36/, the log must contain both the exact start time and the exact end time of the range; otherwise the logic does not work.

For example, if the start time is 09:40:33 and the end time is 16:44:33, but the access log does not contain the start time, I get no results.
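
The behaviour described here follows from how a sed address range works: printing only begins at a line that matches the first pattern, so a start time that never appears in the log selects nothing. One possible workaround, sketched below, is to compare the time field as a string in awk instead of matching it exactly. This is only a sketch: it assumes every entry is from the same day, that field 4 has the form shown in the sample log, and that times are zero-padded HH:MM:SS. The function name between_times is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print log lines whose time of day lies between
# two limits (inclusive). Assumes a single-day log in which field 4
# looks like "[22/Feb/2010:09:41:16".
between_times() {
    # $1 = start (HH:MM:SS), $2 = end (HH:MM:SS), $3 = log file
    awk -v lo="$1" -v hi="$2" '
        {
            # Characters 14-21 of field 4 are the HH:MM:SS part.
            t = substr($4, 14, 8)
            # Zero-padded times of equal length sort correctly as strings.
            if (t >= lo && t <= hi) print
        }
    ' "$3"
}
```

It could be called as between_times 09:40:33 16:44:33 apache.txt > time_region.txt. Each line is tested on its own, so neither limit has to appear in the log and the entries do not have to be in time order.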

SOLUTION (available to members only)

The above link is OK, but you need to do some manipulation of the strings to get them to work (e.g. the log file has "Feb" for February).

If you save the code below as dateap.pl and call it as follows, it prints only the lines between the specified dates. The first argument to the script is the date and time at which you want to start grabbing logs, the second is the time at which you want to stop, and the third is the name of the log file.

perl dateap.pl 22/Feb/2010:09:43:16 22/Feb/2010:10:20:11 apache.txt
161.169.73.129 - - [22/Feb/2010:09:50:17 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 1409 - -        - - - -
161.169.73.129 - - [22/Feb/2010:09:55:18 +0000] "GET /hc/hc.htm HTTP/1.1" 200 296 "-" "-" /hc/hc.htm 593 - -        - - - -

I have used the same format for the limits as is in the log file, to make the code simpler.

This code requires that you use the full day/month/year to specify the start and end points - this allows for log files which cover multiple days, but does mean that you may have to be more specific than you expect.  If you don't want to have to specify the date (e.g. if the log only contains entries from one day), you could grab the date from the log file.  You would have a shell script wrapper around the perl script, and pass the start time (e.g. "09:43:16"), the end time (e.g. "10:20:11") and the log file name as the three parameters.  The script would then do:

dat=$(head -n 1 "$3" | awk '{print substr($4,2,12)}')
perl dateap.pl "${dat}$1" "${dat}$2" "$3"

dateap.pl:

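#!/usr/bin/perl
use strict;
use warnings;
use Time::Local;

# Map lower-case month abbreviations to month numbers (1-12).
my %mon2num = qw(jan 1  feb 2  mar 3  apr 4  may 5  jun 6
                 jul 7  aug 8  sep 9  oct 10 nov 11 dec 12);

# Convert "DD/Mon/YYYY:HH:MM:SS" to epoch seconds; returns undef if malformed.
sub cal_time {
        my ($ts) = @_;
        return unless defined $ts
            && $ts =~ m{^(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2})};
        my $mon = $mon2num{ lc $2 } or return;
        return timelocal($6, $5, $4, $1, $mon - 1, $3);
}

die "usage: $0 START END [LOGFILE...]\n" unless @ARGV >= 2;
my $stim = cal_time(shift);
my $etim = cal_time(shift);
die "Limits must look like 22/Feb/2010:09:43:16\n"
        unless defined $stim && defined $etim;

while (<>) {
        chomp;
        my @inl = split;
        next unless @inl >= 4;          # skip blank or short lines
        # Field 4 is "[22/Feb/2010:09:41:16"; drop the leading "[".
        my $tm1 = cal_time( substr($inl[3], 1) );
        next unless defined $tm1;       # skip lines without a valid timestamp
        print "$_\n" if $tm1 >= $stim && $tm1 <= $etim;
}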
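
For systems without Perl, the same idea can be sketched in awk: turn each timestamp into a fixed-width numeric key (year, month, day, hour, minute, second) so that "Feb" and the other month names compare correctly, then test each line against the two limits. This is an illustrative alternative rather than a replacement for dateap.pl. It assumes English three-letter month abbreviations capitalised as in the sample log, and it ignores the time-zone offset. The function name between_stamps is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print log lines whose timestamp lies between two
# limits given as DD/Mon/YYYY:HH:MM:SS (inclusive). Works across days.
between_stamps() {
    # $1 = start, $2 = end, $3 = log file
    awk -v lo="$1" -v hi="$2" '
        # Turn "22/Feb/2010:09:41:16" into the sortable key "20100222094116".
        function key(ts,    m) {
            # Position of the month name in the list gives its number (1-12).
            m = (index("JanFebMarAprMayJunJulAugSepOctNovDec", substr(ts, 4, 3)) + 2) / 3
            return sprintf("%04d%02d%02d%02d%02d%02d",
                           substr(ts, 8, 4), m, substr(ts, 1, 2),
                           substr(ts, 13, 2), substr(ts, 16, 2), substr(ts, 19, 2))
        }
        BEGIN { lo = key(lo); hi = key(hi) }
        {
            k = key(substr($4, 2))   # drop the leading "["
            if (k >= lo && k <= hi) print
        }
    ' "$3"
}
```

It takes the same arguments as the Perl script, for example between_stamps 22/Feb/2010:09:43:16 22/Feb/2010:10:20:11 apache.txt.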
