Solved

delete line from file based on token criteria

Posted on 2005-05-17
Last Modified: 2013-12-03
I am trying to delete certain lines from my web server logs based on certain criteria.

Here is a sample line from the logs:

2005-05-17 03:59:59 GET /applications/pwrdesk/templates/images/buttons/billing.gif - 443 - 68.163.145.40 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.5;+Windows+95) - https://avc01.onceanddone.com/applications/pwrdesk/pwrdesk_socket.pl?policy_number=1356310&policy_year=2004&page=iapw_p1a.html avc01.onceanddone.com 200 851 480 0

What I need to do is delete all lines that contain .gif or .jpg, unless they contain a status code of 40x (400, 401, 402, etc.) or 50x (500, 501, 502, etc.). The status code is the 4th-from-last token (in the example, the status code is 200).

The token that will contain the .gif or .jpg is the 14th (dashes in the above example count as tokens).

Is there a way to do this?

Question by:boucherc
 

Author Comment

by:boucherc
ID: 14020977
Actually, the token that contains the .gif or .jpg is the 4th. The token that contains the status code is the 14th.
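
The corrected rule (image path in the 4th token, status code in the 14th) can be sketched as a small predicate. This is only an illustration: `should_delete` is a hypothetical name, and keeping lines that have fewer than 14 tokens is an assumption that the thread does not address.

```perl
use strict;
use warnings;

# Hypothetical helper: returns true when a log line should be deleted.
sub should_delete {
    my ($line) = @_;
    my @tokens = split / /, $line;          # tokens are separated by single spaces
    return 0 unless @tokens >= 14;          # assumption: keep lines too short to judge
    my ($path, $status) = @tokens[3, 13];   # 4th and 14th tokens (indexes 3 and 13)
    return 0 unless $path =~ /\.(?:gif|jpg)$/i;   # not a .gif/.jpg request: keep
    return 0 if $status =~ /^[45]0\d$/;           # image with a 40x/50x status: keep
    return 1;                                     # image with any other status: delete
}
```

Applied to the sample line above (a .gif request with status 200), the predicate returns true; the same line with a status of 404 would be kept.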
 
LVL 28

Expert Comment

by:FishMonger
ID: 14092119
This is an easy task if you want to use a Perl script.

I can write a quick and dirty script that has minimal error handling or, with a little more effort, I can add in the proper file locking and error handling.  However, it’s not clear if the 2 tokens that you’re interested in will always be the 4th and 14th.
 

Author Comment

by:boucherc
ID: 14093445
I'd go for the quick and dirty. And the 2 tokens would always be 4th and 14th. If a line has one specific token blank, it's substituted with a dash ("-"), as token #5 is in the above example.

 
LVL 28

Expert Comment

by:FishMonger
ID: 14095664
>> If a line has one specific token blank, it's substituted with a dash.
What delimiter separates the tokens?  From the example line I’d say it’s a space, so if there is more than a single space, would that constitute an empty token?
 

Author Comment

by:boucherc
ID: 14095722
No, the tokens are space delimited. If all tokens were blank, it would look like:
- - - - - - - - - - - - - - - - -

so, if even tokens were blank, it would look like:

1 - 3 - 5 - 7 - 9 - 11 - 13 - 15 - 17
 

Author Comment

by:boucherc
ID: 14095733
There actually shouldn't be more than a single space separating tokens.
 
LVL 28

Expert Comment

by:FishMonger
ID: 14098785
#!perl -w

use strict;
use Tie::File;

my $log_file = 'C:/Program Files/Apache Group/logs/access.log';

# This next line will tie (link) a Perl array to the log file which means,
# when you modify the array, it's actually modifying the file.
tie my @log_file, 'Tie::File', $log_file or die "Could not tie the array to the log file $log_file $!";

for my $i (0..$#log_file) {
   my @tokens = split /\s/, $log_file[$i];
   if (defined $tokens[$i] && $tokens[3] =~ /\.(gif|jpg)$/i && $tokens[13] =~ /^[45]0\d$/ ) {
      splice(@log_file, $i, 1);
   }
}
untie @log_file;
 
LVL 28

Expert Comment

by:FishMonger
ID: 14099620
I should have explained
   splice(@log_file, $i, 1);
is the line that removes the array element which in turn removes that line from the file.

Keep in mind, when modifying files that other programs have write access to, you're in a race condition and may end up losing some entries or getting a corrupted file.  That's why you would normally get a write lock on the file before making any updates.  Since this is the quick and dirty script (actually it's halfway in between), it doesn't include the file lock on the log file.

Here's the documentation on the Tie::File module which gives info on how to lock the file.
http://search.cpan.org/~mjd/Tie-File-0.96/lib/Tie/File.pm

Additional modules and info for locking the file
http://search.cpan.org/~nwclark/perl-5.8.6/ext/Fcntl/Fcntl.pm
http://search.cpan.org/~muir/File-Flock-104.111901/lib/File/Flock.pm
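
As a rough sketch of the locking step described above, the tied object's `flock` method (documented in Tie::File) can be called right after `tie`. The helper name `edit_locked` and its callback interface are hypothetical, introduced only for illustration. Note that on many systems `flock` is advisory, so it only helps if the other writers also take the lock; whether the web server does is not stated in this thread.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use Tie::File;

# Hypothetical helper: tie $path, take an exclusive lock, run $edit on the
# tied array, then release everything.
sub edit_locked {
    my ($path, $edit) = @_;
    my $tied = tie my @lines, 'Tie::File', $path
        or die "Could not tie $path: $!";
    # Lock immediately after tying, before reading or changing any record.
    $tied->flock(LOCK_EX) or die "Could not lock $path: $!";
    $edit->(\@lines);   # the callback receives a reference to the tied array
    undef $tied;        # drop the extra reference so untie is clean
    untie @lines;       # closes the file, which also releases the lock
}
```

A call such as `edit_locked($log_file, sub { my ($lines) = @_; splice @$lines, 0, 1 })` would remove the first line of the file while the lock is held.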
 
LVL 28

Expert Comment

by:FishMonger
ID: 14099691
I just noticed a slight goof on my part.
>> unless they contain the status code of 40x (400, 401, 402, etc...)

I missed 'unless' and instead thought 'and'

change
   $tokens[13] =~
to
   $tokens[13] !~
 
LVL 28

Accepted Solution

by:
FishMonger earned 400 total points
ID: 14105135
I was just going back through some of the questions I've posted in and I see I made another goof.

   defined $tokens[$i]
should be
   defined $log_file[$i]
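
Pulling the corrections from this thread together, the script might look roughly like the sketch below. Two things here are not from the thread and should be read as assumptions: the logic is wrapped in a hypothetical `prune_image_hits` function, and the loop runs backwards. The backwards loop is there because deleting element `$i` while counting upwards shifts the next line into slot `$i`, so that line would never be examined. The guard on `$tokens[13]` stands in for the `defined` check and simply skips lines with fewer than 14 tokens.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical wrapper around the thread's script with the corrections applied.
sub prune_image_hits {
    my ($log_file) = @_;
    tie my @log, 'Tie::File', $log_file
        or die "Could not tie the array to the log file $log_file: $!";

    # Walk from the last line to the first so that splice never shifts
    # a line that has not been visited yet.
    for my $i (reverse 0 .. $#log) {
        my @tokens = split /\s/, $log[$i];
        next unless defined $tokens[13];    # assumption: leave short lines alone
        # Delete .gif/.jpg requests unless the status code is 40x or 50x.
        if ($tokens[3] =~ /\.(?:gif|jpg)$/i && $tokens[13] !~ /^[45]0\d$/) {
            splice @log, $i, 1;
        }
    }
    untie @log;
}
```

It could then be called as `prune_image_hits('C:/Program Files/Apache Group/logs/access.log')`, using the path from the original script. Like the original, this version takes no file lock.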