Parse and analyse a log file with Perl


I need to read a log file in this format (from a Squid proxy):

1280628767.448 155251 TCP_MISS/503 902 GET renee DIRECT/ text/html

and compile a list, written to a file, of everyone who has gone over their quota (for example 1 GB). In the format above, 902 is the usage in bytes and "renee" is the user for that line.

The log files tend to be a bit weighty (500 MB+), so there are quite a few lines to work through.
Thank you.

jeromee Commented:
try this one-liner

perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}' /proxy/path

QuintusSmit (Author) Commented:
Hi jeromee - is the /proxy/path the path to the log file?
QuintusSmit (Author) Commented:
I get an error: Can't find string terminator "'" anywhere before EOF at -e line 1.
jeromee Commented:
Are you sure that you copied the line verbatim?
Which version of Perl do you have? (perl -v)
QuintusSmit (Author) Commented:
Thanks for the help so far.

This is the code as I use it (copied and pasted):

perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'  c:/access.log

The Perl version is 5.10.1.

I am working on a 64-bit system, if that makes a difference. I should also mention that I'm running this on a Windows version of Perl. I will try it now on Linux and see if it works.
QuintusSmit (Author) Commented:
That was the problem: it works great on a Linux server.
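
The likely cause of the error on Windows is that the Windows command prompt does not treat single quotes as quote characters, so the program text never reaches perl as one argument. One way to sidestep shell quoting on either system is to put the same logic in a script file. The following is only a sketch under assumptions, not code from the answer above: the file name quota.pl, the sub names tally_usage and report_lines, and the sample log lines (including the client addresses and URLs) are made up for illustration; the field positions 7 and 4 are taken from the one-liner.

```perl
#!/usr/bin/perl
# quota.pl (hypothetical name): the one-liner's logic in a file,
# so no shell quoting is involved. Run as: perl quota.pl c:/access.log
use strict;
use warnings;

my $LIMIT = 1_000_000_000;    # quota in bytes, as in the one-liner

# Made-up sample lines, used only when no file name is given.
my $sample = <<'END';
1280628767.448 155251 192.0.2.10 TCP_MISS/503 902 GET http://example.com/ renee DIRECT/- text/html
1280628768.101 120 192.0.2.11 TCP_MISS/200 1200000000 GET http://example.com/big bob DIRECT/- application/octet-stream
1280628769.500 30 192.0.2.10 TCP_HIT/200 98 GET http://example.com/ renee NONE/- text/html
END

# Sum bytes per user from an open filehandle.
# Assumes whitespace-separated fields, bytes at index 4 and user at index 7.
sub tally_usage {
    my ($fh) = @_;
    my %bytes_for;
    while (my $line = <$fh>) {
        my @F = split ' ', $line;
        next unless @F > 7;    # skip blank or short lines
        $bytes_for{ $F[7] } += $F[4];
    }
    return \%bytes_for;
}

# Build one report line per user, sorted by name, flagging users over the limit.
sub report_lines {
    my ($bytes_for) = @_;
    my @lines;
    for my $user (sort keys %$bytes_for) {
        my $flag = $bytes_for->{$user} > $LIMIT ? " OVER" : "";
        push @lines, "$user $bytes_for->{$user}$flag";
    }
    return @lines;
}

# Read the named file if one was given, otherwise the sample data.
my $fh;
if (@ARGV) {
    open $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
} else {
    open $fh, '<', \$sample or die "Cannot open sample data: $!";
}
print "$_\n" for report_lines(tally_usage($fh));
```

Because the file is read one line at a time, memory use depends on the number of distinct users rather than on the size of the log.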

If you keep going like this, minstrels will have to sing your praises soon.

Could you maybe give a quick overview of what I am actually doing with that line?

jeromee Commented:
perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'

For the -ane switches, see perl -h:
% perl -h

Usage: /home/Perl/bin/perl [switches] [--] [programfile] [arguments]
  -0[octal]       specify record separator (\0, if no argument)
  -a              autosplit mode with -n or -p (splits $_ into @F)
  -C              enable native wide character system interfaces
  -c              check syntax only (runs BEGIN and CHECK blocks)
  -d[:debugger]   run program under debugger
  -D[number/list] set debugging flags (argument is a bit mask or alphabets)
  -e 'command'    one line of program (several -e's allowed, omit programfile)
  -F/pattern/     split() pattern for -a switch (//'s are optional)
  -i[extension]   edit <> files in place (makes backup if extension supplied)
  -Idirectory     specify @INC/#include directory (several -I's allowed)
  -l[octal]       enable line ending processing, specifies line terminator
  -[mM][-]module  execute `use/no module...' before executing program
  -n              assume 'while (<>) { ... }' loop around program
  -p              assume loop like -n but print line also, like sed
  -P              run program through C preprocessor before compilation
  -s              enable rudimentary parsing for switches after programfile
  -S              look for programfile using PATH environment variable
  -T              enable tainting checks
  -u              dump core after parsing program
  -U              allow unsafe operations
  -v              print version, subversion (includes VERY IMPORTANT perl info)
  -V[:variable]   print configuration summary (or a single variable)
  -w              enable many useful warnings (RECOMMENDED)
  -W              enable all warnings
  -X              disable all warnings
  -x[directory]   strip off text before #!perl line and perhaps cd to directory
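
As a rough illustration of what -n and -a do, the sketch below performs the autosplit by hand on a single line. The line is made up: it follows Squid's native access.log layout as I understand it, which has a client address and a URL that the sample in the question leaves out (the address and URL here are placeholders). That would explain why the user sits at index 7 and the bytes at index 4.

```perl
use strict;
use warnings;

# Made-up line in Squid's native layout; the address and URL are placeholders.
my $line = "1280628767.448 155251 192.0.2.10 TCP_MISS/503 902 GET http://example.com/ renee DIRECT/- text/html\n";

# -n wraps the program in:            while (<>) { ... }
# -a adds, at the top of the loop:    @F = split(' ', $_);
my @F = split ' ', $line;

# Show every field with its zero-based index.
printf "F[%d] = %s\n", $_, $F[$_] for 0 .. $#F;

# Pick out the two fields the one-liner uses.
my ($bytes, $user) = @F[4, 7];
print "user=$user bytes=$bytes\n";    # prints: user=renee bytes=902
```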

For the rest:
$s{$F[7]}+=$F[4];
@F is an array that is created automatically when using -a (autosplit).
$F[7] (index 7, counting from 0) is the username and $F[4] is the number of bytes used.
%s is a hash table that I'm using to add up, for any given username, the number of bytes used.
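
To see the accumulation on its own, here is a small sketch in which made-up (user, bytes) pairs stand in for $F[7] and $F[4] on successive lines; the @records array and its values are purely illustrative.

```perl
use strict;
use warnings;

# Made-up (user, bytes) pairs standing in for $F[7] and $F[4].
my @records = (
    [ renee => 902 ],
    [ bob   => 1500 ],
    [ renee => 98 ],
);

my %s;    # same name as in the one-liner: user => running byte total
for my $rec (@records) {
    my ($user, $bytes) = @$rec;
    $s{$user} += $bytes;    # a key seen for the first time starts from 0
}

print "$_ $s{$_}\n" for sort keys %s;
# prints:
# bob 1500
# renee 1000
```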

END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}
After going through all the lines of the file (that is what END{...} is for), we want to print all the users and their associated bytes used.
sort keys %s provides the sorted list of all users, and $s{$_} is the associated bytes used.
$s{$_}>1_000_000_000 ? "OVER\n" : "\n" is equivalent to:
    if ($s{$_} > 1_000_000_000) {
        # add "OVER\n" to the line
    } else {
        # add "\n" to the line
    }
and the map statement is like a compact "foreach loop":
    foreach $_ (sort keys %s) {
        print "$_ $s{$_} ", ($s{$_} > 1_000_000_000 ? "OVER\n" : "\n");
    }
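
Since the original question asked for a file listing only the users who went over, one possible variation is to filter with grep before printing. This is a sketch under assumptions rather than part of the one-liner: the totals are made up, and the output file name over_quota.txt is hypothetical.

```perl
use strict;
use warnings;

my $LIMIT = 1_000_000_000;    # 1 GB counted as 10**9 bytes, as in the one-liner

# Made-up totals, standing in for the %s hash built while reading the log.
my %s = (
    renee => 1_200_000_000,
    bob   =>   350_000_000,
    alice => 2_000_000_001,
);

# Keep only the users above the limit, in sorted order.
my @over = grep { $s{$_} > $LIMIT } sort keys %s;

# Write "user bytes" lines to a file (hypothetical name).
my $out_file = 'over_quota.txt';
open my $out, '>', $out_file or die "Cannot write $out_file: $!";
print {$out} "$_ $s{$_}\n" for @over;
close $out or die "Cannot close $out_file: $!";
```

With these totals, the file ends up with one line for alice and one for renee; bob stays under the limit and is left out.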

I hope that's slightly clearer.

Happy Perling!
QuintusSmit (Author) Commented:
uhuh - it all makes sense :)
I guess after all that typing you really want your points.

Thank you for the help.
jeromee Commented:
Sorry, I assumed that you had some knowledge of Perl and that all you needed was for me to shed some light on the terseness of the one-liner.

In any case, I hope that I was at least able to demonstrate how powerful Perl can be.

Happy Perling!

QuintusSmit (Author) Commented:
Hey, nah, I was joking there. It actually made sense after the explanation... I just hate that you make it look so easy. I have a very basic coding background and only recently started with Perl. I didn't even know about one-liners, so this is a whole new world to me.

Thank you for the help.
jeromee Commented:
One-liners can be very powerful, and I suggest that you start collecting them like recipes in your own cookbook... then you can reuse them, combine them and adapt them to future uses.

Good luck!