asked on

Parse and analyse log file with PERL

Hi

I need to read a log file in this format (from squid proxy) :

1280628767.448 155251 192.168.100.225 TCP_MISS/503 902 GET http://googletb.skype.com/skypelist-en.xml renee DIRECT/googletb.skype.com text/html

and compile a list to a file of everyone who've gone over their qouta - for example 1GB. In the format above the 902 is the usage in bytes and "renee" is the user in that line.

The log files tend to a bit weighty - 500mb+ so there are quite a few lines to work through.
thank you

ASKER CERTIFIED SOLUTION

jeromee

membership

This solution is only available to members.

To access this solution, you must be a member of Experts Exchange.

Start Free Trial

QuintusSmit

ASKER

Hi jeromee - is the /proxy/path the path to the log file?

jeromee

correct.

QuintusSmit

ASKER

I get an error: Cant find string terminator "'" anywhere before EOF at -e line 1.

jeromee

Are you sure that you copy the line verbatim?
Which version of Perl do you have? (perl -v )

QuintusSmit

ASKER

thanx for the help so far.

this is the code as I use it: (copied and pasted)

perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}' c:/access.log

the perl version is 5.10.1

I am working on a 64 bit system if that makes a difference? Also I just thought I should mention im running this on a windows version of perl. I will try it now on linux and see if it works.

QuintusSmit

ASKER

that was the problem - works great on linux server.

if you keep going like this minstrels will have to sing your praises soon.

Could you maybe give a quick overview of what it is I am actually doing with that line?

tx

jeromee

perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'

Perl -ane: see perl -h:
% perl -h

Usage: /home/Perl/bin/perl [switches] [--] [programfile] [arguments]
-0[octal] specify record separator (\0, if no argument)
-a autosplit mode with -n or -p (splits $_ into @F)
-C enable native wide character system interfaces
-c check syntax only (runs BEGIN and CHECK blocks)
-d[:debugger] run program under debugger
-D[number/list] set debugging flags (argument is a bit mask or alphabets)
-e 'command' one line of program (several -e's allowed, omit programfile)
-F/pattern/ split() pattern for -a switch (//'s are optional)
-i[extension] edit <> files in place (makes backup if extension supplied)
-Idirectory specify @INC/#include directory (several -I's allowed)
-l[octal] enable line ending processing, specifies line terminator
-[mM][-]module execute `use/no module...' before executing program
-n assume 'while (<>) { ... }' loop around program
-p assume loop like -n but print line also, like sed
-P run program through C preprocessor before compilation
-s enable rudimentary parsing for switches after programfile
-S look for programfile using PATH environment variable
-T enable tainting checks
-u dump core after parsing program
-U allow unsafe operations
-v print version, subversion (includes VERY IMPORTANT perl info)
-V[:variable] print configuration summary (or a single Config.pm variable)
-w enable many useful warnings (RECOMMENDED)
-W enable all warnings
-X disable all warnings
-x[directory] strip off text before #!perl line and perhaps cd to directory

For the rest
$s{$F[7]}+=$F[4]; # @F is an array that's automatically created when using -a (autosplit)
the 7th place in the array is the username and the 4th is the number of bytes used
%s is a hash table and I'm using add up for any given username, the amount of bytes used

END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'
After going thru all the lines of the file (END{...})
we want to print all the users and associated bytes used
sort keys %s provides the sorted list of all users and $s{$_} is the associated bytes used
this $s{$_}>1_000_000_000 ? "OVER\n" : "\n" is equivalent to:
if( $s{$_} > 1_000_000_000 ) {
add "OVER\n" to the line
} else {
add "\n" to the line
}
and with the map statement is like a compact "foreach look"
like
foreach $_ (sort keys %s) {
print "$_ $s{$_}"....
}

I hope that's slightly clearer.

Happy Perling!

QuintusSmit

ASKER

uhuh - ofcourse..yes..now it all makes sense :)
i guess after all that typing you really want your points.

thank you for the help

jeromee

Sorry, I assumed that you had some knowledge of Perl and all you needed was for me to shed some light on the terseness of the one-liner.

In any case, I hope that I was at least able to demonstrate how powerful Perl can be.

Happy Perling!

QuintusSmit

ASKER

hey - nah I was joking there - it actually made sense after the explanation... I just hate that you make it look so easy. I have very basic coding background and only recently started with perl. I didnt even know about one liners so this is a whole new world to me.

Thank you for the help

jeromee

One-liners can be very powerful and I suggest that you start collecting them like recipes in your own cookbook... then you can reuse them, combine them and adapt them to future uses.

Good luck!