Solved

Parse and analyse log file with PERL

Posted on 2010-08-17
12
574 Views
Last Modified: 2012-05-10
Hi

I need to read a log file in this format (from squid proxy) :

1280628767.448 155251 192.168.100.225 TCP_MISS/503 902 GET http://googletb.skype.com/skypelist-en.xml renee DIRECT/googletb.skype.com text/html

and compile a list to a file  of everyone who've gone over their qouta - for example 1GB. In the format above the 902 is the usage in bytes and "renee" is the user in that line.

The log files tend to a bit weighty - 500mb+ so there are quite a few lines to work through.
thank you
0
Comment
Question by:QuintusSmit
  • 6
  • 6
12 Comments
 
LVL 10

Accepted Solution

by:
jeromee earned 500 total points
ID: 33457054
try this one-liner


perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}' /proy/path

Open in new window

0
 
LVL 1

Author Comment

by:QuintusSmit
ID: 33457306
Hi jeromee - is the /proxy/path the path to the log file?
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33457380
correct.
0
Independent Software Vendors: We Want Your Opinion

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
LVL 1

Author Comment

by:QuintusSmit
ID: 33457431
I get an error: Cant find string terminator "'" anywhere before EOF at -e line 1.
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33458406
Are you sure that you copy the line verbatim?
Which version of Perl do you have? (perl -v )
0
 
LVL 1

Author Comment

by:QuintusSmit
ID: 33458717
thanx for the help so far.

this is the code as I use it: (copied and pasted)

perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'  c:/access.log

the perl version is 5.10.1

I am working on a 64 bit system if that makes a difference? Also I just thought I should mention im running this on a windows version of perl. I will try it now on linux and see if it works.
0
 
LVL 1

Author Comment

by:QuintusSmit
ID: 33458866
that was the  problem - works great on linux server.

if you keep going like this minstrels will have to sing your praises soon.

Could you maybe give a quick overview of what it is I am actually doing with that line?

tx
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33459268
perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'

Perl -ane: see perl -h:
% perl -h

Usage: /home/Perl/bin/perl [switches] [--] [programfile] [arguments]
  -0[octal]       specify record separator (\0, if no argument)
  -a              autosplit mode with -n or -p (splits $_ into @F)
  -C              enable native wide character system interfaces
  -c              check syntax only (runs BEGIN and CHECK blocks)
  -d[:debugger]   run program under debugger
  -D[number/list] set debugging flags (argument is a bit mask or alphabets)
  -e 'command'    one line of program (several -e's allowed, omit programfile)
  -F/pattern/     split() pattern for -a switch (//'s are optional)
  -i[extension]   edit <> files in place (makes backup if extension supplied)
  -Idirectory     specify @INC/#include directory (several -I's allowed)
  -l[octal]       enable line ending processing, specifies line terminator
  -[mM][-]module  execute `use/no module...' before executing program
  -n              assume 'while (<>) { ... }' loop around program
  -p              assume loop like -n but print line also, like sed
  -P              run program through C preprocessor before compilation
  -s              enable rudimentary parsing for switches after programfile
  -S              look for programfile using PATH environment variable
  -T              enable tainting checks
  -u              dump core after parsing program
  -U              allow unsafe operations
  -v              print version, subversion (includes VERY IMPORTANT perl info)
  -V[:variable]   print configuration summary (or a single Config.pm variable)
  -w              enable many useful warnings (RECOMMENDED)
  -W              enable all warnings
  -X              disable all warnings
  -x[directory]   strip off text before #!perl line and perhaps cd to directory

For the rest
$s{$F[7]}+=$F[4]; # @F is an array that's automatically created when using -a (autosplit)
 the 7th place in the array is the username and the 4th is the number of bytes used
 %s is a hash table and I'm using add up for any given username, the amount of bytes used

 END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'
After going thru all the lines of the file (END{...})
we want to print all the users and associated bytes used
sort keys %s provides the sorted list of all users and $s{$_} is the associated bytes used
 this  $s{$_}>1_000_000_000 ? "OVER\n" : "\n" is equivalent to:
    if( $s{$_} > 1_000_000_000 ) {
       add "OVER\n" to the line
    } else {
              add "\n" to the line
    }
and with the map statement is like a compact "foreach look"
like
    foreach $_ (sort keys %s) {
     print "$_ $s{$_}"....
    }

I hope that's slightly clearer.

Happy Perling!
   
0
 
LVL 1

Author Comment

by:QuintusSmit
ID: 33459307
uhuh - ofcourse..yes..now it all makes sense :)
i guess after all that typing you really want your points.

thank you for the help
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33459342
Sorry, I assumed that you had some knowledge of Perl and all you needed was for me to shed some light on the terseness of the one-liner.

In any case, I hope that I was at least able to demonstrate how powerful Perl can be.

Happy Perling!

 
0
 
LVL 1

Author Comment

by:QuintusSmit
ID: 33478447
hey - nah I was joking there - it actually made sense after the explanation... I just hate that you make it look so easy. I have very basic coding background and only recently started with perl. I didnt even know about one liners so this is a whole new world to me.

Thank you for the help
0
 
LVL 10

Expert Comment

by:jeromee
ID: 33478827
One-liners can be very powerful and I suggest that you start collecting them like recipes in your own cookbook... then you can reuse them, combine them and adapt them to future uses.

Good luck!
0

Featured Post

Announcing the Most Valuable Experts of 2016

MVEs are more concerned with the satisfaction of those they help than with the considerable points they can earn. They are the types of people you feel privileged to call colleagues. Join us in honoring this amazing group of Experts.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

Title # Comments Views Activity
Remove Malware code from PHP file 6 98
Perl for loop for 2000 ms 7 109
remove duplicates from the csv file 13 118
Move Function in Perl Script 2 74
Many time we need to work with multiple files all together. If its windows system then we can use some GUI based editor to accomplish our task. But what if you are on putty or have only CLI(Command Line Interface) as an option to  edit your files. I…
I have been pestered over the years to produce and distribute regular data extracts, and often the request have explicitly requested the data be emailed as an Excel attachement; specifically Excel, as it appears: CSV files confuse (no Red or Green h…
Explain concepts important to validation of email addresses with regular expressions. Applies to most languages/tools that uses regular expressions. Consider email address RFCs: Look at HTML5 form input element (with type=email) regex pattern: T…

679 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question