

Parse and analyse log file with PERL

Posted on 2010-08-17
Medium Priority
Last Modified: 2012-05-10

I need to read a log file in this format (from squid proxy) :

1280628767.448 155251 TCP_MISS/503 902 GET renee DIRECT/ text/html

and compile a list, written to a file, of everyone who has gone over their quota, for example 1GB. In the format above, 902 is the usage in bytes and "renee" is the user on that line.

The log files tend to be a bit weighty (500MB+), so there are quite a few lines to work through.
thank you
Question by:QuintusSmit
LVL 10

Accepted Solution

jeromee earned 2000 total points
ID: 33457054
try this one-liner

perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}' /proxy/path



Author Comment

ID: 33457306
Hi jeromee - is the /proxy/path the path to the log file?
LVL 10

Expert Comment

ID: 33457380


Author Comment

ID: 33457431
I get an error: Can't find string terminator "'" anywhere before EOF at -e line 1.
LVL 10

Expert Comment

ID: 33458406
Are you sure that you copied the line verbatim?
Which version of Perl do you have? (perl -v )

Author Comment

ID: 33458717
thanx for the help so far.

this is the code as I use it: (copied and pasted)

perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'  c:/access.log

the perl version is 5.10.1

I am working on a 64-bit system, if that makes a difference. Also, I just thought I should mention I'm running this on a Windows version of Perl. I will try it now on Linux and see if it works.

Author Comment

ID: 33458866
that was the  problem - works great on linux server.

if you keep going like this minstrels will have to sing your praises soon.

Could you maybe give a quick overview of what it is I am actually doing with that line?

LVL 10

Expert Comment

ID: 33459268
perl -ane'$s{$F[7]}+=$F[4]; END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}'

Perl -ane: see perl -h:
% perl -h

Usage: /home/Perl/bin/perl [switches] [--] [programfile] [arguments]
  -0[octal]       specify record separator (\0, if no argument)
  -a              autosplit mode with -n or -p (splits $_ into @F)
  -C              enable native wide character system interfaces
  -c              check syntax only (runs BEGIN and CHECK blocks)
  -d[:debugger]   run program under debugger
  -D[number/list] set debugging flags (argument is a bit mask or alphabets)
  -e 'command'    one line of program (several -e's allowed, omit programfile)
  -F/pattern/     split() pattern for -a switch (//'s are optional)
  -i[extension]   edit <> files in place (makes backup if extension supplied)
  -Idirectory     specify @INC/#include directory (several -I's allowed)
  -l[octal]       enable line ending processing, specifies line terminator
  -[mM][-]module  execute `use/no module...' before executing program
  -n              assume 'while (<>) { ... }' loop around program
  -p              assume loop like -n but print line also, like sed
  -P              run program through C preprocessor before compilation
  -s              enable rudimentary parsing for switches after programfile
  -S              look for programfile using PATH environment variable
  -T              enable tainting checks
  -u              dump core after parsing program
  -U              allow unsafe operations
  -v              print version, subversion (includes VERY IMPORTANT perl info)
  -V[:variable]   print configuration summary (or a single variable)
  -w              enable many useful warnings (RECOMMENDED)
  -W              enable all warnings
  -X              disable all warnings
  -x[directory]   strip off text before #!perl line and perhaps cd to directory

For the rest:
$s{$F[7]}+=$F[4]; # @F is an array that's automatically created when using -a (autosplit)
 $F[7] (index 7, counting from 0) is the username and $F[4] is the number of bytes used
 %s is a hash table that I'm using to add up, for any given username, the number of bytes used

 END{print map{"$_ $s{$_} ". ($s{$_}>1_000_000_000 ? "OVER\n" : "\n")} sort keys %s}
After going through all the lines of the file (END{...})
we want to print all the users and their associated bytes used
sort keys %s provides the sorted list of all users and $s{$_} is the associated bytes used
 this  $s{$_}>1_000_000_000 ? "OVER\n" : "\n" is equivalent to:
    if( $s{$_} > 1_000_000_000 ) {
       add "OVER\n" to the line
    } else {
       add "\n" to the line
    }
and the map statement is like a compact "foreach loop":
    foreach $_ (sort keys %s) {
     print "$_ $s{$_}"....
    }

I hope that's slightly clearer.

Happy Perling!

Author Comment

ID: 33459307
uhuh - it all makes sense :)
i guess after all that typing you really want your points.

thank you for the help
LVL 10

Expert Comment

ID: 33459342
Sorry, I assumed that you had some knowledge of Perl and all you needed was for me to shed some light on the terseness of the one-liner.

In any case, I hope that I was at least able to demonstrate how powerful Perl can be.

Happy Perling!


Author Comment

ID: 33478447
hey - nah, I was joking there - it actually made sense after the explanation... I just hate that you make it look so easy. I have a very basic coding background and only recently started with Perl. I didn't even know about one-liners, so this is a whole new world to me.

Thank you for the help
LVL 10

Expert Comment

ID: 33478827
One-liners can be very powerful and I suggest that you start collecting them like recipes in your own cookbook... then you can reuse them, combine them and adapt them to future uses.

Good luck!


Question has a verified solution.
