How to process my website access rawlogs

Posted on 2011-05-04
Last Modified: 2012-05-11
Hi all,

I have a small personal web site which probably gets about 5 hits a day that are actually interesting, plus a lot from bots. I am actually interested in who the visitors are, i.e. their IP addresses, what they have looked at, and what they searched for. Currently I get this info from the raw logs which are mailed to me.

It is a bit of a pain to extract this kind of info from the Rawlog by eye. The rawlog looks like this:

- - [20/Apr/2011:10:08:45 +0100] "GET /articles/CrowRavenKenwardEtAlForDistribution.pdf HTTP/1.1" 200 332877 "" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv: Gecko/20110319 Firefox/3.6.16"
- - [20/Apr/2011:11:48:39 +0100] "GET /ben_kenward_cv.html HTTP/1.0" 200 8268 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp;"


The first line would be something I'm interested in, and the second not.

Can someone recommend a raw log analyser which can extract which IP addresses have looked at what, filtering out the obvious bots? It would be really great if I could somehow forward the emails I get to it. I would rather not have to configure the server to get the info out some way other than email.


Question by:amorphia78
    LVL 16

    Expert Comment

    Have you thought about using 3rd party applications like Google Analytics or AWStats? Either is probably a lot more robust than trying to filter the data yourself.

    Author Comment


    Thanks for the suggestions. Both of these solutions sound like I would have to install things on the server, which isn't realistic for me right now. I really need something that I can put the emailed logs into directly. Maybe a program installed on my local Windows machine that I can cut and paste into; being able to forward the emails to it would be really nice.


    LVL 29

    Accepted Solution

    Web Log Explorer by Exacttrend - that is one I use (locally).
    LVL 16

    Expert Comment

    For Google Analytics, you would not need to install anything on your server. All you'd need to do is register your website and insert a piece of code on each page.
    Is that an option for you?

    If not, the only other thing I can think of is creating a custom script to filter the unwanted data. If there was a way to export as a CSV then you could possibly filter out certain fields and entries... but if you're creating it from scratch, it would be a pretty hefty script to create.
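
For a log this small, a custom filter does not have to be large. The following is a minimal sketch, assuming the emailed lines follow the Apache "combined" layout (host, two placeholder fields, bracketed timestamp, quoted request, status, size, quoted referrer, quoted user agent). The names `parse_line`, `is_bot`, `human_hits` and `to_csv` are made up for illustration, and the `BOT_MARKERS` list is a rough, incomplete guess at crawler user agent substrings, not an authoritative bot list.

```python
import csv
import io
import re

# Assumed layout (Apache "combined" format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Illustrative, incomplete list of substrings found in crawler user agents.
BOT_MARKERS = ("bot", "slurp", "spider", "crawl")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def is_bot(agent):
    """Crude check: does the user agent contain a known crawler marker?"""
    agent = agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def human_hits(lines):
    """Yield parsed entries whose user agent does not look like a bot."""
    for line in lines:
        entry = parse_line(line)
        if entry and not is_bot(entry["agent"]):
            yield entry


def to_csv(entries):
    """Render entries as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["ip", "time", "path", "referer"])
    for e in entries:
        writer.writerow([e["ip"], e["time"], e["path"], e["referer"]])
    return out.getvalue()
```

With the log saved to a local file (say a hypothetical `access.log`), `print(to_csv(human_hits(open("access.log"))))` would list who looked at what. Search terms, where the browser sent them, would show up in the query string of the `referer` column.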
    LVL 32

    Expert Comment

    GA gives only a partial view of the traffic, and for a very low traffic site the GA statistics are badly skewed.
    AWStats or other log analysis tools can be installed on your PC and fed with the raw logs at your convenience.

    In this case, it may be better to get the logs via FTP than by email as FTP is relatively easy to automate.
    Alternatively, collect the data manually from the emails and feed it to the log analyzer.
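
If the emails are saved to disk as raw message files, pulling the log lines out of them can also be scripted. This is a rough sketch using Python's standard `email` package. It assumes the log arrives as plain text (in the body or as a text attachment), and it recognises log entries by a bracketed timestamp followed by a quoted request. The function name `log_lines_from_email` is hypothetical.

```python
import email
import re
from email import policy

# Assumption: a log entry contains a timestamp such as
# [20/Apr/2011:10:08:45 +0100] followed by a quoted request.
LOG_HINT = re.compile(r'\[\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}\] "')


def log_lines_from_email(raw_bytes):
    """Return the lines of a raw email message that look like log entries."""
    message = email.message_from_bytes(raw_bytes, policy=policy.default)
    found = []
    # walk() visits the message itself and every nested part.
    for part in message.walk():
        if part.get_content_type() != "text/plain":
            continue
        text = part.get_content()
        found.extend(line for line in text.splitlines() if LOG_HINT.search(line))
    return found
```

The returned lines can be written to one local file and then opened in whichever log analyzer is used.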

    Author Closing Comment

    This one works on my local rawlog files; I have tried it now. Thanks.
