
  • Status: Solved

How to process my website access rawlogs

Hi all,

I have a small personal web site which probably gets about 5 hits a day that are actually interesting, plus a lot from bots. I am actually interested in who the visitors are, i.e. their IP addresses, what they have looked at, and what they searched for. Currently I get this info from the raw logs which are mailed to me.

It is a bit of a pain to extract this kind of info from the raw log by eye. The raw log looks like this:

host109-153-168-229.range109-153.btcentralplus.com - - [20/Apr/2011:10:08:45 +0100] "GET /articles/CrowRavenKenwardEtAlForDistribution.pdf HTTP/1.1" 200 332877 "http://www.google.co.uk/url?sa=t&source=web&cd=14&ved=0CCwQFjADOAo&url=http%3A%2F%2Fwww.benkenward.com%2Farticles%2FCrowRavenKenwardEtAlForDistribution.pdf&rct=j&q=corvids%20cache%20evolution&ei=laKuTcrOLYezhAfchtDdAw&usg=AFQjCNG5EMXp2KuLaZHRYkKmi1gybfRdbA" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv: Gecko/20110319 Firefox/3.6.16"
b3091174.crawl.yahoo.net - - [20/Apr/2011:11:48:39 +0100] "GET /ben_kenward_cv.html HTTP/1.0" 200 8268 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"


The first line would be something I'm interested in, and the second not.

Can someone recommend a raw log analyser which can extract the info on which IP addresses have looked at what, filtering out the obvious bots? It would be really great if I could somehow forward the emails I get to it. I would rather not have to configure the server to get the info out some way other than email.


1 Solution
Have you thought about using third-party applications like Google Analytics or AWStats? They're probably a lot more robust than trying to filter the data yourself.

amorphia78Author Commented:

Thanks for the suggestions. Both of these solutions sound like I would have to install things on the server, which isn't realistic for me right now. I really need something that I can just put the logs I get emailed directly into. Maybe cut and paste into a program installed on my local Windows machine, or forwarding the emails would be really nice.


Web Log Explorer by Exacttrend is one I use (locally).

For Google Analytics, you would not need to install anything on your server. All you'd need to do is register your website and insert a piece of code on each page.
Is that an option for you?

If not, the only other thing I can think of is creating a custom script to filter the unwanted data. If there was a way to export as a CSV then you could possibly filter out certain fields and entries... but if you're creating it from scratch, it would be a pretty hefty script to create.
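
For what it's worth, a filter script along those lines need not be very long. Below is a minimal Python sketch, assuming the log is in the Apache "combined" format shown in the question and that search terms arrive in a `q` parameter of the referrer URL (as in the Google example). The bot keyword list and all function names are my own assumptions for illustration, not part of any existing tool, and the keyword match will miss bots that don't announce themselves.

```python
import csv
import re
from urllib.parse import urlsplit, parse_qs

# Apache "combined" log format, as in the sample lines from the question.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Assumed keyword list: crude, but catches crawlers that identify themselves.
BOT_HINTS = ("bot", "crawl", "spider", "slurp")
FIELDS = ["host", "time", "path", "search"]


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_bot(entry):
    """Guess whether an entry is a bot from its host name and user agent."""
    text = (entry["host"] + " " + entry["agent"]).lower()
    return any(hint in text for hint in BOT_HINTS)


def search_terms(referrer):
    """Extract the decoded 'q' parameter from a referrer URL, if present."""
    query = parse_qs(urlsplit(referrer).query)
    return " ".join(query.get("q", []))


def human_visits(lines):
    """Yield one row per parseable, non-bot log line."""
    for line in lines:
        entry = parse_line(line.strip())
        if entry and not is_bot(entry):
            yield {
                "host": entry["host"],
                "time": entry["time"],
                "path": entry["path"],
                "search": search_terms(entry["referrer"]),
            }


def write_csv(visits, out):
    """Write the visit rows to an open text file as CSV with a header."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(visits)
```

To use it, open the saved log file, pass the file object to `human_visits`, and hand the result to `write_csv` along with an output file opened with `newline=""`. The resulting CSV can then be sorted and filtered in a spreadsheet.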
GA gives only a partial view of the traffic, and for a very low-traffic site the GA statistics are really skewed.
AWStats or other log analysis tools can be installed on your PC and fed with the raw logs at your convenience.

In this case, it may be better to get the logs via FTP than by email as FTP is relatively easy to automate.
Alternatively, manually collect the data from the emails and feed it to the log analyzer.
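
If collecting the data by hand gets tedious, pulling the log lines out of saved emails can be scripted too. This is only a sketch using Python's standard `email` module. It assumes each email is saved as raw text (for example an .eml file), that the log sits in the plain-text body and not in an attachment, and that the mail program has not wrapped long lines. The function name and the timestamp pattern are my own choices.

```python
import re
from email import message_from_string, policy

# A log line is recognised by its "[20/Apr/2011:" style timestamp.
TIMESTAMP = re.compile(r"\[\d{2}/[A-Za-z]{3}/\d{4}:")


def log_lines_from_email(raw_email):
    """Return the lines of the plain-text body that look like log entries."""
    message = message_from_string(raw_email, policy=policy.default)
    body = message.get_body(preferencelist=("plain",))
    if body is None:
        return []
    text = body.get_content()
    return [line for line in text.splitlines() if TIMESTAMP.search(line)]
```

The returned lines can be appended to one local file, which is then fed to whichever log analyzer you settle on.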
amorphia78Author Commented:
This one works on my local raw log files; I have tried it now. Thanks.
