Solved

Analyse Apache web server access logs to troubleshoot surge in bandwidth utilization

Posted on 2010-11-20
Last Modified: 2012-08-13
Hi,

I noticed there's been a dramatic increase in network traffic on our link to our ISP
since our portal revamp, mainly outbound traffic going to the ISP/Internet.

Q1:
Am I right to say the last column of each "GET ..." entry in the Apache web server logs is the
actual number of bytes returned to the client (clients use different browsers:
Mozilla, Chrome, Safari, Firefox, iPhone, ...)?

Q2:
Is this number of bytes (the last column of each "GET ..." entry; refer to the 4 sample
GET entries extracted from our Apache access log) the actual number of bytes that
travels across the WAN link to the ISP?

Q3:
Besides the "GET ..." entries, what else should I look out for (info that
can be found in our Apache web servers' access logs, or info from elsewhere)
to find out what's chewing up the bandwidth?

Q4:
From the four sample GET entries extracted from the Apache access log below, can
someone give some analysis of the retrievals and say whether such a large amount of data
(more than 300 MB) per GET is normal?

Sample huge retrievals where the bytes returned to clients appear to exceed 250 MB
(extracted from Apache access*log):

217.47.157.114 - - [13/Nov/2010:11:29:57 +0100] "GET /vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_timeout HTTP/1.1" 200 4313 "http://www.vs.de/vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_timeout" "Mozilla/4.0" "0" "310268769"

216.255.4.90   - - [13/Nov/2010:11:28:39 +0100] "GET /vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_UNI_NOC_1 HTTP/1.1" 200 3432 "http://www.vs.de/vsPortal/scripts/menu.jsp" "Firefox/3.6.12" "0" "323454608"

112.226.250.141 - - [13/Nov/2010:11:30:05 +0100] "GET /vsPortal/scripts/menu2.jsp HTTP/1.1" 200 4322 "http://www.vs.de/vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_ECALLUP_APP" "Safari/534.7" "0" "300518115"

161.41.59.157  - - [13/Nov/2010:11:34:09 +0100] "GET /TecInc/VsUnit/Publish/vsunit/navy/733sir.ContentPar.0022.File.tmp/IPPTIMES%204th%20Issue.pdf HTTP/1.1 2041 "http://www.vs.de/TecInc/VsUnit/Publish/myunit/navy/733sir.html?accessType=N "Firefox/3.6.10" "0" "313841514"

Question by:sunhux
9 Comments
 
LVL 32

Accepted Solution

by:
shalomc earned 250 total points
Please post the access log directives from your Apache configuration, usually found in the httpd.conf file.
We need both LogFormat and CustomLog.

 

Author Comment

by:sunhux
Assuming the last column (e.g. the "310268769" figure in the example above) is the bytes
returned to the client, is this the actual bytes/traffic that passes through the WAN link to
the ISP and on to the client's browser?

Btw, do let me know if some sort of caching can be done to ease this high
traffic. I'm running Apache on RHES 4.x.

The information provided earlier is sanitized. I'll need to sanitize the httpd.conf too, but a quick
extract of the CustomLog sections from httpd.conf follows
(I can't locate a LogFormat section):

============== one of the instance's conf file =======================
#   Per-Server Logging:
      #   The home of a custom SSL log file. Use this when you want a
      #   compact non-error SSL logfile on a virtual host basis.
      CustomLog logs/ssl_request_log \
            "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
   
      <Directory "/var/www/html">
            Options -Indexes -Includes -MultiViews
            AllowOverride None
            Order allow,deny
            Allow from all

# Enable/disable the handling of HTTP/1.1 "Via:" headers.
            # ("Full" adds the server version; "Block" removes all outgoing Via: headers)
            # Set to one of: Off | On | Full | Block
            #
            #ProxyVia On
            
            #
            # To enable a cache of proxied content, uncomment the following lines.
            # See http://httpd.apache.org/docs-2.0/mod/mod_cache.html for more details.
            #
            #<IfModule mod_disk_cache.c>
            #   CacheEnable disk /
            #   CacheRoot "/var/cache/mod_proxy"
            #</IfModule>
            #

        ######### CACHING ######################
        ExpiresActive On
        ExpiresDefault "access plus 0 seconds"
        ExpiresByType text/css "access plus 1 day"
        ExpiresByType text/javascript "access plus 1 day"
        ExpiresByType image/gif "access plus 1 day"
        ExpiresByType image/jpg "access plus 1 day"
        ExpiresByType image/png "access plus 1 day"
        ExpiresByType application/x-shockwave-flash "access plus 1 day"


================== another instance's .conf file ==========================
# CustomLog logs/modsec_performance.log mperformance
# Custom application access log.
# TODO You should consider creating a custom access log. It could contain
#      One custom log should be used per application but if you want

        #CustomLog VSX_logs/ssl_vsconnect_access.log vhost
         CustomLog "|/usr/sbin/rotatelogs -l /etc/httpd/VSX_logs/ssl_nsconnect_access_%Y%m%d.log 86400" vhost
#   The home of a custom SSL log file. Use this when you want a
CustomLog logs/ssl_request_log \
        #   The home of a custom SSL log file. Use this when you want a
        CustomLog logs/ssl_request_log \
#        CustomLog logs/vsconnect_access.log vhost
         CustomLog "|/usr/sbin/rotatelogs -l /etc/httpd/VSX_logs/access-vsconnect_log_%Y%m%d.log 86400" combined
#CustomLog logs/deflate_log deflate

================== another instance's vsx_ssl.conf file ==========================
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" %T %D" vhost



=============== another instance's vhost_vsxconnect.conf file =======================
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" %T %D" vhost
#LogFormat '"%r" %{outstream}n/%{instream}n (%{ratio}n%%)' deflate
 
LVL 32

Assisted Solution

by:shalomc
shalomc earned 250 total points
It looks like your log entries come from one of the logs written in the vhost format:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" %T %D" vhost

The last column, %D, is not a byte count; it is the time taken to serve the request, in microseconds.
In fact, none of the columns you have accurately measures the number of bytes on the wire: %b, the column right after the status code, counts only the response body and excludes the HTTP headers.
You need to use %O for that (bytes sent including headers, which requires mod_logio) :D

Take a look here for a detailed explanation:
http://httpd.apache.org/docs/2.2/mod/mod_log_config.html
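
As a rough illustration of reading that vhost format, here is a minimal Python sketch that splits a line into its fields and totals the %b column per URL path. The regular expression, the function names, and the sample line (including its cookie and %D values) are assumptions made up for this example, not taken from the poster's real logs, and the pattern does not handle escaped quotes inside a field.

```python
import re
from collections import Counter

# One group per directive in the "vhost" LogFormat quoted above:
# %h %l %u %t "%r" %>s %b "Referer" "User-Agent" "Cookie" %T %D
VHOST_LINE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" "(?P<cookie>[^"]*)" '
    r'(?P<seconds>\d+) (?P<micros>\d+)$'
)


def parse_vhost_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = VHOST_LINE.match(line.strip())
    if match is None:
        return None
    rec = match.groupdict()
    # %b is logged as "-" when no body bytes were sent.
    rec["body_bytes"] = 0 if rec["body_bytes"] == "-" else int(rec["body_bytes"])
    rec["seconds"] = int(rec["seconds"])   # %T, whole seconds
    rec["micros"] = int(rec["micros"])     # %D, microseconds
    parts = rec["request"].split()
    # Drop the query string so requests to the same page are grouped together.
    rec["path"] = parts[1].split("?", 1)[0] if len(parts) >= 2 else rec["request"]
    return rec


def body_bytes_by_path(lines):
    """Total the %b column (response body bytes) per URL path."""
    totals = Counter()
    for line in lines:
        rec = parse_vhost_line(line)
        if rec is not None:
            totals[rec["path"]] += rec["body_bytes"]
    return totals


# Made-up line in the vhost format, for illustration only.
SAMPLE_LINE = (
    '217.47.157.114 - - [13/Nov/2010:11:29:57 +0100] '
    '"GET /vsPortal/scripts/menu2.jsp?x=1 HTTP/1.1" 200 4313 '
    '"http://www.vs.de/" "Mozilla/4.0" "-" 0 310268'
)

for path, total in body_bytes_by_path([SAMPLE_LINE]).most_common(10):
    print(path, total)
```

Sorting the totals with `most_common` is one simple way to see which URLs account for most of the body bytes; keep in mind that %b still excludes headers.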



 
LVL 32

Expert Comment

by:shalomc
Judging by this very small sample, your Apache server is not the culprit.

Do you have a firewall or router that connects you to the Internet? Those can be configured to save traffic logs, too.
Maybe you have some rogue services in your network, or even infected computers acting as bots or worm propagators?
Such unwanted activity tends to chew up bandwidth...
 
LVL 34

Assisted Solution

by:gr8gonzo
gr8gonzo earned 250 total points
shalomc is correct regarding the format and using %O to measure the bytes sent by the web server.
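
Once %O is being logged, one way to locate the surge is to total it per hour. The Python sketch below assumes, purely for illustration, that %O has been appended as the very last field of the LogFormat (with mod_logio loaded); the function name and the sample lines are made up.

```python
from collections import defaultdict
from datetime import datetime


def bytes_out_per_hour(lines):
    """Sum the last field of each line (assumed to be %O) per clock hour."""
    totals = defaultdict(int)
    for line in lines:
        line = line.strip()
        start, end = line.find("["), line.find("]")
        last_field = line.rsplit(" ", 1)[-1]
        # Skip lines without a [timestamp] or without a numeric last field.
        if start == -1 or end <= start or not last_field.isdigit():
            continue
        stamp = datetime.strptime(line[start + 1:end], "%d/%b/%Y:%H:%M:%S %z")
        totals[stamp.strftime("%Y-%m-%d %H:00")] += int(last_field)
    return dict(totals)


# Made-up lines: the vhost format with a hypothetical trailing %O column.
SAMPLE_LINES = [
    '217.47.157.114 - - [13/Nov/2010:11:29:57 +0100] "GET /a HTTP/1.1" 200 4313 "-" "Mozilla/4.0" "-" 0 1200 4720',
    '216.255.4.90 - - [13/Nov/2010:11:58:01 +0100] "GET /b HTTP/1.1" 200 3432 "-" "Firefox/3.6.12" "-" 0 900 3800',
    '112.226.250.141 - - [13/Nov/2010:12:01:00 +0100] "GET /c HTTP/1.1" 200 10 "-" "Safari/534.7" "-" 0 50 100',
]

for hour, total in sorted(bytes_out_per_hour(SAMPLE_LINES).items()):
    print(hour, total)
```

Comparing these hourly totals with the ISP link graphs for the same hours should show whether Apache accounts for the extra outbound traffic.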

If you have an unusually huge amount of bandwidth being taken up, there may be a correlation with a program sucking up CPU time, too. Try running the "top" command and pay attention to what the top CPU-using processes are. See if any of them are suspicious or unfamiliar.

The netstat tool will show you the currently-open socket connections. If you have dozens and dozens of pages of socket connections, you may have something like a torrent client or some download site running without your knowledge.

You can also use the iptables firewall to monitor bandwidth:
http://www.linux.com/learn/tutorials/305767-bandwidth-monitoring-with-iptables

You're going to have to try a few things to narrow down the possibilities, unfortunately. There's no magic bullet here.
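
For the netstat check, paging through the raw listing gets tedious, so here is a small Python sketch that counts TCP connections per remote address from `netstat -tn` style text. The function name and the sample output (including the 10.0.0.5 local address) are made up for illustration.

```python
from collections import Counter


def count_remote_hosts(netstat_output):
    """Count TCP connections per foreign address in `netstat -tn` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Connection rows: proto, recv-q, send-q, local addr, foreign addr, state.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # Strip the port, keeping only the remote host part.
        remote_host = fields[4].rsplit(":", 1)[0]
        counts[remote_host] += 1
    return counts


# Made-up sample in the usual Linux netstat layout.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:80             217.47.157.114:51234    ESTABLISHED
tcp        0      0 10.0.0.5:80             217.47.157.114:51240    ESTABLISHED
tcp        0      0 10.0.0.5:80             216.255.4.90:40022      TIME_WAIT
"""

for host, connections in count_remote_hosts(SAMPLE).most_common():
    print(host, connections)
```

On a live server the text could come from `subprocess.run(["netstat", "-tn"], capture_output=True, text=True).stdout`; a remote address with an unusually high count is a reasonable place to start looking.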
 
LVL 32

Expert Comment

by:shalomc
I guess that they have a datacenter with many servers, and the Apache server was the immediate suspect for the bandwidth surge.

Monitor your network and you are likely to find the real cause.
 

Author Comment

by:sunhux

Yes, the Apache web servers are the suspects because only the web servers send traffic out
to the Internet. The app and DB servers don't send data out. The netstat and tcpdump tools on the
web servers did not help, so I'm looking at the Apache logs, and I think that's the right place to look.

I still need an answer to the following question:
Assuming the last column (e.g. the "310268769" figure in the example above) is the bytes
returned to the client, is this the actual bytes/traffic that passes through the WAN link to
the ISP and on to the client's browser?
 
LVL 34

Assisted Solution

by:gr8gonzo
gr8gonzo earned 250 total points
The last column is, per shalomc's comment:

"the last column, %D , is not the actual number of bytes, but the time taken to serve the request, in microseconds."
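
To put the sample figures in perspective, here is a small Python sketch of the unit conversion, taking the reading above that the last column is %D. The helper name is only illustrative; the numbers are the last-column and %b values from the first three sample lines in the question.

```python
def micros_to_seconds(micros):
    """Convert an Apache %D value (microseconds) to seconds."""
    return micros / 1_000_000


# Last-column values from the first three sample log lines, read as %D.
sample_micros = [310268769, 323454608, 300518115]
# The %b column (response body bytes) from the same three lines.
sample_body_bytes = [4313, 3432, 4322]

for micros, size in zip(sample_micros, sample_body_bytes):
    print(f"{micros_to_seconds(micros):.1f} s to serve, {size} body bytes")
```

Read this way, each of those requests took roughly five minutes to serve and returned only a few kilobytes of body, not 300 MB.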
 

Author Closing Comment

by:sunhux
ok thanks