
Analyse Apache web server access logs to troubleshoot surge in bandwidth utilization

Hi


I noticed a dramatic increase in network traffic on our link to our ISP
after our portal revamp, mainly outbound traffic going to the ISP/Internet.

Q1:
Am I right to say that the last column of each "GET ..." entry in the Apache web server logs is the
actual number of bytes returned to the client (clients use various browsers:
Mozilla, Chrome, Safari, Firefox, iPhone, ...)?

Q2:
Is this number of bytes (the last column of each "GET ..." entry; refer to the 4 sample
GET entries extracted from our Apache access log) the actual number of bytes that
travels across the WAN link to the ISP?

Q3:
Besides the "GET ..." entries, what else should I look out for (info that
can be found in our Apache web servers' access logs, or info from elsewhere)
to find out what's chewing up the bandwidth?

Q4:
From the four sample GET entries extracted from the Apache access log below, can
someone give some analysis of the retrievals and say whether such a large amount of data
(more than 300 MB) per GET is normal?

Sample huge retrievals, with more than 250 MB returned to the client
(extracted from Apache access*log):

217.47.157.114 - - [13/Nov/2010:11:29:57 +0100] "GET /vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_timeout HTTP/1.1" 200 4313 "http://www.vs.de/vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_timeout" "Mozilla/4.0" "0" "310268769"

216.255.4.90   - - [13/Nov/2010:11:28:39 +0100] "GET /vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_UNI_NOC_1 HTTP/1.1 200 3432 "http://www.vs.de/vsPortal/scripts/menu.jsp" "Firefox/3.6.12" "0" "323454608"

112.226.250.141 - - [13/Nov/2010:11:30:05+0100] "GET /vsPortal/scripts/menu2.jsp HTTP/1.1" 200 4322 "http://www.vs.de/vsPortal/appmanager/nsp/default?_nfpb=true&_pageLabel=vsPortal_ECALLUP_APP" "Safari/534.7" "0" "300518115"

161.41.59.157  - - [13/Nov/2010:11:34:09 +0100] "GET /TecInc/VsUnit/Publish/vsunit/navy/733sir.ContentPar.0022.File.tmp/IPPTIMES%204th%20Issue.pdf HTTP/1.1 2041 "http://www.vs.de/TecInc/VsUnit/Publish/myunit/navy/733sir.html?accessType=N "Firefox/3.6.10" "0" "313841514"

Asked:
sunhux
 
shalomc commented:
Please post the access log directives from your Apache configuration, usually found in the httpd.conf file.
We need both LogFormat and CustomLog.


sunhux (author) commented:
Assuming the last column (e.g. the "310268769" figure in the example above) is the number of bytes
returned to the client, is this the actual traffic that passes through the WAN link to
the ISP and on to the client's browser?

Btw, do let me know if some sort of caching can be done to ease this high traffic.
I'm running Apache on RHES 4.x.

The information provided earlier is sanitized. I'll need to sanitize httpd.conf too, but here is a quick
extract of the CustomLog sections from httpd.conf
(I can't locate the LogFormat section):

============== one of the instance's conf file =======================
#   Per-Server Logging:
      #   The home of a custom SSL log file. Use this when you want a
      #   compact non-error SSL logfile on a virtual host basis.
      CustomLog logs/ssl_request_log \
            "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"
   
      <Directory "/var/www/html">
            Options -Indexes -Includes -MultiViews
            AllowOverride None
            Order allow,deny
            Allow from all

# Enable/disable the handling of HTTP/1.1 "Via:" headers.
            # ("Full" adds the server version; "Block" removes all outgoing Via: headers)
            # Set to one of: Off | On | Full | Block
            #
            #ProxyVia On
            
            #
            # To enable a cache of proxied content, uncomment the following lines.
            # See http://httpd.apache.org/docs-2.0/mod/mod_cache.html for more details.
            #
            #<IfModule mod_disk_cache.c>
            #   CacheEnable disk /
            #   CacheRoot "/var/cache/mod_proxy"
            #</IfModule>
            #

        ######### CACHING ######################
        ExpiresActive On
        ExpiresDefault "access plus 0 seconds"
        ExpiresByType text/css "access plus 1 day"
        ExpiresByType text/javascript "access plus 1 day"
        ExpiresByType image/gif "access plus 1 day"
        ExpiresByType image/jpg "access plus 1 day"
        ExpiresByType image/png "access plus 1 day"
        ExpiresByType application/x-shockwave-flash "access plus 1 day"


================== another instance's .conf file ==========================
# CustomLog logs/modsec_performance.log mperformance
# Custom application access log.
# TODO You should consider creating a custom access log. It could contain
#      One custom log should be used per application but if you want

        #CustomLog VSX_logs/ssl_vsconnect_access.log vhost
         CustomLog "|/usr/sbin/rotatelogs -l /etc/httpd/VSX_logs/ssl_nsconnect_access_%Y%m%d.log 86400" vhost
#   The home of a custom SSL log file. Use this when you want a
CustomLog logs/ssl_request_log \
        #   The home of a custom SSL log file. Use this when you want a
        CustomLog logs/ssl_request_log \
#        CustomLog logs/vsconnect_access.log vhost
         CustomLog "|/usr/sbin/rotatelogs -l /etc/httpd/VSX_logs/access-vsconnect_log_%Y%m%d.log 86400" combined
#CustomLog logs/deflate_log deflate

================== another instance's vsx_ssl.conf file ==========================
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" %T %D" vhost



=============== another instance's vhost_vsxconnect.conf file =======================
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" %T %D" vhost
#LogFormat '"%r" %{outstream}n/%{instream}n (%{ratio}n%%)' deflate

shalomc commented:
It looks like your log entries come from one of the logs that use the vhost format:

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" \"%{Cookie}i\" %T %D" vhost

The last column, %D, is not a byte count; it is the time taken to serve the request, in microseconds.
In fact, none of the columns you log accurately measures the actual number of bytes sent: %b counts only the response body, excluding the HTTP headers.
You need to use %O for that (it requires mod_logio) :D

Take a look here for a detailed explanation:
http://httpd.apache.org/docs/2.2/mod/mod_log_config.html
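
To make the column mapping concrete, here is a minimal sketch in Python 3 that splits one line in the vhost format into named fields. It assumes the line follows the LogFormat above exactly. The sample line, the regex and the name parse_vhost_line are made up for illustration and are not taken from the logs in this thread.

```python
import re

# Field order follows the "vhost" LogFormat quoted in this thread:
# %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i" "%{Cookie}i" %T %D
QUOTED = r'"((?:[^"\\]|\\.)*)"'  # quoted field; allows backslash-escaped quotes
VHOST_RE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>(?:[^"\\]|\\.)*)" (?P<status>\d{3}) (?P<body_bytes>\d+|-) '
    + QUOTED + ' ' + QUOTED + ' ' + QUOTED +
    r' (?P<seconds>\d+) (?P<micros>\d+)$'
)


def parse_vhost_line(line):
    """Return a dict of fields, or None if the line does not match the format."""
    m = VHOST_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["status"] = int(rec["status"])
    # %b logs "-" when no body was sent
    rec["body_bytes"] = 0 if rec["body_bytes"] == "-" else int(rec["body_bytes"])
    rec["seconds"] = int(rec["seconds"])  # %T: whole seconds
    rec["micros"] = int(rec["micros"])    # %D: microseconds, NOT bytes
    return rec


# Hypothetical sample line (documentation IP range, invented values)
SAMPLE = ('192.0.2.10 - - [13/Nov/2010:11:29:57 +0100] '
          '"GET /index.html HTTP/1.1" 200 4313 '
          '"http://www.example.com/" "Mozilla/4.0" "-" 0 310268')

rec = parse_vhost_line(SAMPLE)
print(rec["body_bytes"], "body bytes (%b);", rec["micros"] / 1e6, "s to serve (%D)")
```

Read this way, the byte count is the number right after the status code, and the last number on the line is a duration.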



 
shalomc commented:
Judging by the very small sample, your Apache server is not the culprit.

Do you have a firewall or router that connects you to the Internet? Those can be configured to save traffic logs, too.
Maybe you have some rogue services in your network, or even infected computers acting as bots or worm propagators?
Such unwanted activity tends to chew up bandwidth...

gr8gonzo (consultant) commented:
shalomc is correct regarding the format and using %O to measure the bytes sent by the web server.
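
Once a byte column is being logged, a rough way to see what is chewing up the bandwidth is to total it per URL and per client. The sketch below (Python 3) assumes the size is the field right after the status code, as in the combined and vhost formats. With the current vhost format that field is %b (body only), so the totals understate the real traffic unless %O is logged in that position. The sample lines and the name top_consumers are invented for illustration.

```python
import re
from collections import Counter

# Matches the start of a combined/vhost-style line up to the size field.
LINE_RE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def top_consumers(lines, n=10):
    """Return (top paths, top client hosts) ranked by total bytes logged."""
    by_path, by_host = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not look like access-log entries
        size = 0 if m["size"] == "-" else int(m["size"])
        path = m["url"].split("?", 1)[0]  # group requests regardless of query string
        by_path[path] += size
        by_host[m["host"]] += size
    return by_path.most_common(n), by_host.most_common(n)


# Hypothetical sample lines (documentation IP range, invented sizes)
LINES = [
    '192.0.2.10 - - [13/Nov/2010:11:29:57 +0100] "GET /a.pdf HTTP/1.1" 200 5000 "-" "UA"',
    '192.0.2.11 - - [13/Nov/2010:11:30:01 +0100] "GET /a.pdf?x=1 HTTP/1.1" 200 5000 "-" "UA"',
    '192.0.2.10 - - [13/Nov/2010:11:30:05 +0100] "GET /menu.jsp HTTP/1.1" 304 - "-" "UA"',
]

paths, hosts = top_consumers(LINES)
for path, total in paths:
    print(total, path)
```

On a real log you would pass an open file object instead of LINES, for example `top_consumers(open(path_to_log))`, where path_to_log is whatever access log you want to inspect.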

If you have an unusually huge amount of bandwidth being taken up, there may be a correlation with a program sucking up CPU time, too. Try running the "top" command and pay attention to what the top CPU-using processes are. See if any of them are suspicious or unfamiliar.

The netstat tool will show you the currently-open socket connections. If you have dozens and dozens of pages of socket connections, you may have something like a torrent client or some download site running without your knowledge.
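
If the netstat output runs to many pages, counting connections per remote address makes the heavy peers stand out. Here is a small sketch in Python 3 that does this on saved text. It assumes the usual Linux `netstat -tn` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The sample output and the name connections_per_peer are made up for illustration.

```python
from collections import Counter


def connections_per_peer(netstat_output, state="ESTABLISHED"):
    """Count TCP connections in the given state per remote IP address."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a tcp/tcp6 row
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != state:
            continue
        # Foreign address is "ip:port"; split on the last colon only
        peer_ip = fields[4].rsplit(":", 1)[0]
        counts[peer_ip] += 1
    return counts


# Hypothetical sample output (documentation IP ranges)
SAMPLE_OUTPUT = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.5:80            198.51.100.7:51234      ESTABLISHED
tcp        0      0 192.0.2.5:80            198.51.100.7:51240      ESTABLISHED
tcp        0      0 192.0.2.5:80            203.0.113.9:40022       TIME_WAIT
tcp        0      0 192.0.2.5:443           203.0.113.9:40030       ESTABLISHED
"""

for ip, n in connections_per_peer(SAMPLE_OUTPUT).most_common():
    print(n, ip)
```

A peer holding far more connections than the rest is worth a closer look, though a high count on its own does not prove it is the one using the bandwidth.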

You can also use the iptables firewall to monitor bandwidth:
http://www.linux.com/learn/tutorials/305767-bandwidth-monitoring-with-iptables

Unfortunately, you're going to have to try a few things to narrow down the possibilities. There's no magic bullet here.

shalomc commented:
I guess they have a datacenter with many servers, and the Apache server was the immediate suspect for the bandwidth surge.

Monitor your network and you are likely to find the real cause.

sunhux (author) commented:

Yes, the Apache web servers are the suspects because only the web servers send traffic out
to the Internet. The app and DB servers don't send data out. The netstat and tcpdump tools on the
web servers did not help, so I looked at the Apache logs, and I think that's the right place to look.

I still need an answer to the following question:
Assuming the last column (e.g. the "310268769" figure in the example above) is the number of bytes
returned to the client, is this the actual traffic that passes through the WAN link to
the ISP and on to the client's browser?

gr8gonzo (consultant) commented:
The last column is, per shalomc's comment:

"the last column, %D , is not the actual number of bytes, but the time taken to serve the request, in microseconds."

sunhux (author) commented:
OK, thanks.