maredzki
asked on
"The connection was reset" apache on centos 6.5
Hello,
I'm running CentOS 6.5 with Plesk 12.0.18 and Apache 2.2.15 and getting sporadic "the connection was reset" errors on all sites on the server. It does not matter whether the page is PHP or static HTML; the error still appears sporadically. I've sifted through the logs under /var/log, but to no avail.
Any ideas?
Thank you,
Marek
ASKER
Gerwin,
The server really does not run anything heavy as far as CPU or RAM. Here is the top:
top - 09:12:43 up 18 days, 5:24, 1 user, load average: 0.02, 0.06, 0.02
Tasks: 136 total, 1 running, 135 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.3%sy, 0.0%ni, 94.4%id, 5.0%wa, 0.0%hi, 0.0%si, 0.3%st
Mem: 2046748k total, 1946280k used, 100468k free, 246704k buffers
Swap: 2104504k total, 900k used, 2103604k free, 1114836k cached
As you can see, the CPU is quiet. The only issue I can see from here is low available RAM, but it has been like that for years.
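A side note on the "low available RAM" reading: on Linux the buffers and cached figures are mostly page cache that the kernel can reclaim under pressure, so free + buffers + cached is usually a closer estimate of what is really available. A rough sketch of that arithmetic, assuming the old procps-style top output shown above (approx_available_kb is a made-up helper name, not a standard tool):

```shell
# Sketch only: sums "free" (from the Mem: line), "buffers" and "cached"
# out of old-style top output read on stdin. Values are in kB.
# approx_available_kb is a hypothetical helper name.
approx_available_kb() {
    awk '
        {
            # Each value such as "100468k" is followed by its label.
            for (i = 1; i < NF; i++) {
                label = $(i + 1)
                sub(/,$/, "", label)                 # strip trailing comma
                if (label == "free" && $1 == "Mem:") total += $i + 0
                if (label == "buffers")              total += $i + 0
                if (label == "cached")               total += $i + 0
            }
        }
        END { print total + 0 }
    '
}

# Usage: top -b -n 1 | approx_available_kb
```

With the figures posted above this comes to 100468 + 246704 + 1114836 = 1462008k, so memory pressure looks like an unlikely cause here.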
How many connections are you serving through Apache? Maybe you have more connections than your Apache is configured for. Did you look in the Apache error and access log file(s)? Anything worth mentioning?
ASKER
Log files show nothing out of the ordinary. Here is vmstat result:
[root@u16565516 ~]# vmstat 5 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
0 0 900 83904 248424 1119788 0 0 31 67 21 18 7 4 89 1 0
0 1 900 83012 248468 1119788 0 0 4 93 182 153 9 4 85 2 0
0 0 900 83384 248492 1119924 0 0 21 37 338 281 19 9 70 2 0
How would you show the connections config on Apache?
netstat -st
(show tcp statistics)
ASKER
Attached is the result.
conns.txt
Can you try:
netstat -an | grep ":80" | wc -l
How many connections do you have?
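Note that a plain grep ":80" also matches ports such as 8080 and outgoing connections to a remote port 80, and it lumps LISTEN, ESTABLISHED and TIME_WAIT together. A slightly stricter sketch, assuming the usual column layout of netstat -ant (count_port80_states is a made-up helper name):

```shell
# Sketch only: counts TCP sockets whose LOCAL address ends in :80,
# grouped by state. Reads `netstat -ant` output on stdin.
# count_port80_states is a hypothetical helper name.
count_port80_states() {
    awk '$1 ~ /^tcp/ && $4 ~ /:80$/ { n[$6]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Usage: netstat -ant | count_port80_states
```

Comparing the ESTABLISHED count with MaxClients in httpd.conf shows whether Apache is anywhere near its configured limit.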
ASKER
netstat -an | grep ":80" | wc -l
7
518 packets collapsed in receive queue due to low socket buffer
Try to double network memory using sysctl.
ASKER
Which value?
# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled. See sysctl(8) and
# sysctl.conf(5) for more details.
# Controls IP packet forwarding
net.ipv4.ip_forward = 0
# Controls source route verification
net.ipv4.conf.default.rp_filter = 1
# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0
# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0
# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1
# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1
# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0
# Controls the default maxmimum size of a mesage queue
kernel.msgmnb = 65536
# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536
# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736
# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
net.ipv4.tcp_mem
net.ipv4.tcp_rmem
net.ipv4.tcp_wmem
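One way to read "double network memory" is to double the max field of the receive and send buffer triples. This is only a sketch under that assumption: double_max is a made-up helper, and the numbers in the usage example are illustrative, not values read from this server. Keep in mind that tcp_rmem and tcp_wmem are "min default max" in bytes, while tcp_mem is counted in pages.

```shell
# Sketch only: takes a "min default max" triple (as printed by
# `sysctl -n net.ipv4.tcp_rmem`) and prints it with max doubled.
# double_max is a hypothetical helper name.
double_max() {
    echo "$1" | awk '{ printf "%d %d %d\n", $1, $2, $3 * 2 }'
}

# Usage (as root; shown as comments so nothing is changed by accident):
#   cur=$(sysctl -n net.ipv4.tcp_rmem)
#   sysctl -w net.ipv4.tcp_rmem="$(double_max "$cur")"
# Example: double_max "4096 87380 4194304" prints "4096 87380 8388608"
```

A value set with sysctl -w is lost on reboot; to keep it, add the same key = value line to /etc/sysctl.conf and run sysctl -p.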
ASKER
Since these are not defined in the conf, is there a baseline or a way to derive the proper values?
All sysctl values are documented in the kernel sources (Documentation/networking/ip-sysctl.txt).
Do you have a particularly long keepalive timeout? Your server generates lots of connection resets (mine somehow closes them properly).
ASKER
To be frank, it is an almost out-of-the-box VPS and no settings were changed, especially TCP settings. I don't think I need a long keepalive, as all pages are served as PHP or HTML. What do you think?
Mass resets are strange. Check /etc/httpd/conf/httpd.conf:
KeepAlive On? Try flipping it.
KeepAliveTimeout 15..45? Larger is bad...
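A quick way to pull the active keepalive directives out of the config while skipping the commented explanations. This is a sketch; show_keepalive is a made-up helper, and the default path is the one mentioned above:

```shell
# Sketch only: prints the uncommented KeepAlive-related directives.
# show_keepalive is a hypothetical helper name.
show_keepalive() {
    # $1: path to the Apache config (defaults to the CentOS location)
    grep -Ei '^[[:space:]]*(KeepAlive|MaxKeepAliveRequests|KeepAliveTimeout)[[:space:]]' \
        "${1:-/etc/httpd/conf/httpd.conf}"
}

# Usage: show_keepalive
#        show_keepalive /path/to/other.conf
```

On a Plesk box, per-domain or included config files may override these, so the main httpd.conf is only the first place to look.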
ASKER
gheist, here are my settings:
KeepAlive Off
MaxKeepAliveRequests 100
KeepAliveTimeout 15
ASKER CERTIFIED SOLUTION
This solution is only available to members.
ASKER
It was at 60; I doubled it to 120. The problem is that the timeouts are so sporadic that it's hard to say whether it is working now. I will keep testing over the next 24 hours.
Take a snapshot of netstat -st now.
Then, in 24h, take another one and compare the resets vs. connection opens.
If it gets better, the problem is a slow backend application.
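The snapshot-and-compare step can be sketched roughly like this. It assumes the counter labels printed by the net-tools netstat on CentOS 6 (for example "active connections openings", "connection resets received", "resets sent"); tcp_counters is a made-up helper name.

```shell
# Sketch only: condenses `netstat -st` output (read on stdin) into one
# line with total opens and both reset counters.
# tcp_counters is a hypothetical helper name.
tcp_counters() {
    awk '
        /active connections? openings/  { active  = $1 }
        /passive connections? openings/ { passive = $1 }
        /connection resets received/    { rst_in  = $1 }
        /resets sent/                   { rst_out = $1 }
        END {
            printf "opens=%d resets_received=%d resets_sent=%d\n",
                   active + passive, rst_in, rst_out
        }
    '
}

# Usage: netstat -st | tcp_counters >> /root/tcp-snapshots.log
```

Run it once now and once a day later; the counters are cumulative since boot, so the difference between the two lines is what matters, not the absolute numbers.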
ASKER
I compared it to the file attached here and the low socket buffer counter is still at 518; from what I understand, no more packets have been collapsed since the previous netstat capture.
Most of the resets we see happen on Saturday and Sunday. I'd prefer to keep this conversation open until after the weekend. Is there anything else you suggest looking at?
OK, no problem for me
You can automate all netstat collection using e.g. mrtg.
ASKER
I haven't seen any timeouts this weekend and it still shows 518. If it happens again, I will open another question referencing this one.
Thanks for your help!
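You need to overhaul the application. Something in the backend is suspiciously slow.
You can log the request time with something like
LogFormat "%U %D" picky
(%D is the time taken to serve the request, in microseconds.)
According to W3C research, anything above 10s will send the user away.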
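Once such a log exists, the slow requests can be filtered out of it. A sketch, assuming each log line is just the URL path followed by the %D value in microseconds; slow_requests is a made-up helper, and the log file name in the usage line is hypothetical and depends on the CustomLog directive you add:

```shell
# Sketch only: prints requests slower than a threshold.
# stdin: lines of "<url-path> <microseconds>" (the "%U %D" format).
# slow_requests is a hypothetical helper name.
slow_requests() {
    # $1: threshold in seconds (defaults to 10)
    awk -v limit="${1:-10}" '
        $2 + 0 > limit * 1000000 { printf "%s %.1fs\n", $1, $2 / 1000000 }
    '
}

# Usage (hypothetical log path):
#   slow_requests 10 < /var/log/httpd/picky_log
```

If the same few URLs keep showing up, that points at the application code behind them rather than at Apache or the network.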