"The connection was reset" with Apache on CentOS 6.5

Hello,
I'm running CentOS 6.5 with Plesk 12.0.18 and Apache 2.2.15 and am getting sporadic "the connection was reset" errors on all sites on the server. It does not matter whether the page is PHP or static HTML; the error still appears sporadically. I've sifted through the logs in /var/log, but to no avail.

Any ideas?

Thank you,
Marek

Gerwin Jansen, EE MVE, Topic Advisor, commented:
Is the server very busy when this happens? Check with top for CPU-intensive processes.

How many connections are you serving through Apache? Maybe you have more connections than your Apache is configured for. Did you look in the Apache error and access log file(s)? Anything worth mentioning?
maredzki (Author) commented:
Gerwin,
The server really does not run anything heavy as far as CPU or RAM. Here is the top:

top - 09:12:43 up 18 days,  5:24,  1 user,  load average: 0.02, 0.06, 0.02
Tasks: 136 total,   1 running, 135 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.3%sy,  0.0%ni, 94.4%id,  5.0%wa,  0.0%hi,  0.0%si,  0.3%st
Mem:   2046748k total,  1946280k used,   100468k free,   246704k buffers
Swap:  2104504k total,      900k used,  2103604k free,  1114836k cached

As you can see, the CPU is quiet. The only issue I can see here is low free RAM, but it has been like that for years.
Gerwin Jansen, EE MVE, Topic Advisor, commented:
How many connections are you serving through Apache? Maybe you have more connections than your Apache is configured for. Did you look in the Apache error and access log file(s)? Anything worth mentioning?
maredzki (Author) commented:
Log files show nothing out of the ordinary. Here is vmstat result:
[root@u16565516 ~]# vmstat 5 3
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0    900  83904 248424 1119788    0    0    31    67   21   18  7  4 89  1  0
 0  1    900  83012 248468 1119788    0    0     4    93  182  153  9  4 85  2  0
 0  0    900  83384 248492 1119924    0    0    21    37  338  281 19  9 70  2  0

How would you show the connections config on Apache?
gheist commented:
netstat -st
(show tcp statistics)
maredzki (Author) commented:
Attached is the result.
conns.txt
Gerwin Jansen, EE MVE, Topic Advisor, commented:
Can you try:

netstat -an | grep ":80" | wc -l

How many connections do you have?
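
A plain grep for ":80" also matches remote ports and ports such as 8080, and it lumps every TCP state together. As a rough sketch of a stricter count, the helper below groups connections on one local port by state. The name count_states is made up for illustration, and it assumes the usual netstat -ant column layout (local address in column 4, state in column 6).

```shell
# Count connections on a given local port, grouped by TCP state.
# Reads `netstat -ant` style output on stdin; the port is the first argument.
count_states() {
    port="${1:-80}"
    awk -v port="$port" '
        # Column 4 is the local address, column 6 the connection state.
        $1 ~ /^tcp/ && $4 ~ (":" port "$") { n[$6]++ }
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Usage on the server:
#   netstat -ant | count_states 80
```

If the ESTABLISHED count gets close to the MaxClients value in httpd.conf, the connection limit is a plausible suspect; with only a handful of connections it probably is not.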
maredzki (Author) commented:
netstat -an | grep ":80" | wc -l
7
gheist commented:
518 packets collapsed in receive queue due to low socket buffer

Try to double network memory using sysctl.
maredzki (Author) commented:
Which value?

# Kernel sysctl configuration file for Red Hat Linux
#
# For binary values, 0 is disabled, 1 is enabled.  See sysctl(8) and
# sysctl.conf(5) for more details.

# Controls IP packet forwarding
net.ipv4.ip_forward = 0

# Controls source route verification
net.ipv4.conf.default.rp_filter = 1

# Do not accept source routing
net.ipv4.conf.default.accept_source_route = 0

# Controls the System Request debugging functionality of the kernel
kernel.sysrq = 0

# Controls whether core dumps will append the PID to the core filename.
# Useful for debugging multi-threaded applications.
kernel.core_uses_pid = 1

# Controls the use of TCP syncookies
net.ipv4.tcp_syncookies = 1

# Disable netfilter on bridges.
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 0
net.bridge.bridge-nf-call-arptables = 0

# Controls the default maxmimum size of a mesage queue
kernel.msgmnb = 65536

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296
gheist commented:
net.ipv4.tcp_mem
net.ipv4.tcp_rmem
net.ipv4.tcp_wmem
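
Since these keys are not set in sysctl.conf, the kernel is using its built-in defaults, which can be read back at runtime. The sketch below reads the current values and prints sysctl.conf lines with every number doubled, as one possible starting point. The function names are hypothetical, and doubling is only the rule of thumb suggested in this thread, not a tuned recommendation. Note that tcp_rmem and tcp_wmem hold min/default/max sizes in bytes, while tcp_mem is counted in memory pages.

```shell
# Double each number in a whitespace-separated sysctl value.
double_values() {
    echo "$1" | awk '{
        for (i = 1; i <= NF; i++)
            printf "%s%d", (i > 1 ? " " : ""), $i * 2
        print ""
    }'
}

# Print sysctl.conf lines with doubled values for the three keys.
# Assumes the keys are readable through `sysctl -n` on this kernel;
# keys that cannot be read are skipped.
suggest_doubled() {
    for key in net.ipv4.tcp_mem net.ipv4.tcp_rmem net.ipv4.tcp_wmem; do
        cur=$(sysctl -n "$key" 2>/dev/null) || continue
        echo "$key = $(double_values "$cur")"
    done
}

# Usage on the server:
#   suggest_doubled
```

The printed lines can be reviewed, appended to /etc/sysctl.conf, and loaded with sysctl -p.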
maredzki (Author) commented:
Since these are not defined in the conf, is there a baseline or a way to derive the proper values?
gheist commented:
All sysctl values are documented in the kernel sources (Documentation/networking/ip-sysctl.txt).

Do you have a particularly long keepalive timeout? Your server generates lots of connection resets (mine somehow closes them properly).
maredzki (Author) commented:
To be frank, it is an almost out-of-the-box VPS and no settings were changed, especially not the TCP settings. I don't think I need a long keepalive, as all pages served are PHP or HTML. What do you think?
gheist commented:
Mass resets are strange. Check /etc/httpd/conf/httpd.conf:
KeepAlive On? Try flipping it.
KeepAliveTimeout 15..45? Larger is bad...
maredzki (Author) commented:
gheist, here are my settings:

KeepAlive Off
MaxKeepAliveRequests 100
KeepAliveTimeout 15
gheist commented:
So keepalive is not used, but connections are still reset.

Can you check Timeout in the same file and double it?
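
To see the current values before changing anything, a small grep over the config file is enough. This is only a sketch: show_timeouts is a made-up name, the default path is the one mentioned earlier in this thread, and it does not follow Include directives, so settings placed in other files will not show up.

```shell
# Print the active (uncommented) timeout and keepalive directives.
# $1 = path to the Apache config; defaults to the CentOS location.
show_timeouts() {
    grep -Ei '^[[:space:]]*(Timeout|KeepAlive|MaxKeepAliveRequests|KeepAliveTimeout)[[:space:]]' \
        "${1:-/etc/httpd/conf/httpd.conf}"
}

# Usage on the server:
#   show_timeouts
```

After editing Timeout, Apache needs a reload (for example service httpd reload) before the new value takes effect.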
maredzki (Author) commented:
It was at 60; I doubled it to 120. The problem is that the timeouts are so sporadic that it's hard to say whether it is working now. I will keep testing over the next 24 hours.
gheist commented:
Take a snapshot of netstat -st now, then another one in 24 hours, and compare the resets against the connection opens.

Maybe it gets better. If so, the problem is a slow backend application.
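
One way to do that comparison, sketched under assumptions: save the netstat -st output to two files (the file names in the usage line are hypothetical) and diff a few counters. The counter labels below are the wording net-tools normally prints; if your netstat words them differently, adjust the strings.

```shell
# Pull one counter out of saved `netstat -st` output.
# $1 = snapshot file, $2 = text that follows the number on the counter's line.
tcp_counter() {
    awk -v label="$2" 'index($0, label) { print $1; exit }' "$1"
}

# Print how much each counter grew between two snapshots.
# $1 = older snapshot, $2 = newer snapshot. Missing counters count as 0.
compare_snapshots() {
    for label in "passive connection openings" \
                 "connection resets received" \
                 "resets sent"; do
        before=$(tcp_counter "$1" "$label")
        after=$(tcp_counter "$2" "$label")
        echo "$label: $(( ${after:-0} - ${before:-0} ))"
    done
}

# Usage on the server:
#   netstat -st > /root/tcp-before.txt
#   ... 24 hours later ...
#   netstat -st > /root/tcp-after.txt
#   compare_snapshots /root/tcp-before.txt /root/tcp-after.txt
```

A reset count that grows much more slowly than the connection openings after the Timeout change would support the slow-backend theory.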
maredzki (Author) commented:
I compared it to the file attached here, and the low socket buffer counter is still at 518. From what I understand, no more packets have been collapsed since the previous netstat capture.

We see most of our resets on Saturday and Sunday, so I'd prefer to keep this conversation open until after the weekend. Is there anything else you suggest looking at?
gheist commented:
OK, no problem for me.

You can automate all the netstat collection using e.g. MRTG.
maredzki (Author) commented:
I haven't seen any timeouts this weekend and it still shows 518. If it happens again, I will open another question referencing this one.

Thanks for your help!
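gheist commented:
You need to overhaul the application. Something in the backend is suspiciously slow.
You can log the time per request with something like
LogFormat "%U %D" picky
(%U is the requested URL path, %D the time taken to serve the request in microseconds.)
According to W3C research, anything above 10 s will send the user away.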
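
Once a log in the "%U %D" format exists (for example via a CustomLog line that uses the picky nickname; the log path in the usage line is hypothetical), the slow requests can be filtered out with awk. This is a minimal sketch: slow_requests is a made-up name, and it relies on %D being reported in microseconds.

```shell
# List requests slower than a threshold from a log written with
# LogFormat "%U %D": field 1 is the URL path, field 2 is microseconds.
# $1 = log file, $2 = threshold in seconds (default 10).
slow_requests() {
    threshold_s="${2:-10}"
    awk -v limit="$threshold_s" '
        $2 > limit * 1000000 { printf "%s %.1f\n", $1, $2 / 1000000 }
    ' "$1"
}

# Usage on the server:
#   slow_requests /var/log/httpd/picky_log 10
```

Each output line is a URL path followed by the time it took in seconds, which should point at the scripts that need attention.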