OK, basic background first.
I run a pretty simple web hosting business. Fairly small as they go. I have 6 servers at a datacenter in Florida. I'm currently having a problem with 3 of the 6 servers.
Two of the three servers are each dedicated to a single user. Both are running Fedora Core 8.
Here are the versions of the relevant software on each:
Server 1 (sugar):
Apache 2.2.14 (worker MPM)
PHP 5.2.11 as fcgi apache module
Server 2 (lily):
Apache 2.2.15 (worker MPM)
PHP 5.2.13 as fcgi apache module
The third server is a shared server with about 20 users. It is also running Fedora Core 8.
Server 3 (aurora):
Apache 2.2.8 (worker MPM)
PHP 5.2.6 (running as CGI but without fcgi)
I'm experiencing nearly the same problem on all three servers. Let me preface by saying that, prior to about a week ago, there were NO reported problems. However, since last week, I've been getting reports of the following:
ZIP file downloads are corrupted
Web pages, whether static or dynamic, will sometimes render improperly (see image sugar1.JPG)
Images embedded in web pages will sometimes render improperly (see image sugar2.JPG)
A few times, Firefox gave the message "Content Encoding error" (see image sugar3.jpg). Please note that ALL compression on the server is disabled: the Apache deflate module is disabled, and PHP zlib compression is disabled. In every case of this error, refreshing the page caused it to load properly.
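For anyone who wants to double-check the "no compression" claim from the outside, here is a rough sketch in Python. It requests a page while advertising gzip/deflate support and reports whether the response comes back with a Content-Encoding or Transfer-Encoding header. The function names are just illustrative, and the URL in the usage comment is a placeholder, not one of the affected sites.

```python
import urllib.request


def encoding_headers(headers):
    """Return only the headers that indicate compression or chunking."""
    names = ("Content-Encoding", "Transfer-Encoding")
    return {n: headers.get(n) for n in names if headers.get(n) is not None}


def probe(url, timeout=30.0):
    """Fetch url while advertising gzip/deflate and report encoding headers."""
    req = urllib.request.Request(
        url, headers={"Accept-Encoding": "gzip, deflate"}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return encoding_headers(resp.headers)


# Usage (placeholder URL, substitute a page on the affected server):
#   print(probe("http://www.example.com/index.html"))
# An empty dict means the response carried neither header.
```

If an affected user sees a Content-Encoding header here while the same request from elsewhere shows none, that would suggest something between the server and that user is altering the response.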
Now, what's really strange is that neither I nor anyone else I know who has a computer is able to reproduce this problem. Of the thousands of users viewing these sites, these issues only seem to happen to a handful of people. In some cases, upgrading the end user's browser has solved the issue, but in many it hasn't. We've cleared cookies and caches, and rebooted cable modems and DSL modems.
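Since the problem is intermittent and limited to a few users, one way to get hard evidence would be to have an affected user download the same file repeatedly and compare checksums. Below is a minimal sketch of that idea in Python. The helper names are hypothetical and the URL in the usage comment is a placeholder. The expected digest would come from hashing the file directly on the server (for example with sha256sum).

```python
import hashlib
import urllib.request
from collections import Counter


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def fetch(url: str, timeout: float = 30.0) -> bytes:
    """Download url, asking explicitly for an uncompressed body."""
    req = urllib.request.Request(url, headers={"Accept-Encoding": "identity"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def count_digests(url, attempts=20, fetcher=fetch):
    """Download url several times and count how often each digest appears."""
    digests = Counter()
    for _ in range(attempts):
        digests[sha256_of(fetcher(url))] += 1
    return digests


def mismatches(digests, expected):
    """Number of downloads whose digest differs from the expected one."""
    return sum(n for d, n in digests.items() if d != expected)


# Usage (placeholder URL and digest):
#   seen = count_digests("http://www.example.com/files/archive.zip")
#   print(seen, mismatches(seen, "<sha256 computed on the server>"))
```

More than one distinct digest, or any mismatch against the server-side digest, would confirm that the bytes are being changed in transit rather than mangled by the browser.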
None of the server software has been updated recently. There are no auto-updates enabled.
As you can see from the images, the cause has to be pretty low-level to produce that kind of corruption in a simple HTML-only web page.
In addition, I've engaged the datacenter support staff, who have run every test they know on the network equipment and found no issue. They even moved one of the servers to a different rack on a different switch, which didn't solve the problem.
I'm thinking it almost has to be one of two things: either a common network segment issue or a common server configuration issue. Though I'm still at a loss as to why it would have only started manifesting recently.
I'm at my wits' end here and would greatly appreciate ANY help or suggestions.