We have a server environment consisting of a hardware load balancer, 3 web servers and a single database server. Against this environment we have run 3 load tests, each scaling up to 600 concurrent users over a 20 minute period. In every test the page load response time begins to degrade at exactly the same point: when the number of concurrent users hits 100.
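For reference, the ramp profile is roughly equivalent to the sketch below (Locust is used here purely to illustrate the shape of the ramp; the host, endpoint and think time are placeholders, not our actual test configuration):

```python
# Minimal sketch of the ramp profile: 600 concurrent users reached over ~20 minutes.
# Locust is only used to illustrate the test shape; the real tests were run against
# the environment described above.
from locust import HttpUser, task, between

class PageLoadUser(HttpUser):
    wait_time = between(1, 3)  # placeholder think time between page loads

    @task
    def load_page(self):
        self.client.get("/")   # placeholder endpoint

# Run headless, spawning 0.5 users/second so 600 users is reached after 20 minutes:
#   locust -f loadtest.py --headless -u 600 -r 0.5 -t 20m --host https://www.example.com
```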
Test one was done with two web servers balanced
Test two was done with three web servers balanced
Test three was done with three web servers balanced, but with 2 additional CPUs allocated
All three tests produced exactly the same result. I have had the hosting provider review the network, and none of the hardware elements in the route have restrictions or limits on concurrent users; the network is handling the load with ease. The individual web servers are also handling the load evenly, even at peak, without maxing out CPU or memory. It seems highly irregular that the drop-off point remains identical despite the increase in resources, and it has led me to question whether there is a configuration or set-up default value somewhere in the system that is reaching its limit (100).
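One thing that feeds this suspicion is that several common defaults sit at exactly 100, for example PostgreSQL's max_connections and the ADO.NET connection pool's Max Pool Size. As a next step I intend to watch the live database connection count during a test run with something like the sketch below (this assumes PostgreSQL and the psycopg2 driver; the host name and credentials are placeholders, and the query would differ for MySQL or SQL Server):

```python
# Minimal sketch: poll active database connections while the load test runs,
# to see whether the count plateaus at a suspicious ceiling (e.g. 100).
# Assumes PostgreSQL with psycopg2; for MySQL the equivalent check is
# "SHOW STATUS LIKE 'Threads_connected'", for SQL Server
# "SELECT COUNT(*) FROM sys.dm_exec_sessions".
import time
import psycopg2

conn = psycopg2.connect(host="db-server", dbname="postgres",
                        user="monitor", password="secret")  # placeholder credentials
conn.autocommit = True

with conn.cursor() as cur:
    # Report the configured ceiling first.
    cur.execute("SHOW max_connections;")
    print("max_connections:", cur.fetchone()[0])

    # Sample the live connection count once per second for the 20 minute test.
    for _ in range(20 * 60):
        cur.execute("SELECT count(*) FROM pg_stat_activity;")
        print(time.strftime("%H:%M:%S"), "active connections:", cur.fetchone()[0])
        time.sleep(1)
```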
This is only a theory, but it is posing a major challenge to what should be a robust server set-up, and I would appreciate some expert opinion on what could be causing the issue. In preparation I have already made sure that New Relic APM data is available for all three test runs, as well as general hardware monitoring data.