I'm trying to plan an SLA, and I want to verify my math is right in terms of just straight Ethernet latency.
Assuming we want to be able to transfer an average of 58,783 bytes (~57k) over TCP, and receive the full response in 200ms:
For a single Ethernet frame to reach its destination, figure around 0.3 ms on an unloaded network (this seems to be the generally accepted value for base Ethernet latency). Assuming a 1500-byte MTU, each TCP segment can carry 1460 bytes of payload (1500 minus a 20-byte IP header and a 20-byte TCP header). Chopping the response into segments, it takes 41 packets in one direction (58,783 / 1,460, rounded up); multiply by 2 to account for acknowledgements, for 82 total packets exchanged (ignoring the handshake, assuming persistent connections and fully scaled windows, and treating every packet, ACKs included, as serialized, i.e. a worst case).
On an unloaded network, the full transfer should take 24.6 ms (82 × 0.3 ms), well under our 200 ms maximum. So far so good. Now for concurrency and bandwidth:
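To make the per-request arithmetic easy to re-check, here's a quick Python sketch. The 0.3 ms per-frame latency is the assumed constant from above, and the MSS is derived as the standard 1460 bytes for a 1500 MTU (20-byte IP header plus 20-byte TCP header, no options):

```python
import math

# Sanity check of the per-request numbers. Assumes a 1460-byte MSS
# (1500 MTU - 20-byte IP header - 20-byte TCP header) and 0.3 ms
# base Ethernet latency per frame, as in the reasoning above.
RESPONSE_BYTES = 58_783
MSS = 1500 - 20 - 20               # 1460 bytes of TCP payload per segment
FRAME_LATENCY_MS = 0.3             # assumed base Ethernet latency

data_packets = math.ceil(RESPONSE_BYTES / MSS)   # one-way data segments
total_packets = 2 * data_packets                 # one ACK per data segment
transfer_ms = round(total_packets * FRAME_LATENCY_MS, 1)

print(data_packets, total_packets, transfer_ms)  # 41 82 24.6
```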
Assuming this happens on a gigabit Ethernet link, it should be able to transfer 125,000,000 bytes in one direction per second (10^9 bits/s ÷ 8), or 37,500 bytes every 0.3 ms (the Ethernet latency period). Each 0.3 ms timeslice can hold 25 full-size frames (37,500 / 1,500), and every additional batch of 25 frames beyond that should queue behind the first and add another latency period.
I believe if we're shooting for 200 ms and under, it can be sustained at roughly 203 concurrent requests (200 / 24.6 × 25) while saturating the gigabit link. At 1,000 concurrent requests the latency should be about 984 ms (1,000 / (25 / 24.6)).
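The concurrency side can be sanity-checked the same way. One thing to note: this sketch takes gigabit as the decimal 10^9 bits/s (125,000,000 bytes/s), which is the actual line rate, rather than 2^30 bits, and it carries over the 1,500-byte frame size and 24.6 ms single-request transfer time from above. Integer math is used where possible to keep the figures exact:

```python
# Concurrency/bandwidth side of the estimate. Assumes decimal gigabit
# (10^9 bits/s = 125,000,000 bytes/s), 1500-byte frames, and the 24.6 ms
# unloaded single-request transfer time derived earlier.
LINK_BYTES_PER_SEC = 1_000_000_000 // 8   # one direction of the gigabit link
SLICE_US = 300                            # the 0.3 ms latency period, in µs
FRAME_BYTES = 1500
BASE_TRANSFER_MS = 24.6                   # one request, unloaded network
TARGET_MS = 200                           # our SLA budget

bytes_per_slice = LINK_BYTES_PER_SEC * SLICE_US // 1_000_000   # 37,500
frames_per_slice = bytes_per_slice // FRAME_BYTES              # 25

# Latency scales linearly once concurrency exceeds per-slice capacity.
max_concurrency = int(TARGET_MS / BASE_TRANSFER_MS * frames_per_slice)
latency_at_1000 = round(1000 / frames_per_slice * BASE_TRANSFER_MS, 1)

print(frames_per_slice, max_concurrency, latency_at_1000)  # 25 203 984.0
```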
If any of my constants or formulas are crap, let me know. :)