What is max concurrent TCP connections?

I am doing the initial design on a system which has the potential for 100k+ concurrent TCP connections, and I want to be sure I am not designing my way down a blind alley.

The general traffic per client will be minimal -- only the standard TCP "tickle" packets to maintain the connection. I am hoping that because the traffic per connection is so small I can get away with a single server (cost is a serious issue here). However, is there a limit to the number of concurrent TCP connections a single server can maintain? As to horsepower, we can assume a fairly powerful machine, say along the lines of a well equipped dual 750 MHz Pentium or something.

1) Is there any inherent TCP/IP limit? I know there are only 64k sockets, but as I understand TCP/IP (and I may be wrong) only 1 server socket is involved.

2) Would you expect the CPU to hit a physical limit? For example, would such a system typically choke because it would run out of CPU cycles, require an outrageous amount of RAM, be I/O bound, etc.

Thanks in advance,
parkerea

Les Moore

Since there are only 65,000 or so available ports used in TCP, I would think that would be the theoretical limit on simultaneous connections...

dickc82

A problem you may run into with a single server is login: how many PCs log in at a time? I have had many complaints from the penny pinchers with one server that it takes forever to log in.

Droby10

1) ultimately this depends on memory and on your program's ability to handle connection requests appropriately. it may also depend on a system limit on the maximum number of threads a parent process may have, and/or a max connections setting (what server os are you using?)...

typically in a multiuser service application there is a single listening socket, when a connection request is received on this socket it accepts it on behalf and assigns it to another socket for handling (same port - different socket states), typically this new socket will be handled in a new thread...and the listening socket will continue to wait for the next connection attempt.
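To illustrate the "same port - different socket states" point, here is a minimal Java sketch. The class and method names are hypothetical and it only talks to the loopback interface. It assumes the standard TCP behaviour that a connection is identified by the combination of local address, local port, remote address and remote port, so every accepted socket can share the listening port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical demo class; the name is not from this thread.
class SharedPortDemo {

    // Accepts two loopback clients on one listening socket and returns
    // { listening port, local port of connection A, local port of connection B,
    //   remote (client) port of connection A, remote (client) port of connection B }.
    static int[] acceptTwo() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port; backlog 50 queues pending clients.
        try (ServerSocket listener = new ServerSocket(0, 50, loopback)) {
            int port = listener.getLocalPort();
            try (Socket clientA = new Socket(loopback, port);
                 Socket clientB = new Socket(loopback, port);
                 Socket connA = listener.accept();
                 Socket connB = listener.accept()) {
                return new int[] {
                    port,
                    connA.getLocalPort(), connB.getLocalPort(),
                    connA.getPort(), connB.getPort()
                };
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int[] p = acceptTwo();
        System.out.println("listening port:       " + p[0]);
        System.out.println("accepted local ports: " + p[1] + ", " + p[2]);
        System.out.println("client ports:         " + p[3] + ", " + p[4]);
    }
}
```

Both accepted sockets should report the listening port as their local port, while the two client ports differ. That is why the number of server-side connections is not capped by the 64k port range.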

2) if the threading model is correct, then i don't see cpu usage being an issue at all, depending on the process time of performing the actual request handling (ie. database or file retrieval). i would however really look at memory usage in comparison to an increasing number of simultaneous connections (this is probably where you're going to have problems)

BlackDiamond

Droby is correct. You will need to be very careful about how the connections are handled. You will not be able to run 100,000 concurrent threads, so you will probably want to start up the application and create maybe 10 or 15 waiting threads in a group for connection processing. You will need 1 thread that does nothing but accept connections as fast as it can, and pass the accepted socket off to one of the waiting threads. The trick is to accept the connections faster than you fill your socket buffers and start dropping packets, as well as manage all your processing threads efficiently.

Also, you will need to maintain and balance a queue of connections in each thread, and check each connection for data (probably a read() call). It is usually nicer to use a blocking call to the socket so you don't eat up 100% cpu all the time, but I can't think of an easy way to do that with a small number of threads and that many connections.

Hmmm, anyone here know an elegant way to set up callbacks from a connected socket?
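One possible answer to the callback question is a readiness selector, which lets a single thread block until any of its connections has data. Below is a minimal sketch assuming a JVM that ships the java.nio.channels package. All class, field and method names are hypothetical, and error handling is reduced to the bare minimum.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; names are not from this thread.
class SelectorWatcher implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel listener;
    // Chunks of received text, handed off for processing elsewhere.
    final BlockingQueue<String> received = new LinkedBlockingQueue<>();

    SelectorWatcher() throws IOException {
        selector = Selector.open();
        listener = ServerSocketChannel.open();
        listener.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
        listener.configureBlocking(false);
        listener.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() {
        return listener.socket().getLocalPort();
    }

    @Override
    public void run() {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until a channel is ready, so no busy loop
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel ch = listener.accept();
                        if (ch != null) {
                            ch.configureBlocking(false);
                            ch.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel ch = (SocketChannel) key.channel();
                        buf.clear();
                        int n;
                        try {
                            n = ch.read(buf);
                        } catch (IOException e) {
                            n = -1; // treat a reset like a closed connection
                        }
                        if (n < 0) {
                            key.cancel();
                            ch.close();
                        } else if (n > 0) {
                            buf.flip();
                            received.add(StandardCharsets.UTF_8.decode(buf).toString());
                        }
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector closed or listener failed: this sketch simply stops.
        }
    }

    void stop() throws IOException {
        selector.close();
        listener.close();
    }
}
```

The design choice here is that the thread sleeps inside select() instead of polling each socket, so idle connections cost memory but essentially no CPU.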

rspiteri

I think this sort of question raises more questions than answers. What type of OS are you using? Unix-based OSes such as Linux and Solaris are supposed to scale better than NT, for example. TCP/IP by its nature is fairly flexible. After all, if you find that your box cannot handle the concurrent sessions as your load increases, you can add more servers or network cards, you could present more than one TCP/IP stack, and if necessary use some load balancing software to present a single IP address to the world.

Droby10

both of the above comments are firm solutions to providing the scale of service you are recommending...

the use of thread pooling, or even allowing multiples of a single thread to handle a sweeping io across multiple connections, as blackdiamond recommended, will help you get to your target performance...the only real problem i see with doing the multi-io on a single sweep is the use of blocking calls (ie. read...which in a traditional model would be a good thing)...if a connection is silent then you won't be able to process reading the others until that one "speaks", then you'll have buffers fill (costing more memory), and eventually dead sockets. you can get around this by checking the input buffer length prior to attempting to read. the lack of a blocking call will of course add to the thread cycle and you will need to use an appropriate sleep call to control it...

to elaborate a little on rspiteri's suggestion...you may want to separate your network application from your actual data application...and use com/corba to allow multiple network handling servers to query and return information from a single or small group of data handling servers. personally i'm a dcom fan, but if you're in a unix environment that doesn't support com marshalling, corba would perform just as well.

and i probably was not as clear on something as i should have been...100K connections on a single machine would overload the processor, without question...but the likelihood of you getting to that point is next to none. long before you would ever reach that point, the amount of memory required to hold a tcp connection would not exist, and you would experience some form of shutdown....to avoid reaching these limits, it's always a good idea to define and use a max_connections within your application.
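A max_connections limit of this kind could be sketched in Java roughly as follows. The class and method names are hypothetical, and the right limit would have to come from testing on the target machine.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper enforcing an application-level connection cap.
class ConnectionLimiter {
    private final int maxConnections;
    private final AtomicInteger open = new AtomicInteger();

    ConnectionLimiter(int maxConnections) {
        this.maxConnections = maxConnections;
    }

    // Reserves a slot and returns true, or returns false if the cap is reached.
    // The caller must call release() once the connection is closed.
    boolean tryAcquire() {
        while (true) {
            int current = open.get();
            if (current >= maxConnections) {
                return false;
            }
            if (open.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    // Frees the slot held by a connection that has been closed.
    void release() {
        open.decrementAndGet();
    }

    int openCount() {
        return open.get();
    }
}
```

In an accept loop, the idea would be to call tryAcquire() right after accepting a socket and to close that socket immediately when it returns false, so the server degrades by refusing clients rather than by running out of memory.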

parkerea

ASKER

Ouch. This is getting more complex than originally anticipated. To think that my original (failed) design was a very low traffic, simple connectionless UDP design. Regardless, it still sounds like this may be feasible under TCP, although the single server idea is questionable.

Once again, I would like to say how much I appreciate all your input. If this works, a lot of electric utility customers will benefit.

BTW, being a 4GL client/server developer, I have never talked directly to TCP/IP, so my knowledge is very spotty and strictly theoretical. If this design does not hit a brick wall, that might change. I am proposing this to my company, but I receive no compensation for it. Because it is outside my department, I can only hope I would be involved.

Now, back to the issue:

The OS would probably be either: NT on a dual or quad Pentium, or AIX on an IBM RS6000. From what I understand, the RS6000 line has a wide span, with servers from mid-range to very high end.

It is a little late, but I should summarize the app for everyone (except Droby10, who answered my previous posting in another topic area). This app would notify California electric utility customers whose block is about to be hit by a rolling blackout. Think of this as an Internet based pager, where the server supports 100k+ clients. Despite the large number of clients, the vast majority never send or receive a message; only a few hundred or so messages are ever sent, but time is critical when they are sent.

It goes like this:

An electric utility customer hits a web page and enters some basic logon info. The client's ID is saved in a database, and the server downloads a Java applet to the client. The Java applet makes this persistent TCP connection to the server and waits for a long, long time. Most likely the client will disconnect at the end of the day with no message -- no app level traffic except the logon and logoff (when the browser page closes). So, the bulk of the traffic consists of: 100k each of logons, Java app downloads, & logoffs, plus the periodic TCP "keep alive" tickle packets (which I assume the TCP/IP stack deals with, so the app thread never sees them).

Occasionally, the utility company is told they have 10 minutes to reduce load. They determine which customers are to be hit, query the database for those who are logged on, and send messages to the affected clients. The messages are received by the Java applet, and the applet GUI notifies the customer. The customer then tells people not to use elevators, shuts down critical systems, etc.

Because outages are such rare events, and affect a very small percentage of the 100k logged on clients, I believe notification traffic can be essentially ignored. This lack of traffic should allow a single thread to handle numerous connections, although as Droby10 & BlackDiamond said, blocking calls could not be used. I assume some method for handling dropped connections would also be needed.

Q1) As I understand it, the connection does not continue on the original listening socket, but is reassigned to another socket. So, does this mean that one IP address is limited to ~64k TCP connections (one connection per server socket), or does the TCP/IP protocol stack define the connection by: server IP + server socket + client IP + client socket? i.e. can multiple TCP connections share the same server socket?

Q2) Is there a difference between ports & sockets? Although I know "socket" is the correct term for TCP/IP, I seem to hear them used fairly interchangeably.

Q3) Any idea what methodology could be used to guesstimate how many servers would be needed? The only thing I can see is to build it & hit it with LoadRunner, but that sounds a bit expensive. I would rather have a ballpark estimate up front.

Thanks again!
- parkerea

ASKER CERTIFIED SOLUTION

Droby10


parkerea

ASKER

Given that there is no hard limit (i.e. 64k is not an issue), and this is not a common design, I suppose testing is the only way to go: code up the core of the server program as discussed above, just enough to accept connections & pass them off to another thread, then start throwing connections at it, say 1k at a crack, recording the effect each batch of connections has on memory consumption, CPU utilization, etc. I assume C/C++ would be the appropriate language, since my suspicion is Java may "protect" us a bit too much from the underlying layers, although if it does not then Java might be worth a try since the app's CPU time should be small and we may need to be super efficient, especially in this connection test.
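The client side of such a ramp test might look like the Java sketch below. Everything here is hypothetical (class name, method name, parameters), and it only reports the client's own heap use. Memory and CPU on the server would still have to be watched with the server's own tools.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical load generator: opens connections in batches and holds them open.
class ConnectionRamp {

    // Opens `batches` batches of `batchSize` connections to host:port and
    // returns the open sockets so the caller decides when to close them.
    static List<Socket> ramp(InetAddress host, int port, int batches, int batchSize)
            throws IOException {
        List<Socket> open = new ArrayList<>();
        Runtime rt = Runtime.getRuntime();
        for (int b = 1; b <= batches; b++) {
            long start = System.nanoTime();
            for (int i = 0; i < batchSize; i++) {
                open.add(new Socket(host, port));
            }
            long ms = (System.nanoTime() - start) / 1_000_000;
            long usedKb = (rt.totalMemory() - rt.freeMemory()) / 1024;
            System.out.println(open.size() + " connections open, batch took "
                    + ms + " ms, client heap in use " + usedKb + " KB");
        }
        return open;
    }
}
```

One caveat: each outgoing connection from a given client address to the same server address and port needs its own client port, so a single test machine is bounded by its ephemeral port range. Reaching 100k would likely take several client machines or source addresses.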

If we get that far (and I really hope we do), I will re-post the results to this question so you can get an e-mail notification and check them out. Getting resources from our IT department will certainly be an issue, since this will require a few groups be involved: server management, testing lab, and somebody to write the server app. The latter would probably be the most difficult to find in our company -- we have plenty of application programmers, but I'm pretty sure our server software is all off the shelf. Of course, our financial situation compounds all these problems.

I think this covers all my network questions. Again, I can't thank you all enough.

- parkerea

parkerea

ASKER

Almost forgot to award the points! Your answers were so complete that I up'ed the award.

BlackDiamond: your input was extremely useful also, so if you are interested in points I would be glad to open a new question just to award some to you -- just respond & let me know.

Thanks,
- parkerea

BlackDiamond

Don't worry about the points. Glad my comments could be of use. :->

parkerea

ASKER

PROGRESS REPORT 1:

Just a quick update for anyone who is interested. Good news (on the app front), and so-so news (on the political front in my company).

THE APPLICATION

I am in the process of coding up the fundamental server & client, mainly to demonstrate feasibility. This being my first Java system, it is fairly primitive, but it is working so far. Basically following BlackDiamond's design (thank you), the server has 2 pieces. I will ignore the client here, since it presents no technical issues.

1. Server main():

Creates a LinkedList collection of sockets (a.k.a. "connections" in the code), then fires off the connectionWatcher thread. As a new socket is accepted, main adds the new socket to the collection via a connectionWatcher method. BTW, semantically, is main() really considered a thread?

2. Server connectionWatcher Thread:

Scans the socket collection, checking for incoming data. The non-blocking call was made possible thanks to the Java available() method of class InputStream. Stripping out all non-relevant code (little things like error checking, processing the input, shutdown, etc.), the connectionWatcher thread boils down to this:
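boolean listening = true;
List connections;        // shared with main(); access must be synchronized
ListIterator lit;
Socket a_socket = null;
InputStream a_inStream = null;
BufferedReader in = null;
String inputLine = "";

while (listening)
{
    lit = connections.listIterator();
    while (lit.hasNext())
    {
        a_socket = (Socket) lit.next();
        a_inStream = a_socket.getInputStream();
        // available() reports how many bytes can be read without blocking
        if (a_inStream.available() > 0)
        {
            // caution: readLine() can still block until a full line arrives
            in = new BufferedReader(new
                InputStreamReader(a_inStream));
            inputLine = in.readLine();
            ... process inputLine ...
        }
    }
    yield();   // let other threads run between sweeps
}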

Although I have only thrown a handful of concurrent connections at it so far, at that level it scans pretty damn fast: ~3M sockets per second on my 1 GHz PC. Next, I will try hitting it with batches of 1k connections to see how it scales.

I don't yet see a need for a thread pool to handle the connection collection; one thread seems sufficient. I am considering a second thread to act as watchdog, to detect if connectionWatcher ends up blocked, although I am not yet sure how I would handle such a situation.

I still foresee two hurdles:
1) Dealing with firewalls, and
2) Holding connections for hours, detecting if they abnormally drop, then gracefully recovering on both the server & client.
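For the second hurdle, one common approach (an assumption here, not something settled in this thread) is an application-level heartbeat: each client sends a tiny message every so often, and the server drops any connection it has not heard from within a timeout. The OS-level alternative is Socket.setKeepAlive(true), but the keep-alive interval is controlled by the operating system and often defaults to around two hours. The bookkeeping could be sketched like this, with hypothetical names throughout:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that tracks when each client was last heard from.
class HeartbeatTracker {
    private final long timeoutMillis;
    private final Map<String, Long> lastHeard = new HashMap<>();

    HeartbeatTracker(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Call whenever any data, including a heartbeat, arrives from a client.
    synchronized void heard(String clientId, long nowMillis) {
        lastHeard.put(clientId, nowMillis);
    }

    // Returns, and stops tracking, every client silent for longer than the timeout.
    synchronized List<String> expire(long nowMillis) {
        List<String> dead = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeard.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                dead.add(e.getKey());
            }
        }
        lastHeard.keySet().removeAll(dead);
        return dead;
    }
}
```

The watcher thread could call heard() whenever a socket yields data and call expire() once per sweep, closing the sockets of the returned clients. On the client side, the applet would reconnect when its own heartbeat write fails.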

POLITICS

I submitted my suggestion to my company via the appropriate channels, and it comes as no surprise that I got a lukewarm "Thanks, but no thanks" initial response. Their response was not based on any technical issues (in fact, the group that responded is non-technical), nor was it based on a lack of need. They said it overlapped with other plans they are working on. Since I mentioned I was creating it on my own, they did want to know if progress was made, so if this app really works, I expect them to warm up to it.

parkerea

ASKER

Oops. The scan rate of "~3M sockets per second" should have read "~3M sockets per minute," which is more like 50k sockets per second. Still, quite sufficient, and far faster than I expected.

- parkerea