Server could not handle many clients - HP-UX

Hello

From the code below, the server creates a new thread for every accept() call.
 
while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 ) /* ACCEPT CALL */
{
    :
    memcpy( worker_param, &new_client_fd, sizeof(int) );

    /* kick off a thread to service the client */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)worker_entrypoint,
                       worker_param,
                       FALSE,
                       FALSE ) )

    ::::::::::
}

But once the thread count goes above 86, pthread_create() fails with rc 11.
What modification would you suggest to avoid this error?
Sham

mohet01 Asked:
 
Infinity08 Commented:
>> Server needs to maintain 500 concurrent connections at a time.

That's a potential killer if you also need one thread per connection, especially on a 32-bit system.

Since you say that the workload consists of simple command-response pairs, I'll assume that each command doesn't take much time (on average) - correct me if I'm wrong.

In that case, I'd go with a worker pool rather than a thread pool, i.e. a fixed number of worker threads (e.g. one or two per core, depending on many factors), each with an availability status.

Each time a command needs to be run, an available worker is picked from the worker pool, so it can run the command, and send the response. If no worker is available, simply queue the command until one becomes available (which should be soon, since commands don't last long on average).

The amount of workers can easily be fine-tuned based on the system load they generate, based on response times, etc.
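
As a rough illustration of the worker-pool idea described above, here is a minimal sketch in C using POSIX threads. It is not the poster's code: the names `pool_t`, `job_t`, `pool_start`, `pool_submit`, `pool_stop` and `demo_handler`, and the sizes `NUM_WORKERS` and `QUEUE_CAP`, are all hypothetical. The "availability status" is modelled implicitly here: an available worker is simply one that is blocked waiting on the queue.

```c
#include <pthread.h>

#define NUM_WORKERS 4   /* assumed pool size; tune it for your system */
#define QUEUE_CAP   64  /* assumed capacity of the pending-command queue */

/* One queued command (hypothetical layout). */
typedef struct {
    int client_fd;      /* connection the command arrived on */
    int command;        /* application-defined command code */
} job_t;

typedef struct {
    job_t           jobs[QUEUE_CAP];
    int             head, tail, count;
    int             nworkers, shutting_down;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty, not_full;
    pthread_t       workers[NUM_WORKERS];
    void          (*handler)(const job_t *);
} pool_t;

/* Each worker runs ONE command at a time, then waits for the next. */
static void *worker_main(void *arg)
{
    pool_t *p = arg;
    for (;;) {
        pthread_mutex_lock(&p->lock);
        while (p->count == 0 && !p->shutting_down)
            pthread_cond_wait(&p->not_empty, &p->lock);
        if (p->count == 0) {            /* shutting down, queue drained */
            pthread_mutex_unlock(&p->lock);
            return NULL;
        }
        job_t job = p->jobs[p->head];
        p->head = (p->head + 1) % QUEUE_CAP;
        p->count--;
        pthread_cond_signal(&p->not_full);
        pthread_mutex_unlock(&p->lock);
        p->handler(&job);
    }
}

/* Starts the fixed set of workers; returns 0 on success, -1 otherwise. */
int pool_start(pool_t *p, void (*handler)(const job_t *))
{
    p->head = p->tail = p->count = 0;
    p->nworkers = p->shutting_down = 0;
    p->handler = handler;
    pthread_mutex_init(&p->lock, NULL);
    pthread_cond_init(&p->not_empty, NULL);
    pthread_cond_init(&p->not_full, NULL);
    while (p->nworkers < NUM_WORKERS) {
        if (pthread_create(&p->workers[p->nworkers], NULL, worker_main, p) != 0)
            return -1;
        p->nworkers++;
    }
    return 0;
}

/* Queues a command; blocks only while the queue is completely full. */
void pool_submit(pool_t *p, job_t job)
{
    pthread_mutex_lock(&p->lock);
    while (p->count == QUEUE_CAP)
        pthread_cond_wait(&p->not_full, &p->lock);
    p->jobs[p->tail] = job;
    p->tail = (p->tail + 1) % QUEUE_CAP;
    p->count++;
    pthread_cond_signal(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
}

/* Lets the workers drain the queue, then joins them. */
void pool_stop(pool_t *p)
{
    pthread_mutex_lock(&p->lock);
    p->shutting_down = 1;
    pthread_cond_broadcast(&p->not_empty);
    pthread_mutex_unlock(&p->lock);
    for (int i = 0; i < p->nworkers; i++)
        pthread_join(p->workers[i], NULL);
}

static int handled_total;
static pthread_mutex_t handled_lock = PTHREAD_MUTEX_INITIALIZER;

/* Demo handler: a real one would run the command and send() the reply. */
static void demo_handler(const job_t *job)
{
    pthread_mutex_lock(&handled_lock);
    handled_total += job->command;
    pthread_mutex_unlock(&handled_lock);
}
```

With this shape, the thread that accepts connections only ever calls `pool_submit()`, so the number of threads stays at `NUM_WORKERS` no matter how many clients are connected.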
 
Infinity08 Commented:
The problem is either that your system could not support more than 86 threads at that time (e.g. due to memory size limits), or that the system's thread limit was reached.

Either way, 86 threads sounds like a lot. Are you sure it's optimal to have a separate thread for every accepted connection?
 
jkr Commented:
'11' is EAGAIN (see errno.h), and according to the docs (https://computing.llnl.gov/tutorials/pthreads/#CreatingThreads and https://computing.llnl.gov/tutorials/pthreads/man/pthread_create.txt) this means:
ERRORS
       The pthread_create() function shall fail if:

       EAGAIN The  system  lacked  the  necessary  resources to create another
              thread, or the system-imposed  limit  on  the  total  number  of
              threads in a process {PTHREAD_THREADS_MAX} would be exceeded.

 
mohet01 Author Commented:
Hello,
I already know why pthread_create() fails.
Is there an alternative to creating a thread for every accept()?
Sham
 
jkr Commented:
Since you are apparently using a thread pool, the pool should take care of not creating a new thread on every 'accept()'. Try adjusting the thread pool's size to use fewer than 86 threads.
 
Infinity08 Commented:
Yes, but which alternative you use depends on your use case.

What kind of workload does one connection generate? What does your server do?
How many concurrent connections does it need to support? What's the average lifetime of a connection?
 
mohet01 Author Commented:
Hello Infinity,

"What kind of workload does one connection generate?"
For each thread created, send() and recv() calls run between client and server for at least 5 minutes.

"What does your server do?"
On the server side, each thread receives commands from the client, performs some logic internally, and sends the reply back to the client.

"How many concurrent connections does it need to support?"
Server needs to maintain 500 concurrent connections at a time.
assignWork() checks whether any thread is available in the thread pool; if not, it increases the pool size and then provides a new thread to the worker.

"What's the average lifetime of a connection?"
It is 5 minutes, going by the logs.

Sham
 
mohet01 Author Commented:
Hello Infinity,
I will check the 5-minute figure and confirm once again.
It is probably due to heavy traffic on port 1723, which the server and client talk on.
Sham
 
mohet01 Author Commented:
Hello Infinity,
For your question: "What's the average lifetime of a connection?"
The attached file shows that after timestamp 06:10:02 the load increases, and from then on there are 5 minutes between
worker_entrypoint(): Entered
and
worker_entrypoint(): Leaving

I hope I have answered all your questions.

Sham
 
mohet01 Author Commented:
Hello jkr,
I cannot constrain the thread pool's size, because the scenario is the accept() call.
I have to keep waiting on accept() despite the huge number of connection requests.
Sham
 
mohet01 Author Commented:
Hello Infinity,
I have attached the file.
Sham

ccijimd.txt
 
jkr Commented:
Well, then you should check out how to increase PTHREAD_THREADS_MAX for your system. As far as I can see, that can be done by recompiling the pthread library.
 
mohet01 Author Commented:
Hello jkr,
This is out of my scope, and if the number of threads per process is increased, performance will go down.
Sham
 
mohet01 Author Commented:
Hello Infinity,

More info...

int main( int argc, char * argv[] )
{
    int rc;
    /* everything is initialized, kick off the server thread */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)server_entrypoint,
                       NULL,
                       FALSE,
                       FALSE ) )
    :
}


DWORD WINAPI server_entrypoint( LPVOID param )
{
    int new_client_fd;
    /* the main loop */
    while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 )
    {
        memcpy( worker_param, &new_client_fd, sizeof(int) );

        /* kick off a thread to service the client */
        if ( ! assignWork( jimd_threadpool_p,
                           (LPTHREAD_START_ROUTINE)worker_entrypoint,
                           worker_param,
                           FALSE,
                           FALSE ) )
    } /* end while */
    ::
}


So, the accept loop is running on a thread.
Sham
 
mohet01 Author Commented:
Hello Infinity,
"If no worker is available, simply queue the command"
I have already implemented this logic. But on the client side, the connect() call must not fail.
Always an accept call should be available.

Can't we fork the process and then use threads within the child process? I mean, for every 30 accepts, one process should handle the load. Each process should run 30 worker threads.
My update 36896710 will let you know the current design.

Sham
 
Infinity08 Commented:
>> Always an accept call should be available.

And it will be, because the main thread is doing that.

Connecting is a cheap operation. The main thread accepts the connection, and then stores it in some central storage. You can then use e.g. 'select' (http://linux.die.net/man/2/select) to decide which sockets have a command waiting, and handle it (also in the main thread) by pushing it onto the worker queue.
 
mohet01 Author Commented:
Hello Infinity,
What if more than 86 threads are supposed to be processing at a time?
Maybe I am still not clear.
Sham
 
mohet01 Author Commented:
Hello Infinity,
Are you asking me to sequentially monitor, say, 300 accept sockets and spawn a worker for whichever is ready?
Sham
 
Infinity08 Commented:
>> Are you asking me to sequentially monitor, say, 300 accept sockets and spawn a worker for whichever is ready?

Pretty much, yes.

But you don't need to actually loop over all sockets, if you use something like 'select', which will return only those sockets that need your attention.
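
A minimal sketch of that monitoring step, assuming plain POSIX `select()`, could look like the following. The helper name `wait_readable` and its parameters are hypothetical. The listening socket can go in the same array as the client sockets, since a listening socket reported as readable means accept() will not block.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Waits up to timeout_ms for any of fds[0..nfds-1] to become readable.
 * Copies the readable descriptors into ready[] (which must have room for
 * nfds entries) and returns how many there are, 0 on timeout, -1 on error. */
int wait_readable(const int *fds, int nfds, int *ready, int timeout_ms)
{
    fd_set read_fds;
    struct timeval tv;
    int fdmax = -1;

    FD_ZERO(&read_fds);
    for (int i = 0; i < nfds; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;              /* select() cannot watch this one */
        FD_SET(fds[i], &read_fds);
        if (fds[i] > fdmax)
            fdmax = fds[i];
    }

    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int n = select(fdmax + 1, &read_fds, NULL, NULL, &tv);
    if (n <= 0)
        return n;                   /* timeout (0) or error (-1) */

    /* select() left only the readable descriptors set in read_fds */
    int count = 0;
    for (int i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &read_fds))
            ready[count++] = fds[i];
    return count;
}
```

The main thread would call this in a loop and queue one command per descriptor returned in `ready[]`. Note that `select()` can only watch descriptors below `FD_SETSIZE`, so that value is worth checking on the target system when several hundred connections are expected.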
 
mohet01 Author Commented:
Hello Infinity,
Can you give a code snippet for select() on multiple accept sockets?

Sham
 
Infinity08 Commented:
There is a code sample in the link I pasted earlier.

A more extensive example can be found here:

        http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#select
 
mohet01 Author Commented:
Hello Infinity,
So I will select() on all the accept sockets, and I will find out which ones are busy.
After select(), if more than 86 accept sockets are busy, what do I need to do?
Sham
 
Infinity08 Commented:
Queue all commands in the worker queue.

If your worker pool gets overloaded, you need to do some fine-tuning.

Note that this approach will only work well if the workload of a single command is relatively low. And it'll work even better if the command rate isn't too high.
 
mohet01 Author Commented:
Hello Infinity,
These commands in turn actually do send() and recv().
Sham
 
mohet01 Author Commented:
Hello Infinity,
It was my mistake to describe them as commands; the server just does send() and recv() back and forth with the clients.
For this scenario the queue logic will not work.
Sham
 
mohet01 Author Commented:
Hello Infinity,
What do you think the solution would be if the server needs to send() to or recv() from the client?
Sham
 
Infinity08 Commented:
>> What do you think the solution would be if the server needs to send() to or recv() from the client?

What is it sending and receiving?

Or to get back to the question I asked originally:

>> What kind of workload does one connection generate? What does your server do?
 
mohet01 Author Commented:
Hello Infinity,

The client sends a command (init / term / inqy / recv / term) and the server receives this command.
When the server receives the init command, it does one send() and one recv() call.
When the server receives the term command, it does one recv() call.
When the server receives the inqy command, it does one recv() call.

worker_entrypoint() performs the above operations as a thread.

Sham
 
mohet01 Author Commented:
Hello Infinity,
In addition to my previous answer 36901440,

For your query: "What kind of workload does one connection generate?"

If I review the logs, each worker_entrypoint() on the server side takes 5 minutes.

06:10:02 2011      server_entrypoint():thread_count: =19
06:10:02 2011::worker_entrypoint      worker_entrypoint(): Entered
06:15:28 2011::worker_entrypoint      worker_entrypoint(): Leaving

Sham
 
Infinity08 Commented:
Workload refers to how much CPU time and/or I/O wait time and/or idle time is consumed for each connection.

Btw, what you describe are still simple request-response pairs, so why wouldn't that work with a worker pool?
 
mohet01 Author Commented:
Hello Infinity,
For your query:
"why wouldn't that work with a worker pool?"
For each accept call, we launch a thread to respond.
If we have more than 86 accept() calls, then this is a problem.

Sham
 
mohet01 Author Commented:
Hello Infinity,
For your query:
"why wouldn't that work with a worker pool?"
If you look at the code below, a thread is started for each accept() call.
If you receive more than 86 accept() calls, then more than 86 threads are created in the process.

int main( int argc, char * argv[] )
{
    int rc;
    /* everything is initialized, kick off the server thread */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)server_entrypoint,
                       NULL,
                       FALSE,
                       FALSE ) )
    :
}


DWORD WINAPI server_entrypoint( LPVOID param )
{
    int new_client_fd;
    /* the main loop */
    while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 )
    {
        memcpy( worker_param, &new_client_fd, sizeof(int) );

        /* kick off a thread to service the client */
        if ( ! assignWork( jimd_threadpool_p,
                           (LPTHREAD_START_ROUTINE)worker_entrypoint,
                           worker_param,
                           FALSE,
                           FALSE ) )
    } /* end while */
    ::
}
 
Infinity08 Commented:
>> For each accept call, we launch a thread to respond.
>> If we have more than 86 accept() calls, then this is a problem.

That is your CURRENT approach.

I'm talking about a different approach: a worker pool, rather than a thread pool.

Please read through all my responses in this thread again, because it seems I'm starting to repeat myself.
 
mohet01 Author Commented:
Hello Infinity,
Are you talking about running select() on multiple accept sockets?
Sham
 
mohet01 Author Commented:
A worker pool will create a lot of delay, because each accept() socket request takes 5 minutes.
Sham
 
Infinity08 Commented:
Again, please read the comments I made earlier.

You don't use the worker threads for accepting connections (connections are accepted in the main thread); you use them for running and responding to the commands.
 
mohet01 Author Commented:
Hello Infinity,
Maybe I missed your comments. Can you please repeat them?
Sham
 
Infinity08 Commented:
 
mohet01 Author Commented:
Hello Infinity,
For your point:
"Since you say that the workload consists of simple command-response pairs, I'll assume that each command doesn't take much time (on average) - correct me if I'm wrong."

Most connections stay alive for 5 minutes, whether or not they use the connection.
Sham
 
Infinity08 Commented:
Yes, the connections, but not the commands. The commands themselves don't last long, do they?
 
mohet01 Author Commented:
Hello Infinity,
A single server thread is accepting the connections, so that is no problem.
But after accepting a connection, conversing with the client is the problem, because for the conversation with each client we are creating one separate thread.
If there are 1000 clients, 1000 threads need to be created. The following code shows that.

Instead of kicking off threads, what should I do?


while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 )
{
    drop_client = 0;
    worker_param = NULL;
    memcpy( worker_param, &new_client_fd, sizeof(int) );
    /* kick off a thread to service the client */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)worker_entrypoint,
                       worker_param,
                       FALSE,
                       FALSE ) )

}
Sham
 
mohet01 Author Commented:
Let me tell you, the client will wait 30 seconds for a response from the server.
 
Infinity08 Commented:
You're asking the same question over and over again. My answer hasn't changed. It's still the same: use a worker pool, instead of starting one thread per connection.

For more details, please read through this whole thread again very carefully.
 
mohet01 Author Commented:
I made the modification according to what I understood by "worker pool".

I have taken 50 as the threshold.

Please find the attached modifications.

Sham
original.txt change.txt
 
mohet01 Author Commented:
Hello Infinity,
In the code attached above, written for HP-UX,
SetEvent() and waitforsingleobject() are user-defined functions.
Please let me know whether I did what you described.
Sham
 
mohet01 Author Commented:
server_entrypoint() and queue_controlpoint() are different threads.
 
mohet01 Author Commented:
Hello Infinity,
Do you want the worker_entrypoint() code to analyse my changes?
Sham
 
Infinity08 Commented:
That code shows that you put the client socket in a queue.

I can't tell if the rest of the code implements the approach I described, because I haven't seen it.

You need to continuously monitor all client connections for activity. If there's activity, handle it. One of the activities would be a command, in which case, you queue the command for execution by the worker pool. When the worker pool has a free worker, it executes the command, and responds with the result.
 
mohet01 Author Commented:
Hello Infinity,
Here is the code which executes the client requests.

Sham

original.txt change.txt
 
mohet01 Author Commented:
Hello Infinity,
I am not just placing the client socket in the queue; I am also launching threads as long as the worker pool does not exceed 50. queue_controlpoint() handles this.
Sham
 
Infinity08 Commented:
So, you still have one thread per connection, only now you limit the total number of threads to 50?

I'm not sure how that will help anything.

I don't think you've understood the concept of assigning COMMANDS to worker threads, rather than CONNECTIONS. A worker thread is supposed to execute ONE command, and then it becomes available again for another command (possibly for a different connection).
 
mohet01 Author Commented:
So I need to sequentially select() each connection socket, and if any request came in on that socket, then run the command?

Sham
 
mohet01 Author Commented:
select(fdmax+1, &read_fds, NULL, NULL, NULL)
Like this?
 
Infinity08 Commented:
For example, yes.
 
mohet01 Author Commented:
When I say command, that is nothing but the server doing
a send() call to the client
and then a recv() from the client.
So, what if the worker thread gets blocked in recv()?
Sham
 
Infinity08 Commented:
We're not getting anywhere this way - I feel like we're going around in circles.

Either the workload is a series of commands that need to be executed (like you said earlier), or it's not.

If it's not, then the information you provided earlier was incorrect, and thus this whole thread was based on incorrect information.

If it is, then nothing has changed, and you're asking the same questions again, which will get the same answers.
 
mohet01 Author Commented:
"Either the workload is a series of commands that need to be executed (like you said earlier), or it's not."
If the command is TC_CMDINIT, then we send() to the client and recv() from the client.
If the command is TC_CMDTERM, then we send(), recv() and close().

Sham
 
Infinity08 Commented:
Then I'll ask again: what is it that your server is doing?

I don't need to know what functions it calls, but rather what its purpose is, and how it achieves that.
 
mohet01 Author Commented:
Hello Infinity,
I have raised another thread, ID: 27385108.

I will answer your questions there.
Sham
 
mohet01 Author Commented:
Thanks
Question has a verified solution.
