Server could not handle many clients - HP-UX

Hello

From the code below, the server creates a new thread for every accept() call.

while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 ) //ACCEPT CALL
{
    :
    memcpy( worker_param, & new_client_fd, sizeof(int) );

    /* kick off a thread to service the client */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)worker_entrypoint,
                       worker_param,
                       FALSE,
                       FALSE ) )

    ::::::::::
}

But once the number of threads goes above 86, pthread_create() fails with rc 11.
What modification would you suggest here to avoid this error?
Sham

The problem is either that your system could not support more than 86 threads at that time (due to memory limits, for example), or that the system's thread limit was reached.

Either way, 86 threads sounds like a lot. Are you sure it's optimal to have a separate thread for every accepted connection ?

'11' is EAGAIN (see errno.h) and according to the docs (https://computing.llnl.gov/tutorials/pthreads/#CreatingThreads and https://computing.llnl.gov/tutorials/pthreads/man/pthread_create.txt) this means

ERRORS
       The pthread_create() function shall fail if:

       EAGAIN The  system  lacked  the  necessary  resources to create another
              thread, or the system-imposed  limit  on  the  total  number  of
              threads in a process {PTHREAD_THREADS_MAX} would be exceeded.
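
A minimal sketch of checking for this condition, assuming the thread pool ends up calling pthread_create() directly. Note that pthread_create() returns the error number itself rather than setting errno. The names try_start_worker and noop_worker are hypothetical and only there for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical worker entry point, used only for illustration. */
static void *noop_worker(void *arg)
{
    return arg;
}

/*
 * Hypothetical helper: try to start one detached worker thread.
 * Returns 0 on success, otherwise the error number reported by
 * pthread_create() (returned directly, not stored in errno).
 */
static int try_start_worker(void *(*entry)(void *), void *arg)
{
    pthread_t tid;
    int rc;

    assert(entry != NULL);
    rc = pthread_create(&tid, NULL, entry, arg);
    if (rc == EAGAIN) {
        /* Resource shortage or thread limit reached: the caller can
         * queue the work or retry later instead of giving up. */
        fprintf(stderr, "pthread_create: EAGAIN (%s)\n", strerror(rc));
        return rc;
    }
    if (rc != 0) {
        fprintf(stderr, "pthread_create failed: %s\n", strerror(rc));
        return rc;
    }
    pthread_detach(tid);   /* nobody will join this thread */
    return 0;
}
```

A caller could use it as try_start_worker(noop_worker, NULL) and treat an EAGAIN result as "pool is at its limit" rather than as a fatal error.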

ASKER

Hello
I already know the reason why pthread_create() fails.
Is there an alternative to creating a thread for every accept?
Sham

Since you are apparently using a thread pool, that pool should take care of not creating a new thread on every 'accept()' - try adjusting the thread pool's size to use fewer than 86 threads.

Yes, but which alternative you use depends on your use case.

What kind of workload does one connection generate ? What does your server do ?
How many concurrent connections does it need to support ? What's the average lifetime of a connection ?

ASKER

Hello infinity

"What kind of workload does one connection generate ?"
For each thread created, there will be send() and recv() calls running between client and server for at least 5 minutes.

"What does your server do?"
On the server side, each thread receives commands from the clients, performs some logic internally, and sends the reply back to the client.

"How many concurrent connections does it need to support ?"
The server needs to maintain 500 concurrent connections at a time.
assignWork() checks whether any thread is available in the thread pool; if not, it increases the pool size and then provides a new thread to the worker.

" What's the average lifetime of a connection ? "
It is 5 minutes, according to the logs.

Sham

ASKER

hello infinity
Regarding the time of 5 minutes: I will check and confirm once again.
It is probably due to heavy traffic on port 1723, which the server and clients communicate on.
Sham

ASKER

Hello infinity
For your question: "What's the average lifetime of a connection ?"
The attached file shows that after time stamp 06:10:02 the load increases, and from then on there are 5 minutes between
worker_entrypoint(): Entered
and
worker_entrypoint(): Leaving

I hope I have answered all your questions.

Sham

ASKER

Hello jkr
I cannot constrain the thread pool's size, because the threads are created in response to accept() calls.
I have to keep waiting on accept() despite the huge number of connection requests.
Sham

ASKER

Hello infinity
I have attached the file
Sham

ccijimd.txt

Well, then you should check out how to increase PTHREAD_THREADS_MAX for your system. As far as I can see, that can be done by recompiling the pthread library.

ASKER

hello jkr
This is outside my scope, and if the number of threads per process is increased, performance will go down.
Sham

ASKER

hello infinity

More info...

int main( int argc, char * argv[] )
{
    int rc;
    /* everything is initialized, kick off the server thread */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)server_entrypoint,
                       NULL,
                       FALSE,
                       FALSE ) )
    :
}

DWORD WINAPI server_entrypoint( LPVOID param )
{
    int new_client_fd;
    /* the main loop */
    while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 )
    {

        memcpy( worker_param, & new_client_fd, sizeof(int) );

        /* kick off a thread to service the client */
        if ( ! assignWork( jimd_threadpool_p,
                           (LPTHREAD_START_ROUTINE)worker_entrypoint,
                           worker_param,
                           FALSE,
                           FALSE ) )
    } //end while
    ::
}

So the accept loop is running on a thread.
Sham

ASKER CERTIFIED SOLUTION

ASKER

hello infinity
" If no worker is available, simply queue the command"
I already implemented this logic, but on the client side the connect() call must not fail.
Always an accept call should be available.

Can't we fork the process and then use threads within the child processes? I mean that one process should handle the load of every 30 accepts, with each process running 30 worker threads.
My update 36896710 describes the current design.

Sham

>> Always an accept call should be available.

And it will be, because the main thread is doing that.

Connecting is a cheap operation. The main thread accepts the connection and stores it in some central storage. You can then use 'select', for example (http://linux.die.net/man/2/select), to decide which sockets have a command waiting, and handle each one (also in the main thread) by pushing it onto the worker queue.
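
A rough sketch of one iteration of such a main-thread loop, under stated assumptions: the client_table layout, MAX_CLIENTS, and the function name poll_clients are made up for illustration, and every descriptor is assumed to be below FD_SETSIZE. It only reports which client sockets are readable; what to do with them is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define MAX_CLIENTS 512   /* assumption: large enough for 500 connections */

/* Central storage for accepted connections (hypothetical layout). */
struct client_table {
    int    fds[MAX_CLIENTS];
    size_t count;
};

/*
 * Wait until the listening socket or any client socket is readable.
 * Client sockets with data waiting are copied into ready[]; a pending
 * connection, if any, is accepted and stored in the table.
 * Returns the number of ready clients, or -1 if select() failed.
 */
static int poll_clients(int listen_fd, struct client_table *t,
                        int *ready, struct timeval *timeout)
{
    fd_set read_fds;
    int fdmax = listen_fd;
    int nready = 0;
    size_t i;

    assert(t != NULL && ready != NULL);
    FD_ZERO(&read_fds);
    FD_SET(listen_fd, &read_fds);
    for (i = 0; i < t->count; i++) {
        FD_SET(t->fds[i], &read_fds);
        if (t->fds[i] > fdmax)
            fdmax = t->fds[i];
    }

    if (select(fdmax + 1, &read_fds, NULL, NULL, timeout) < 0)
        return -1;

    /* Existing clients that have something to read. */
    for (i = 0; i < t->count; i++)
        if (FD_ISSET(t->fds[i], &read_fds))
            ready[nready++] = t->fds[i];

    /* A readable listening socket means accept() will not block. */
    if (FD_ISSET(listen_fd, &read_fds)) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0) {
            if (t->count < MAX_CLIENTS && fd < FD_SETSIZE)
                t->fds[t->count++] = fd;
            else
                close(fd);   /* no room to track this client */
        }
    }
    return nready;
}
```

The main thread would call this in a loop and hand each descriptor in ready[] to whatever reads and queues the command. No thread is created per connection at any point.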

ASKER

hello infinity
What if more than 86 threads are supposed to be processing at a time?
Maybe I am still not clear.
Sham

ASKER

hello infinity
Are you asking me to sequentially monitor, say, 300 accepted sockets, and spawn a worker for whichever is ready?
Sham

>> Are you asking me to sequentially monitor, say, 300 accepted sockets, and spawn a worker for whichever is ready?

Pretty much, yes.

But you don't need to actually loop over all sockets, if you use something like 'select', which will return only those sockets that need your attention.

ASKER

hello infinity
Can you give a code snippet for select() on multiple accepted sockets?

Sham

There is a code sample in the link I pasted earlier.

A more extensive example can be found here :

http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#select

ASKER

hello infinity
So I will select() on all the accepted sockets and find out which ones are busy.
After select(), if more than 86 accepted sockets are busy, what do I need to do?
Sham

Queue all commands in the worker queue.

If your worker pool gets overloaded, you need to do some fine-tuning.

Note that this approach will only work well if the work load of a single command is relatively low. And it'll work even better if the command rate isn't too high.
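
To make the worker queue idea concrete, here is a small sketch of a fixed-size, thread-safe command queue built on a pthread mutex and condition variable. The struct layout, QUEUE_CAPACITY, and the function names are assumptions for illustration, not something taken from the code posted in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 1024   /* assumed upper bound on pending commands */

/* One pending command: the connection it came from and what it asks for. */
struct command {
    int  client_fd;
    char name[8];             /* e.g. "init", "term", "inqy" */
};

/* Ring buffer shared by the main thread (producer) and the workers. */
struct command_queue {
    struct command  items[QUEUE_CAPACITY];
    size_t          head, count;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
};

static void queue_init(struct command_queue *q)
{
    assert(q != NULL);
    q->head = q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Called by the main thread. Returns 0 on success, -1 if the queue is full. */
static int queue_push(struct command_queue *q, const struct command *cmd)
{
    int rc = -1;

    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAPACITY) {
        q->items[(q->head + q->count) % QUEUE_CAPACITY] = *cmd;
        q->count++;
        pthread_cond_signal(&q->not_empty);   /* wake one waiting worker */
        rc = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Called by a worker. Blocks until a command is available. */
static void queue_pop(struct command_queue *q, struct command *out)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    pthread_mutex_unlock(&q->lock);
}
```

A full queue (queue_push returning -1) is the signal that the worker pool is overloaded and needs tuning, for example more workers or a larger capacity.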

ASKER

hello infinity
These commands in turn actually do send() and recv().
Sham

ASKER

hello infinity
It was my mistake to describe them as commands; the server just does send() and recv() with the clients.
For this scenario the queue logic will not work.
Sham

ASKER

hello infinity
What do you think the solution would be if the server needs to send() to or recv() from the client?
Sham

>> What do you think the solution would be if the server needs to send() to or recv() from the client?

What is it sending and receiving ?

Or to get back to the question I asked originally :

>> What kind of workload does one connection generate ? What does your server do ?

ASKER

Hello infinity

The client sends a command (init / term / inqy/ recv / term) and the server receives it.
When the server receives the init command, it does one send() and one recv() call.
When the server receives the term command, it does one recv() call.
When the server receives the inqy command, it does one recv() call.

worker_entrypoint() performs the above operations in a thread.

Sham

ASKER

Hello infinity
In addition to my previous answer 36901440,

For your query: "What kind of workload does one connection generate ?"

If I review the logs, each worker_entrypoint() on the server side takes 5 minutes.

06:10:02 2011      server_entrypoint():thread_count: =19
06:10:02 2011::worker_entrypoint      worker_entrypoint(): Entered
06:15:28 2011::worker_entrypoint      worker_entrypoint(): Leaving

Sham

Workload refers to how much CPU time and/or I/O wait time and/or idle time is consumed for each connection.

Btw, what you describe are still simple request-response pairs, so why wouldn't that work with a worker pool ?

ASKER

Hello infinity
For your query:
"why wouldn't that work with a worker pool ? "
For each accept call, We launch a thread to respond.
If we have more than 86 accept(), then this is problem.

Sham

ASKER

Hello infinity
For your query:
"why wouldn't that work with a worker pool ? "
If you look at the code below, a thread is started for each accept() call.
If the server receives more than 86 accept() calls, then more than 86 threads are created in the process.

int main( int argc, char * argv[] )
{
    int rc;
    /* everything is initialized, kick off the server thread */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)server_entrypoint,
                       NULL,
                       FALSE,
                       FALSE ) )
    :
}

DWORD WINAPI server_entrypoint( LPVOID param )
{
    int new_client_fd;
    /* the main loop */
    while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 )
    {

        memcpy( worker_param, & new_client_fd, sizeof(int) );

        /* kick off a thread to service the client */
        if ( ! assignWork( jimd_threadpool_p,
                           (LPTHREAD_START_ROUTINE)worker_entrypoint,
                           worker_param,
                           FALSE,
                           FALSE ) )
    } //end while
    ::
}

>> For each accept call, We launch a thread to respond.
>> If we have more than 86 accept(), then this is problem.

That is your CURRENT approach.

I'm talking about a different approach : a worker pool, rather than a thread pool.

Please read through all my responses in this thread again, because it seems I'm starting to repeat myself.

ASKER

hello infinity
Are you talking about running select() on multiple accepted sockets?
Sham

ASKER

A worker pool will create a lot of delay, because each accept() socket request takes 5 minutes.
Sham

Again, please read the comments I made earlier.

You don't use the worker threads for accepting connections (connections are accepted in the main thread) - you use them for running and responding to the commands.

ASKER

Hello infinity
Maybe I missed your comments. Can you please repeat them?
Sham

http:#36896712
http:#36897420

ASKER

Hello infinity
for your point:
"Since you say that the workload are simple command-response pairs, I'll assume that each command doesn't take much time (on average) - correct me if I'm wrong."

Most connections stay alive for 5 minutes, whether or not they actually use the connection.
Sham

Yes, the connections, but not the commands. The commands themselves don't last long, do they ?

ASKER

Hello infinity
A single-threaded server is accepting the connections, so that is no problem.
But conversing with the client after accepting the connection is the problem, because we create one separate thread for the conversation with each client.
If there are 1000 clients, 1000 threads need to be created. The following code shows that.

Instead of kicking off threads, what should I do?

while ( (new_client_fd = tc_getNewClientConnection( listen_fd )) != 0 )
{
    drop_client = 0;
    worker_param = NULL;
    memcpy( worker_param, & new_client_fd, sizeof(int) );
    /* kick off a thread to service the client */
    if ( ! assignWork( jimd_threadpool_p,
                       (LPTHREAD_START_ROUTINE)worker_entrypoint,
                       worker_param,
                       FALSE,
                       FALSE ) )

}
Sham

ASKER

Let me tell you, the client will wait for a response from the server for 30 seconds.

You're asking the same question over and over again. My answer hasn't changed. It's still the same : use a worker pool, instead of starting one thread per connection.

For more details, please read through this whole thread again very carefully.

ASKER

I made the modification according to what I understood by "worker pool".

I have taken 50 as the threshold.

Please find the modifications attached.

Sham
original.txt change.txt

ASKER

Hello infinity
In the attached code above, written for HP-UX,
SetEvent() and waitforsingleobject() are user-defined functions.
Please let me know if I did what you described.
Sham

ASKER

server_entrypoint() and queue_controlpoint() are different threads.

ASKER

Hello infinity
Do you want the worker_entrypoint() code to analyse my changes?
Sham

That code shows that you put the client socket in a queue.

I can't tell if the rest of the code implements the approach I described, because I haven't seen it.

You need to continuously monitor all client connections for activity. If there's activity, handle it. One of the activities would be a command, in which case, you queue the command for execution by the worker pool. When the worker pool has a free worker, it executes the command, and responds with the result.
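
As a sketch of the "handle the activity" step: once a socket is reported readable, a single recv() tells you whether the client sent something or went away. The enum and the function name read_activity are hypothetical, and the code assumes one whole command arrives per read, which real code would have to replace with proper message framing.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

enum activity { ACT_COMMAND, ACT_DISCONNECT, ACT_ERROR };

/*
 * Called for a socket that select() reported as readable.
 * Reads at most len - 1 bytes into buf and NUL-terminates them.
 * Assumption: one complete command arrives per read.
 */
static enum activity read_activity(int fd, char *buf, size_t len)
{
    ssize_t n;

    assert(buf != NULL && len > 1);
    n = recv(fd, buf, len - 1, 0);
    if (n > 0) {
        buf[n] = '\0';
        return ACT_COMMAND;      /* caller queues this for the worker pool */
    }
    if (n == 0)
        return ACT_DISCONNECT;   /* peer closed: caller closes and forgets fd */
    return ACT_ERROR;            /* caller should inspect errno */
}
```

On ACT_COMMAND the main thread would queue the command together with its socket; on ACT_DISCONNECT it would close() the socket and remove it from the set being monitored.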

ASKER

Hello infinity
Here is the code which executes the client requests.

Sham

original.txt change.txt

ASKER

Hello infinity
I am not just placing the client socket in the queue; I am also launching threads as long as the worker pool does not exceed 50. queue_controlpoint() handles this.
Sham

So, you still have one thread per connection, only now you limit the total number of threads to 50 ?

I'm not sure how that will help anything.

I don't think you've understood the concept of assigning COMMANDS to worker threads, rather than CONNECTIONS. A worker thread is supposed to execute ONE command, and then it becomes available again for another command (possibly for a different connection).
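
A simplified sketch of that idea: a fixed number of workers, each of which takes ONE command, executes it, and comes back for the next, regardless of which connection the command belongs to. Everything here (NUM_WORKERS, struct job, execute_command, run_pool) is hypothetical, and the "queue" is just a pre-filled array with a shared index to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4          /* fixed; does not grow with connections */

struct job {
    int connection_id;         /* connection the command belongs to */
    int done;                  /* set to 1 once a worker has executed it */
};

struct pool_state {
    struct job     *jobs;
    size_t          njobs;
    size_t          next;      /* index of the next unclaimed job */
    pthread_mutex_t lock;
};

/* Placeholder for the real work: run one command and send the reply. */
static void execute_command(struct job *j)
{
    j->done = 1;
}

/* Each worker claims ONE command, executes it, then looks for another. */
static void *worker_main(void *arg)
{
    struct pool_state *p = arg;

    for (;;) {
        struct job *j = NULL;

        pthread_mutex_lock(&p->lock);
        if (p->next < p->njobs)
            j = &p->jobs[p->next++];
        pthread_mutex_unlock(&p->lock);

        if (j == NULL)
            return NULL;       /* a real server would wait for more work */
        execute_command(j);
    }
}

/* Runs all jobs on at most NUM_WORKERS threads.
 * Returns 0 on success, or the pthread_create() error if no worker started. */
static int run_pool(struct job *jobs, size_t njobs)
{
    struct pool_state p;
    pthread_t tids[NUM_WORKERS];
    int started = 0, rc = 0, i;

    assert(jobs != NULL || njobs == 0);
    p.jobs = jobs;
    p.njobs = njobs;
    p.next = 0;
    pthread_mutex_init(&p.lock, NULL);

    for (i = 0; i < NUM_WORKERS; i++) {
        rc = pthread_create(&tids[started], NULL, worker_main, &p);
        if (rc != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_mutex_destroy(&p.lock);
    return started > 0 ? 0 : rc;
}
```

With 500 jobs, one per connection, the process still only ever has NUM_WORKERS worker threads, which is the point of assigning commands rather than connections to workers.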

ASKER

So I need to select() on each connection socket in turn, and if a request has come in on a socket, run the command?

Sham

ASKER

select(fdmax+1, &read_fds, NULL, NULL, NULL)
like this?

for example, yes

ASKER

When I say command, I mean nothing more than the server doing
a send() call to the client
and then a recv() from the client.
So what if the worker thread gets blocked in recv()?
Sham

We're not getting anywhere this way - I feel like we're going around in circles.

Either the workload is a series of commands that need to be executed (like you said earlier), or it's not.

If it's not, then the information you provided earlier was incorrect, and thus this whole thread was based on incorrect information.

If it is, then nothing has changed, and you're asking the same questions again, which will get the same answers.

ASKER

"Either the workload is a series of commands that need to be executed (like you said earlier), or it's not."
If the command is TC_CMDINIT, then we send() to client and recv() from client
If the command is TC_CMDTERM, then we send() recv() and close()

Sham

Then I'll ask again : what is it that your server is doing ?

I don't need to know what functions it calls, but rather what its purpose is, and how it achieves that.

ASKER

Hello infinity
I have raised another thread, ID: 27385108

I will answer your questions there
Sham

ASKER

Thanks