• Status: Solved

accept() hangs at WAIT_CLOSE

I'm running a server listening on port 901 on Solaris 2.5.1.
For some unknown reason, the server gets into a state where
the listening socket hangs at accept().

When a client tries to connect to the server, it never gets
a response back. After I kill the client and
do a "netstat -a | grep 901", I always see a listing like:

appultra9:901    ..... 0 WAIT_CLOSE

Any idea why this is happening? How can I find out more
information from the system? Is there any socket setup
I need to use to keep this from happening?

1 Solution
Have you checked the socket option SO_REUSEADDR?
See man setsockopt for details.
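
For reference, a minimal sketch of setting that option. The helper name enable_reuseaddr is made up for illustration and is not from the original poster's code; the call belongs after socket() and before bind().

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (name is an assumption): turn on SO_REUSEADDR for a
 * socket that has been created but not yet bound.
 * Returns 0 on success, -1 on error with errno set by setsockopt(). */
int enable_reuseaddr(int sock_fd)
{
    int on = 1;

    /* The cast keeps older prototypes that take a char pointer happy. */
    return setsockopt(sock_fd, SOL_SOCKET, SO_REUSEADDR,
                      (const char *)&on, sizeof on);
}
```

This only affects whether bind() succeeds while old connections on the port are still winding down; it does not close or clean up any existing connection.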
vluewjl (Author) commented:

Yes, it does have SO_REUSEADDR set. If a client tries to connect, netstat
shows the connection as established. But WE THINK the server's main thread never
returns from accept(). When the client gets killed, the socket is left half closed ...
The funny thing is that although the server listens on port 901, netstat also shows a second, idle entry for that port:
      *.901                *.*                0      0     0      0 LISTEN
      *.901                *.*                0      0 38056      0 IDLE

If we run "truss" on the process, it shows:

    sigtimedwait(0xeffff414, 0x00000000, 0x00000000) (sleeping ...)

instead of accept(..)..

Any idea ??




Sounds similar to a problem I'm struggling with:
  I also have to wait about 10 seconds before a port can be reused. It's not that new connections are rejected, they just wait ...
No more ideas :(

Basically, when a client terminates or is lost, the TCP connection is left in the CLOSE_WAIT (or WAIT_CLOSE?) state. This means the client end is gone and TCP is waiting for the application to close the local file descriptor that was opened for that client. The state clears when the application closes that descriptor; if it never does, you have to kill the application listening on the port.

When you then try to restart the application that listens on the port, it fails with a message such as "address in use" or "port in use". After a couple of minutes this clears up and you can reuse the port. That delay is the timeout that keeps a new program from receiving packets intended for the old one. You can inspect and change it with the "ndd" command. The value is in milliseconds, and the default for /dev/tcp is 240000 (4 minutes).
ndd /dev/tcp
name to get/set ? tcp_close_wait_interval
value ?
length ?

You can set it with: ndd -set /dev/tcp tcp_close_wait_interval 120000

The server connection enters the CLOSE_WAIT state after it
receives a FIN from the client.  This tells the server that
it can continue to send data to the client, but the client
will not be sending any more data.  The client would
normally do this by calling close() or shutdown().  To clear
this state the local server would close() its socket, which
would cause a FIN to be sent and change the state to
LAST_ACK until the client ACKs the server's FIN.  The
connection would then be CLOSED.  If you look on the client
you should see the connection in the FIN_WAIT_2 state until
the server calls close().

If you kill the server, the socket will be closed and the state
will be cleared. SO_REUSEADDR is a good idea, but will not
solve this problem.  The problem is that your code is not
handling the socket properly: you are not closing the
socket after the client closes its end.  If you were reading
from the socket you would get an EOF.
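
To make the EOF point concrete, here is a rough sketch of the per-connection handling, assuming a blocking connected socket. The name drain_and_close is invented for this example; a real server would process the data instead of discarding it.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (name is an assumption): read from a connected socket
 * until the peer closes its end, then close our end so the connection can
 * leave CLOSE_WAIT.
 * Returns the number of bytes read before EOF, or -1 on a receive error.
 * The descriptor is closed in both cases. */
long drain_and_close(int conn_fd)
{
    char buf[512];
    long total = 0;

    for (;;) {
        ssize_t n = recv(conn_fd, buf, sizeof buf, 0);

        if (n > 0) {
            total += n;        /* data arrived; this sketch just counts it */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;          /* interrupted by a signal: retry */

        /* n == 0 means EOF (the peer sent its FIN); n < 0 is a real error.
         * Either way we are done with this connection, so close it. */
        close(conn_fd);
        return (n == 0) ? total : -1;
    }
}
```

The important part is the close() on every exit path. Without it the descriptor stays open and netstat keeps showing the connection in CLOSE_WAIT.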

It sounds to me like you are filling your listen queue and
that is why later connections seem to just connect and
freeze.  Try playing with different arguments to listen().
If you would like, send me your email address offline and I will send you a simple example server that I use as a
starting point.
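
As a rough illustration of the listen queue (this is not the example server mentioned above), here is a sketch of a listener with an explicit backlog and an accept loop that gets back to accept() quickly. The names open_listener and serve_clients and the "hello" greeting are made up. The kernel completes the TCP handshake for queued clients on its own, which is why netstat can show a connection as established while the application has not accepted it yet.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP listener on `port` (0 = any free port).
 * `backlog` is the queue-length hint passed to listen().
 * Returns the listening descriptor, or -1 on error. */
int open_listener(unsigned short port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: accept `max_clients` connections one after another,
 * send each a short greeting, and close it right away so the loop returns
 * to accept() and the queue keeps draining.
 * Returns the number of clients served, or -1 if accept() fails. */
int serve_clients(int listen_fd, int max_clients)
{
    static const char greeting[] = "hello\n";
    int served = 0;

    while (served < max_clients) {
        int conn_fd = accept(listen_fd, NULL, NULL);

        if (conn_fd < 0) {
            if (errno == EINTR)
                continue;      /* interrupted by a signal: retry */
            return -1;
        }
        if (send(conn_fd, greeting, sizeof greeting - 1, 0) < 0) {
            /* Client already gone; nothing to do except close below. */
        }
        close(conn_fd);        /* always release the connected socket */
        served++;
    }
    return served;
}
```

If the per-client work in such a loop blocks for a long time, new clients pile up in the queue and appear to connect and then freeze, which matches the symptom described.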

I can't really say more without more detail.  I would NOT
modify any of the TCP/IP settings in the OS.  This is usually
a bad idea unless you really know what you are doing.

email: davidc@guild.ab.ca
The socket spawned by accept() should be handled carefully. When it gets an error or a close from the client (i.e. when recv returns <= 0 bytes), you should explicitly close the socket using close() (or soclose(), where the stack provides it). Until then the socket will stay in the CLOSE_WAIT state. Are you closing the spawned socket?
