Problems with bind()--how to "unbind"??

barryc asked
Last Modified: 2013-12-26
I'm trying to write a daemon and I am hitting a few problems.
Right now the daemon has no problem binding to a port,
listening on the port, accepting the connection, and then
forking a child to handle the communication.  BUT, what I
would like the daemon to do is after handling N connections,
to then wait for its children to finish with any open
connections and then restart itself (just in case I have
any memory leaks or other resource wasting).  Well, I have
no problem waiting for the children to exit:
    while (wait(NULL) != -1);
but after I finish waiting I try to do:
    close(listenSD);  /* close socket used for listening */
    execvp(argv[0], argv);

this should replace the current process with a new copy of
itself and execute from the start, thereby freeing up any
wasted resources (at least on UNIX).  WELL, when the new
process starts it can't bind to the port because it claims
an errno=125 (Address already in use).  BUT, I've already
closed the socket AND ended the old process that was bound
to the port by calling the execvp() function.  It seems as
if the system is reserving the port for longer than
necessary.  If I try to restart the daemon by hand it
usually works, but when I try to exec the daemon from within
itself it _always_ tells me that the port is in use.  I have
even added code that attempts to bind for a specified number
of retries and sleeps in between retries:
    listenSD = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    for (i=0; i<=numRetries; i++) {
      err = bind(listenSD, (struct sockaddr*)&serverAddress,
                 sizeof(serverAddress));
      if (err==0) break;
      sleep(10);
    }
    if (i>numRetries) {
      fprintf(stderr, "%s: Could not bind to port.\n",
              argv[0]);
      exit(-1);
    }
     .
     .
     .
How can I make my daemon "unbind" from the port??? Or is
there another solution?
Thanks,
Barry M. Caceres
barryc@alumni.caltech.edu

ozo
CERTIFIED EXPERT
Most Valuable Expert 2014
Top Expert 2015

Commented:
My first thought was to try a sleep too, but you've already tried that.
How about trying a more decisive exit:
while (wait(NULL) != -1 || errno != ECHILD);
if (close(listenSD)) { perror("close(listenSD)"); }
if (fork()) {
        exit(EXIT_SUCCESS);
} else {
        sleep(10);
        execvp(argv[0], argv);
}

Author

Commented:
I've tried that too (forking, exiting the parent and
then using execvp--similar to system()).  I also tried the
following:
   /* before listening do a fork */
   childPID = fork();
   /* the parent waits for the child to complete and then
      restarts the server
   */
   if (childPID>0) {
     waitpid(childPID, NULL, 0);
     execvp(argv[0], argv);
     exit(0);
   }
   /* otherwise we need to listen */
   listenSD = socket(...);
   for (i=0;i<numRetries; i++) {
     if (bind(listenSD, ....)==0) break;
     sleep(10);
   }
   if (i==numRetries) {
      perror("failed bind");
      exit(-1);
   }
   listen(listenSD, 5);
   for (i=0; i<maxConnect; i++) {
     /* code that accepts, forks, and handle the
        connection */
   }
   exit(0);

This way the process that is exec'ing itself never even
bound to the port, and the waitpid() call ensures that
the process that was bound to the port had exited.  I
could add a sleep in there, but I have a sleep/retry loop
in the bind section--I can't see that making a difference.
Hmmmm..... I would think this would work, but no.

Author

Commented:
Adjusted points to 420
ozo

Commented:
Since we seem to have been thinking of the same things, which didn't work,
maybe you've also tried this too, but can you try to see when
restarting the daemon by hand doesn't work?
Perhaps that could give a clue to what's different when it does.
Commented:
Piece o' cake.  There are a couple of ways around this; the first is to wait for the TCP TIME_WAIT value, but that's about 5 minutes.  (BTW, you can see why you can't bind to your port by running "netstat -an".  Look for your port number.  You'll see that after a connection finishes, it hangs around in "TIME_WAIT" state for several minutes.)

The other way is to set the socket option SO_REUSEADDR:

      int opt = 1;
      setsockopt( sock, SOL_SOCKET, SO_REUSEADDR,
                  (char *)&opt, sizeof(opt) );

Do this before you bind() the address and you'll be all set.  The downside is that, on some OSs, SO_REUSEADDR sets things so that some *other* process can also bind the same address, resulting in much confusion all around.
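
For reference, here is a minimal sketch of where the setsockopt() call sits relative to socket(), bind() and listen(). The helper name open_listener() and the backlog of 5 are illustrative assumptions, not part of the original daemon, and error reporting is kept to a bare -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening TCP socket on `port`,
   or -1 if any step fails. */
int open_listener(uint16_t port)
{
    int sd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sd < 0)
        return -1;

    /* Must be set before bind(): lets bind() succeed while old
       connections on this port are still in TIME_WAIT. */
    int opt = 1;
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) < 0) {
        close(sd);
        return -1;
    }

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(sd, 5) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```

With something like this in place, the retry/sleep loop around bind() should no longer be needed for the TIME_WAIT case.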

The best solution is to open the socket as early as possible, and then fork off your children.  The parent just hangs out, waiting for children to finish.  When one does (after servicing a certain number of requests, or after a certain idle period, or whatever), the parent spawns a new child.  The child inherits the socket (which the parent always holds), and you have no issues with re-opening it.  Since the parent does so little work (and no memory allocation), it's easy to make sure it doesn't leak.  Another benefit is that, if your socket is on a privileged port, the parent can open it while running as root, and then do a setuid() to some safer UID -- the children can still inherit the port.  If you close() and then re-open the port, the parent (at least) has to continue to run as root.

dhm

Commented:
BTW, what I have described is a kind of parallel-processing daemon.  The parent's job (after creating the socket) is simply to keep enough children alive.  Each child has to accept() its own connections.  You should beware that on most OSs (DEC-Unix, IRIX, Linux, and HP-UX, in my experience) it works to have several processes blocked in an accept() on the same (inherited) socket.  However, on Solaris-2.5.1 (but not 2.4), if you have more than one process blocked in accept() on the same socket, a single connection appears to cause more than one process to wake up.  One of them completes the connection and continues on its merry way, but the others that were awakened go off to never-neverland.  They end up hung, and a debugger trace indicates that they're deep in some libc call tree, apparently waiting for the rest of the TCP handshake that the process that worked consummated.  Oh yes, and Sun is completely uninterested in the phenomenon.

The way I worked around it was to allocate a semaphore, which I use to lock the accept().  That way, even if several children wake up from select() to accept() a new connection, only one at a time will have a chance to try.  Presumably, the first one will get the connection; accept() for the rest will fail (you have to set O_NDELAY on the socket to get it to fail immediately).

I've encapsulated the whole daemon thing into a pair of C++ classes that I could probably be convinced to part with, if you really want to go whole-hog.

Author

Commented:
I tried that on IRIX 5.3 and it worked great.  Thanks!
Wish me luck when porting this thing to Solaris 2.5 and
SCO....

dhm

Commented:
Be careful on Solaris-2.5...I think that had the "multiple accepts cause grief" problem too.  (2.5.1 definitely has it, but 2.5 probably did too.)  If your parent just does accept-fork-accept-fork..., you're probably OK, but if you fork off a bunch of children as I described, you're in danger.  (The reason to pre-fork a bunch of children is that fork() is a fairly expensive call; it's better to do it before you have a client who's waiting for you to serve him.)