Problems with bind()--how to "unbind"??

I'm trying to write a daemon and I am hitting a few problems.
Right now the daemon has no problem binding to a port,
listening on the port, accepting the connection, and then
forking a child to handle the communication.  BUT, what I
would like the daemon to do is after handling N connections,
to then wait for its children to finish with any open
connections and then restart itself (just in case I have
any memory leaks or other resource wasting).  Well, I have
no problem waiting for the children to exit:
    while (wait(NULL) != -1);
but after I finish waiting I try to do:
    close(listenSD);  /* close socket used for listening */
    execvp(argv[0], argv);

this should replace the current process with a new copy of
itself and execute from the start, thereby freeing up any
wasted resources (at least on UNIX).  WELL, when the new
process starts it can't bind to the port because it claims
an errno=125 (Address already in use).  BUT, I've already
closed the socket AND ended the old process that was bound
to the port by calling the execvp() function.  It seems as
if the system is reserving the port for longer than
necessary.  If I try to restart the daemon by hand it
usually works, but when I try to exec the daemon from within
itself it _always_ tells me that the port is in use.  I have
even added code that attempts to bind for a specified number
of retries and sleeps in between retries:
    listenSD = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    for (i=0; i<=numRetries; i++) {
      err = bind(listenSD, (struct sockaddr*)&serverAddress,
                 sizeof(serverAddress));
      if (err==0) break;
      sleep(10);
    }
    if (i>numRetries) {
      fprintf(stderr, "%s: Could not bind to port.\n",
              argv[0]);
      exit(-1);
    }
     .
     .
     .
How can I make my daemon "unbind" from the port??? Or is
there another solution?
Thanks,
Barry M. Caceres
barryc@alumni.caltech.edu

ozo commented:
My first thought was to try a sleep too, but you've already tried that.
How about trying a more decisive exit:
while (wait(NULL) != -1 || errno != ECHILD);
if (close(listenSD)) { perror("close(listenSD)"); }
if (fork()) {
        exit(EXIT_SUCCESS);
} else {
        sleep(10);
        execvp(argv[0], argv);
}
barryc (author) commented:
I've tried that too (forking, exiting the parent and
then using execvp--similar to system()).  I also tried the
following:
   /* before listening do a fork */
   childPID = fork();
   /* the parent waits for the child to complete and then
      restarts the server
   */
   if (childPID>0) {
     waitpid(childPID, NULL, 0);
     execvp(argv[0], argv);
     exit(0);
   }
   /* otherwise we need to listen */
   listenSD = socket(...);
   for (i=0;i<numRetries; i++) {
     if (bind(listenSD, ....)==0) break;
     sleep(10);
   }
   if (i==numRetries) {
      perror("failed bind");
      exit(-1);
   }
   listen(listenSD, 5);
   for (i=0; i<maxConnect; i++) {
     /* code that accepts, forks, and handle the
        connection */
   }
   exit(0);

This way the process that is exec'ing itself never even
bound to the port, and the waitpid() call ensures that
the process that was bound to the port had exited.  I
could add a sleep in there, but I have a sleep/retry loop
in the bind section--I can't see that making a difference.
Hmmmm..... I would think this would work, but no.
ozo commented:
Since we seem to have been thinking of the same things, which didn't work,
maybe you've tried this too, but can you try to see when
restarting the daemon by hand doesn't work?
Perhaps that could give a clue to what's different when it does.
dhm commented:
Piece o' cake.  There are a couple of ways around this; the first is to wait for the TCP TIME_WAIT value, but that's about 5 minutes.  (BTW, you can see why you can't bind to your port by running "netstat -an".  Look for your port number.  You'll see that after a connection finishes, it hangs around in "TIME_WAIT" state for several minutes.)

The other way is to set the socket option SO_REUSEADDR:

      int opt = 1;
      setsockopt( sock, SOL_SOCKET, SO_REUSEADDR,
                  (char *)&opt, sizeof(opt) );

Do this before you bind() the address and you'll be all set.  The downside is that, on some OSs, SO_REUSEADDR sets things so that some *other* process can also bind the same address, resulting in much confusion all around.
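
For reference, here is a minimal sketch of the whole socket/setsockopt/bind/listen sequence with error checks. It is not taken from anyone's actual daemon: open_listener is a made-up helper name, and binding to INADDR_ANY with a backlog of 5 is an assumption that simply mirrors the snippets earlier in the thread.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): create a TCP socket,
 * set SO_REUSEADDR, then bind and listen on the given port.
 * Returns the listening descriptor, or -1 on failure. */
int open_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int opt = 1;
    int sd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

    if (sd < 0) {
        perror("socket");
        return -1;
    }

    /* The option has to be set before bind() for it to affect the bind. */
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) < 0) {
        perror("setsockopt(SO_REUSEADDR)");
        close(sd);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* assumption: all interfaces */
    addr.sin_port = htons(port);

    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        close(sd);
        return -1;
    }
    if (listen(sd, 5) < 0) {                    /* backlog of 5, as in the thread */
        perror("listen");
        close(sd);
        return -1;
    }
    return sd;
}
```

The daemon would call something like listenSD = open_listener(port) once at startup, in place of the bind/sleep/retry loop.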

The best solution is to open the socket as early as possible, and then fork off your children.  The parent just hangs out, waiting for children to finish.  When one does (after servicing a certain number of requests, or after a certain idle period, or whatever), the parent spawns a new child.  The child inherits the socket (which the parent always holds), and you have no issues with re-opening it.  Since the parent does so little work (and no memory allocation), it's easy to make sure it doesn't leak.  Another benefit is that, if your socket is on a privileged port, the parent can open it while running as root, and then do a setuid() to some safer UID -- the children can still inherit the port.  If you close() and then re-open the port, the parent (at least) has to continue to run as root.
dhm commented:
BTW, what I have described is a kind of parallel-processing daemon.  The parent's job (after creating the socket) is simply to keep enough children alive.  Each child has to accept() its own connections.  You should beware that on most OSs (DEC-Unix, IRIX, Linux, and HP-UX, in my experience) it works to have several processes blocked in an accept() on the same (inherited) socket.  However, on Solaris-2.5.1 (but not 2.4), if you have more than one process blocked in accept() on the same socket, a single connection appears to cause more than one process to wake up.  One of them completes the connection and continues on its merry way, but the others that were awakened go off to never-neverland.  They end up hung, and a debugger trace indicates that they're deep in some libc call tree, apparently waiting for the rest of the TCP handshake that the process that worked consummated.  Oh yes, and Sun is completely uninterested in the phenomenon.

The way I worked around it was to allocate a semaphore, which I use to lock the accept().  That way, even if several children wake up from select() to accept() a new connection, only one at a time will have a chance to try.  Presumably, the first one will get the connection; accept() for the rest will fail (you have to set O_NDELAY on the socket to get it to fail immediately).

I've encapsulated the whole daemon thing into a pair of C++ classes that I could probably be convinced to part with, if you really want to go whole-hog.
barryc (author) commented:
I tried that on IRIX 5.3 and it worked great.  Thanks!
Wish me luck when porting this thing to Solaris 2.5 and
SCO....
dhm commented:
Be careful on Solaris-2.5...I think that had the "multiple accepts cause grief" problem too.  (2.5.1 definitely has it, but 2.5 probably did too.)  If your parent just does accept-fork-accept-fork..., you're probably OK, but if you fork off a bunch of children as I described, you're in danger.  (The reason to pre-fork a bunch of children is that fork() is a fairly expensive call; it's better to do it before you have a client who's waiting for you to serve him.)