fork() problem

Posted on 2007-07-30
Medium Priority
Last Modified: 2013-12-27

I have a C program running on Solaris that calls fork().
The problem is that fork() returns a positive number, but errno is set to 10 (ECHILD).
waitpid() then fails with the same error since, apparently, the child process was never actually created.
Why doesn't fork() return -1?
None of the code examples I have seen handle such a case.

Question by:nivo_Z

Assisted Solution

ravs120499 earned 160 total points
ID: 19591726
First, what is the data type of the variable to which you assign the pid returned by fork()? If it is unsigned (watch out for implicit conversions as well), then -1 will look like a large positive number.
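To illustrate the point about unsigned storage, here is a minimal sketch. The helper names `store_in_unsigned` and `fork_failed` are made up for this example and do not come from the poster's program; the sketch only shows what happens to the value -1 under each type.

```c
#include <limits.h>
#include <sys/types.h>

/* Hypothetical helper: mimics assigning fork()'s return value to an
   unsigned variable. The conversion wraps -1 around to ULONG_MAX. */
unsigned long store_in_unsigned(pid_t ret)
{
    return (unsigned long)ret;
}

/* Hypothetical helper: the check that works, because pid_t is a
   signed type and the comparison is done before any conversion. */
int fork_failed(pid_t ret)
{
    return ret == (pid_t)-1;
}
```

With these, `store_in_unsigned((pid_t)-1)` compares greater than zero, so a test such as `pid > 0` would wrongly take the parent branch, while `fork_failed((pid_t)-1)` still reports the failure.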

If that is not an issue, then: is the fork() happening in a multi-threaded environment?

- Ravs

Author Comment

ID: 19591794
The type is pid_t, and most of the time it works fine; the problem happens only once in a while.

Assisted Solution

Brian Utterback earned 160 total points
ID: 19592588
What makes you think that the fork did not really create a child process? If the return value of fork() is positive, you should not look at errno. The errno variable is not cleared on each call, so unless -1 is returned its value is whatever the last error in your program left there. Library calls can set errno internally in ways you do not know about, so you should never look at it unless an error was returned. Further evidence of this is the fact that ECHILD is not an expected error from the fork call.
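The rule above (consult errno only when -1 comes back) can be sketched as follows. The function name `spawn_exiting_child` and its `err` out-parameter are assumptions made for this example, not part of the poster's program. The sketch deliberately plants a stale ECHILD in errno before the call to show that the value is simply never consulted on the success path.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits at once with `code`.
   Returns the child's pid in the parent, or -1 with *err set to the
   errno value reported by fork(). errno is read only on the -1 path. */
pid_t spawn_exiting_child(int code, int *err)
{
    errno = ECHILD;          /* simulate a stale value from earlier code */

    pid_t pid = fork();
    if (pid == -1) {
        *err = errno;        /* meaningful only because fork() failed */
        return -1;
    }
    if (pid == 0)
        _exit(code);         /* child: leave without flushing stdio twice */

    *err = 0;                /* success: errno is deliberately ignored */
    return pid;
}
```

A caller would then test the return value first and look at `err` only when it is -1; on success the child can be reaped with waitpid() as usual.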

You do not say what the positive return value from fork is. Does it look like a reasonable pid
number? Is it fairly close to the pid of the parent? Have you tried using the truss command to
see if a new child process is created?

If you are using Solaris 10, you could use dtrace to try to see what is going on.

If the child process changes its process group and then exits, you might be having a
race condition where the child exits before you call waitpid. If the child does
not change its process group, then even if it did exit fast, the return status should
be available and the waitpid would not fail.
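As a rough check of the claim that a fast-exiting child's status stays available, the sketch below forks a child that exits immediately and reaps it only after a delay. The name `reap_after_delay` is invented for this example. It assumes SIGCHLD has its default disposition and sets that explicitly before fork(); if SIGCHLD were ignored (SIG_IGN), POSIX lets the system discard the child's status, and waitpid() would then fail with ECHILD.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits immediately with
   `code`, waits a second, then reaps it. Returns the child's exit
   code, or -1 if fork() or waitpid() failed. */
int reap_after_delay(int code)
{
    /* Assumption: default SIGCHLD handling, set before the fork so
       the child cannot exit while another disposition is in force. */
    signal(SIGCHLD, SIG_DFL);

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);

    sleep(1);                /* the child has very likely exited by now */

    int status;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);   /* retry if interrupted */

    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling `reap_after_delay(7)` should hand back the child's exit code even though the child finished long before the parent asked for it.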

You might try posting the code that you use to do the fork. Someone here might see something
in it.

Author Comment

ID: 19598841

I know that the child process was not created because after the fork, before the switch on the pid, I print a message to the log. This message should appear twice, once printed by the parent and once by the child. Most of the time I do get these 2 messages, but once in a while I get only the parent message.
Also, I set errno to 0 right before the fork.
Here is the relevant code:

   errno = 0;
   pid = fork();
   printolog("%s: %s - fork return: %d. errno: %d", func_nm, (pid) ? "parent" : "child", pid, errno);

   signal(SIGCHLD, SIG_DFL);

   switch (pid) {
     case -1:
         /* code */
     case 0:   /* child */
         /* code */
     default: /* parent */
         r = waitpid(pid, &status, 0);
         /* some more code */
   }  /* end switch(pid) */


Accepted Solution

PCableGuy earned 180 total points
ID: 19696967
Hi nivo Z,

I might be wrong, but I'm wondering if the parent and the child are printing a message to the same log file (which would make the log file a shared resource). If so, there might sometimes be a conflict between the parent and child processes.
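One way to reduce that kind of conflict, sketched here under assumptions, is to emit each log line with a single write() on a descriptor opened with O_APPEND, so that parent and child do not overwrite each other's output. The thread does not show how printolog() is implemented, so `log_line` and `fork_and_log` below are hypothetical stand-ins, not the poster's code.

```c
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for printolog(): formats one line and emits
   it with a single write() call. Returns 0 on success, -1 on error. */
int log_line(int fd, const char *fmt, ...)
{
    char buf[512];
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(buf, sizeof buf - 1, fmt, ap);  /* room for '\n' */
    va_end(ap);
    if (n < 0)
        return -1;
    if ((size_t)n > sizeof buf - 2)
        n = (int)(sizeof buf - 2);                    /* line was truncated */
    buf[n++] = '\n';
    return write(fd, buf, (size_t)n) == n ? 0 : -1;
}

/* Hypothetical demo: parent and child each log one line to `path`.
   Returns the number of lines found in the file afterwards, or -1. */
int fork_and_log(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_APPEND, 0600);
    if (fd == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd);
        return -1;
    }
    log_line(fd, "%s: fork returned %ld", pid ? "parent" : "child", (long)pid);
    if (pid == 0)
        _exit(0);

    waitpid(pid, NULL, 0);   /* make sure the child has written its line */
    close(fd);

    /* Count the newline characters now in the file. */
    char buf[1024];
    ssize_t got;
    int lines = 0;
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;
    while ((got = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < got; i++)
            if (buf[i] == '\n')
                lines++;
    close(fd);
    return lines;
}
```

If the real logger goes through buffered stdio instead, flushing the stream before fork() would be another thing to check, since unflushed data is copied into the child along with the rest of the process.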
