I had a process which dumped core, not because of any "fault" (segmentation fault, illegal instruction etc.) but because I "told" it to do so by sending the SIGQUIT signal.
Now, I'm trying to "restart" the process from the state it was in when the core was dumped.
I try to do this as follows:
1) There is a process (say P) which reads in register values, memory dumps etc. from the core file.
2) P does a fork(); the child does a PTRACE_TRACEME before exec()ing the original executable.
3) P then stops the child at every instruction (PTRACE_SINGLESTEP) and waits for the child to reach the address of main() [P checks the eip register of the child after every instruction].
4) Now that the child is stopped at main(), P overwrites the child's address space with the values from the core file. This includes the stack (which is also present in the core file).
5) P also sets the registers of the child to the values found in the core file.
6) P does a PTRACE_DETACH.
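Steps 2, 3 and 6 above can be sketched roughly as follows. This is only a minimal sketch, assuming Linux on x86 (32- or 64-bit); the names read_pc and step_to_address are made up for illustration and are not from my actual program, and steps 4 and 5 are only marked with a comment.

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the child's program counter (eip on 32-bit x86, rip on x86-64). */
static unsigned long read_pc(pid_t pid)
{
#if defined(__i386__) || defined(__x86_64__)
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return 0;
#if defined(__i386__)
    return (unsigned long)regs.eip;
#else
    return (unsigned long)regs.rip;
#endif
#else
    (void)pid;
    return 0; /* other architectures are not handled in this sketch */
#endif
}

/*
 * Start `path` under ptrace and single-step it until the program counter
 * equals `target` or `max_steps` instructions have run, then detach.
 * Returns the child's pid (still to be waited for), or -1 on error.
 * *steps receives the number of single steps performed.
 */
pid_t step_to_address(const char *path, unsigned long target,
                      long max_steps, long *steps)
{
    int status;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Step 2: the child asks to be traced, then execs the program. */
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execl(path, path, (char *)NULL);
        _exit(127); /* exec failed */
    }
    /* The child stops with SIGTRAP once exec() has completed. */
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* Step 3: single-step, checking the program counter each time. */
    *steps = 0;
    while (*steps < max_steps && read_pc(pid) != target) {
        if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
            return -1; /* child exited or was killed while stepping */
        (*steps)++;
    }

    /* Steps 4 and 5 (restoring memory and registers) would go here. */

    /* Step 6: let the child run freely. */
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
        return -1;
    return pid;
}
```

The caller would pass the address of main() as `target` (taken from the executable's symbol table, for instance) and later waitpid() on the returned pid.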
In theory I believe this should have been sufficient to get the process to restart in the state it was in when core was dumped.
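For step 4, one way to write a block of bytes into the stopped child is word by word with PTRACE_POKEDATA. Again this is just a sketch assuming Linux; poke_bytes, poke_demo and demo_buf are hypothetical names used only for this illustration. The demo forks a child that stops itself, overwrites a buffer in it, and lets the child check the result after the detach.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copy len bytes from buf into the stopped tracee at addr, one word at a time. */
int poke_bytes(pid_t pid, unsigned long addr, const void *buf, size_t len)
{
    const unsigned char *src = buf;
    size_t off = 0;

    while (off < len) {
        long word = 0;
        size_t n = (len - off < sizeof word) ? len - off : sizeof word;

        if (n < sizeof word) {
            /* Partial last word: fetch it first so the trailing bytes
             * in the tracee are preserved. PEEKDATA can legitimately
             * return -1, so errno has to be checked as well. */
            errno = 0;
            word = ptrace(PTRACE_PEEKDATA, pid, (void *)(addr + off), NULL);
            if (word == -1 && errno != 0)
                return -1;
        }
        memcpy(&word, src + off, n);
        if (ptrace(PTRACE_POKEDATA, pid, (void *)(addr + off),
                   (void *)word) == -1)
            return -1;
        off += n;
    }
    return 0;
}

/* Buffer that the demo overwrites inside the child. */
char demo_buf[16] = "AAAAAAAAAAAAAAA";

/* Returns 0 if the child saw the poked bytes after being detached. */
int poke_demo(void)
{
    int status;
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        raise(SIGSTOP); /* stop so the parent can modify our memory */
        _exit(memcmp(demo_buf, "core", 5) == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    /* After fork(), demo_buf has the same address in parent and child. */
    if (poke_bytes(pid, (unsigned long)demo_buf, "core", 5) == -1)
        return -1;
    if (ptrace(PTRACE_DETACH, pid, NULL, NULL) == -1)
        return -1;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that PTRACE_POKEDATA only succeeds if the destination page is already mapped in the child, so a segment from the core file that does not exist in the freshly exec()ed child cannot be restored this way.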
Unfortunately, when I try this, the child executes many instructions but eventually hits a segmentation fault and dumps core (this time it wasn't intentional!). The fault seems to occur the first time the child executes an instruction in its "own" memory area (address 0x804ba84), and this address seems to be invalid.
Does anyone have any ideas about possible flaws in the method stated above? Or what could be causing the child to jump to an invalid address?
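One check that might narrow this down is to look up the faulting address in the child's memory map before detaching. The sketch below assumes Linux with /proc mounted; find_mapping is a made-up helper name, and the parsing is deliberately simple (it assumes each line of the maps file fits in the buffer).

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Look up addr in /proc/<pid>/maps.
 * Returns 1 and fills perms (e.g. "r-xp") if addr lies inside a mapping,
 * 0 if no mapping contains it, -1 if the maps file cannot be opened.
 */
int find_mapping(pid_t pid, unsigned long addr, char perms[5])
{
    char path[64];
    char line[512];
    int found = 0;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/maps", (long)pid);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;

    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char p[5];

        /* Each line begins with "start-end perms", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) != 3)
            continue;
        if (addr >= start && addr < end) {
            snprintf(perms, 5, "%s", p);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Calling find_mapping(child_pid, 0x804ba84, perms) while the child is still stopped would show whether that address is mapped at all and, from the third character of perms, whether it is executable (child_pid here stands for whatever pid P got back from fork()).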
I searched for information on restarting from a core file, but most people seem to suggest that you can't restart from it and should only use the core to examine faults etc. There is a Solaris utility called undump(), but that seems to restart from main() and thus only restores the values of global variables.
My basic question is this: given that the core file DOES contain the stack, WHAT prevents us from restarting the process from the exact point it was at when the core was dumped?
I apologize for any vagueness in the description above, but any help will be greatly appreciated.