Core file analysis

Posted on 1998-03-19
Last Modified: 2012-08-14
Back in the "old" days, I used to analyze core dumps by printing them out and looking at them.
Now on our Unix system we have processes that core dump and since they were built optimized,
there is no symbolic information so a debugger is not very happy.  At a minimum I like to be able
to extract the location that caused the fault from the core file.  To get these points, you have to
provide a means to get ANY information from a core file produced by an optimized process.  I will
not accept an answer that says build it with debug.  If there is truly no way to extract information
from the core file, then I will withdraw the question.
Question by: jlargent

Expert Comment

ID: 2009011
I find your question somewhat confusing, possibly because I
*think* you are confusing several bits of terminology.  Let
me explain the terms; if you could then explain precisely
which ones apply to your problem, then I think I can help you.

"build with debug" in a Unix environment usually means either
to include a C preprocessor flag like -DDEBUG which turns on
programmer-supplied code (e.g. calls to printf()) that emit
debugging information OR to use the -g flag, which causes the
compiler to embed all kinds of information that debuggers
find useful.

"building optimized" means to supply the -O flag to the
C compiler, which causes various kinds of optimizations
to occur.  Depending on the compiler, this may or may not
interact with the -g flag: there are compilers to which
the expression "-g -O" is meaningful and produces a result
which combines both.

"no symbolic information" is usually the result of having
that information stripped out of the compiled executable at
load time or by programs like "strip" afterwards.  An executable
compiled with/without debugging, or with/without optimization,
may be stripped of its symbolic information: it's an independent
operation that happens last.

So using these terms, could you describe just which ones
apply to your current situation?
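
To make the distinction between the three terms concrete, here is a minimal sketch in Python that only builds the command lines for a given variant. The file names prog.c and prog and the helper name build_steps are hypothetical, and as noted above, whether -g and -O can be combined depends on the compiler.

```python
def build_steps(source, output, debug=False, optimize=False, strip=False):
    """Return the command lines for one build variant, as argument lists.

    The compiler name "cc" and the file names are placeholders; nothing
    is executed here.
    """
    compile_cmd = ["cc"]
    if debug:
        compile_cmd.append("-g")  # embed information for symbolic debuggers
    if optimize:
        compile_cmd.append("-O")  # ask the compiler to optimize
    compile_cmd += ["-o", output, source]

    steps = [compile_cmd]
    if strip:
        # Stripping is independent of -g and -O and happens last.
        steps.append(["strip", output])
    return steps


if __name__ == "__main__":
    # An optimized build without -g that has not been stripped.
    for step in build_steps("prog.c", "prog", optimize=True):
        print(" ".join(step))
```

The point of the sketch is only that the debug flag, the optimize flag, and the strip step are three separate switches.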

Expert Comment

ID: 2009012
Even in optimized code there's symbolic information available for 'ld' the linker.
Maybe I'm a spoiled idiot, but when I use my debugger (running AIX 4.umpty)
it allows me to run the (faulty, but optimized) code and it is able to give me
a stacktrace of 'active' functions at the time the thing crashed ... I know it's
not giving me the _exact_ location, but nevertheless ...

kind regards,

Jos aka

ps. AFAIK gdb can do the same ...

Expert Comment

ID: 2009013
If you're using HP-UX, xdb does a great job if you haven't used the strip command. As far as I know, you need at least the symbolic information to retrieve any sensible information.

If you just have a core file, using the "freeware" program "coran" (Core Analyzer) can return some global information about the core file.

Author Comment

ID: 2009014
In response to zonker's comments:
When I say built with debug, I mean the -g flag.  When I say optimized, I mean the -O flag.  The executables have NOT been stripped, so saying there is no symbolic information was probably not completely accurate.  We are using DEC Alpha machines running DEC Unix 4.0.

Accepted Solution

zonker031798 earned 500 total points
ID: 2009015
Okay, now I think I understand your situation.  The absence
of the information that's included when the -g flag is used
at compile-time will handicap most symbolic debuggers, but
the use of -O won't.  (Although it probably does mean that
the generated code in the executable is a bit more convoluted,
since the compiler is prone to do that in its attempts to
optimize.)  The good news is that the binaries aren't stripped.

This leaves you with several options.

1. You should have a copy of adb, the original debugger,
on your system.  The good news is that it can cope with
almost anything (including stripped binaries); the bad news
is that it has a line-by-line interactive and somewhat
cryptic interface.  However, it is pretty powerful, and
there are a lot of scripts around for it.  (For example,
SunOS ships with a bunch of kernel debugging scripts
in /usr/kvm/adb.)

2. You can attack the problem with gdb, the GNU debugger,
which will run (albeit suboptimally) on your binaries.
It's available from; you
probably want the most recent version available, which
I believe is 4.16.

3. You could also have a shot at debugging it with ddd, which
is a newer, powerful, GUI-based debugger.  The home page
for it is at; I don't know
if it will compile under DEC Unix, because I haven't tried
that yet, but based on my experience with it under SunOS and
Solaris and a few other Unixes, I'd expect it to.

All of these debuggers should be able to provide you with some
basic information from the core dump, such as a stack backtrace,
that will enable you to figure out where and why the executable
died.  Beyond that, each one has different features that will
let you set breakpoints, modify variables/memory locations,
examine registers, start/stop execution, and so on.  If I were
in your shoes, I'd probably try gdb first, because adb is harder
to learn and ddd may be overkill for what you need.
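
If you want the backtrace without an interactive session, something like the following Python sketch may help. It assumes a gdb recent enough to accept the -batch and -ex options (a release as old as 4.16 may not accept -ex), and the names ./prog and core in the usage line are placeholders, not paths from this thread.

```python
import subprocess


def gdb_backtrace_command(executable, core, gdb="gdb"):
    """Build a gdb command line that prints a backtrace and exits.

    -batch makes gdb exit after processing its commands;
    -ex bt runs the "bt" (backtrace) command once the core is loaded.
    """
    return [gdb, "-batch", "-ex", "bt", executable, core]


def backtrace(executable, core, gdb="gdb"):
    """Run gdb on the executable and core file and return its output."""
    result = subprocess.run(
        gdb_backtrace_command(executable, core, gdb),
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Placeholder paths; substitute the real executable and core file.
    print(" ".join(gdb_backtrace_command("./prog", "core")))
```

Building the command separately from running it makes it easy to check what will be executed before pointing it at a real core file.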


Author Comment

ID: 2009016
Ok, I decided to try gdb, but I'm having problems and don't know if it is me or something
else.  I had our sys admin download the latest gdb and install it.  This is what I get:
 gdbnew /socc/Delivery/bin/OSF1V/PageController PageCore
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (alpha-dec-osf4.0), Copyright 1996 Free Software Foundation, Inc...

"/socc/Delivery/bin/OSF1V/PageController": could not open as an executable file: Invalid argument

/users/operator/PageCore: Invalid argument.

Turns out we have gdb on the system from a year or so ago.  When I try that version, I get
gdb /socc/Delivery/bin/OSF1V/PageController -c PageCore
GDB is free software and you are welcome to distribute copies of it
 under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (alpha-dec-osf3.2), Copyright 1996 Free Software Foundation, Inc...
Core was generated by `PageController'.
Program terminated with signal 11, Segmentation fault.
/usr/shlib/ No such file or directory.

warning: Hit beginning of text section without finding

warning: enclosing function for address 0x24a56ab0
This warning occurs if you are debugging a function without any symbols
(for example, in a stripped executable).  In that case, you may wish to
increase the size of the search with the `set heuristic-fence-post' command.

Otherwise, you told GDB there was a function where there isn't one, or
(more likely) you have encountered a bug in GDB.
#0  0x24a56ab0 in ?? () from /usr/shlib/

We are running DEC Unix 4.0, so the second version (built for osf3.2) probably bombs because of that.  But have
you seen the first version's problem (i.e., "Invalid argument") before?  Did my sys admin guy
screw up or is this another result of the "fine" DEC product?
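
Even a backtrace with no symbols, like the "#0" line above, still carries the faulting address. As a rough sketch, the frame lines can be picked out of saved gdb output with a regular expression. The helper name parse_frames and the pattern are my own assumptions about the usual frame-line layout, not anything gdb itself provides.

```python
import re

# Matches frame lines such as "#0  0x24a56ab0 in ?? () from ..." and
# "#1  main () at prog.c:3"; the address part is optional.
FRAME_RE = re.compile(
    r"^#(?P<level>\d+)\s+"
    r"(?:(?P<address>0x[0-9a-fA-F]+)\s+in\s+)?"
    r"(?P<function>\S+)\s*\("
)


def parse_frames(gdb_output):
    """Return a list of dicts, one per backtrace frame found in the text."""
    frames = []
    for line in gdb_output.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # warnings and banner lines are skipped
        address = match.group("address")
        frames.append({
            "level": int(match.group("level")),
            "address": int(address, 16) if address else None,
            "function": match.group("function"),
        })
    return frames


if __name__ == "__main__":
    sample = "#0  0x24a56ab0 in ?? () from /usr/shlib/"
    for frame in parse_frames(sample):
        print(frame["level"], hex(frame["address"]), frame["function"])
```

A function name of "??" means gdb could not map the address to a symbol, so the address itself is the only location information in that frame.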

Question has a verified solution.