Analysing a core dump with gdb

I have a program that dumps core. The core gives the following stack trace in gdb:

Program terminated with signal 11, Segmentation fault.

warning: current_sos: Can't read pathname for load map: Input/output error

Reading symbols from /binlib/vndr/sybase/lib/libct_r.so...done.
Loaded symbols for /binlib/vndr/sybase/lib/libct_r.so
Reading symbols from /binlib/vndr/sybase/lib/libcs_r.so...done.
Loaded symbols for /binlib/vndr/sybase/lib/libcs_r.so
Reading symbols from /binlib/vndr/sybase/lib/libcomn_r.so...done.
Loaded symbols for /binlib/vndr/sybase/lib/libcomn_r.so
Reading symbols from /binlib/vndr/sybase/lib/libintl_r.so...done.
Loaded symbols for /binlib/vndr/sybase/lib/libintl_r.so
Reading symbols from /binlib/vndr/sybase/lib/libsybtcl_r.so...done.
Loaded symbols for /binlib/vndr/sybase/lib/libsybtcl_r.so
Reading symbols from /lib/libdl.so.2...done.
Loaded symbols for /lib/libdl.so.2
Reading symbols from /lib/tls/libpthread.so.0...done.
Loaded symbols for /lib/tls/libpthread.so.0
Reading symbols from /usr/lib/libLiS.so...done.
Loaded symbols for /usr/lib/libLiS.so
Reading symbols from /usr/lib/libstdc++.so.5...done.
Loaded symbols for /usr/lib/libstdc++.so.5
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
#0  0x55623ce1 in comn_take_mutex ()
   from /binlib/vndr/sybase/lib/libcomn_r.so
(gdb) where
#0  0x55623ce1 in comn_take_mutex ()
   from /binlib/vndr/sybase/lib/libcomn_r.so
#1  0x556763b1 in build_masks ()
   from /binlib/vndr/sybase/lib/libsybtcl_r.so
#2  0x556770f6 in select_thread ()
   from /binlib/vndr/sybase/lib/libsybtcl_r.so
#3  0x556bc9dd in start_thread () from /lib/tls/libpthread.so.0
#4  0x55864ffa in clone () from /lib/tls/libc.so.6


It does not give me anything I can use.
I tried running the program under gdb, but then it never dumps core.

Another thing: I am not sure whether it is related ('select_thread' appears in both the dump and the error message), but I get a strange message when I run my program:
select_thread: pipe() failed
: No such file or directory

I have no idea where this comes from. It does not affect the execution of the program, but it started appearing recently, although there were no significant changes to the program. The program had been running for a pretty long time without problems.

Please let me know if anybody has any idea about this.

Thanks
Santosh

san_adi Asked:

jlevie Commented:
According to the stack backtrace, your program is crashing while manipulating threads. That probably means that the startup error is significant.

Was this program written some time ago? If so, has it been updated to comply with the Linux implementation of POSIX threads?
san_adi (Author) Commented:

The program is not that old. It was migrated to Linux about a year ago.

I just made minor changes (some changes to the structures being used) and rebuilt it.

How can I debug this, since the core does not give any information about where it is crashing?

The startup error might not be linked to the crash, but I would like to know what it is, if anybody knows about it.

Thanks
Santosh
san_adi (Author) Commented:

Sorry jlevie, I thought you meant that the startup error was not significant.

So I need to trace where that is coming from. I am not using pipes anywhere in my application, so something must be using them internally. From the error message (No such file or directory), it looks as if it is looking for some name somewhere on the system and not finding it.

Any ideas what that could be, and where?

Also, I do not get the error when I run the program on kernel 2.4.~, but I do get it when I run it on kernel 2.6.~ (SUSE).
I could not get it to crash on 2.4.~ either.


Thanks
Santosh  
jlevie Commented:
My suspicion is that the reference to a pipe is a red herring and that the error simply means something about your thread usage in the program is wrong. A stock 2.4 kernel would not have POSIX threads support in the kernel, but a 2.6 kernel would. If the application isn't coded strictly according to the man pages for POSIX threads, you could see a difference in behaviour between the two. I'd suggest examining the code carefully to make sure that it is compliant with the POSIX threads spec.
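
One well-known example of such a difference, offered only as an illustration and not as a diagnosis of this program: under the older LinuxThreads library each thread reported its own PID from getpid(), whereas under NPTL all threads of a process report the same PID. A minimal sketch of that check follows, in Python for brevity; the helper name pids_seen_by_threads is made up for this example.

```python
import os
import threading

def pids_seen_by_threads(n=4):
    """Start n threads and collect the PID each one sees via getpid()."""
    pids = []
    lock = threading.Lock()

    def worker():
        # Every thread asks the OS for "its" process ID.
        with lock:
            pids.append(os.getpid())

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pids

# With NPTL this set should contain exactly one PID.
print(set(pids_seen_by_threads()))
```

Code that assumes one PID per thread (for signals, for example) is the kind of thing that can behave differently after such a migration.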
san_adi (Author) Commented:

I found the system call that leads to this error.

strace shows that the program tries to do    open("/dev/fifo.0", O_RDWR|O_NONBLOCK [wait(0x57f) = 20900

and then it tries to get information from it (old_getrlimit(RLIMIT_NOFILE,    ), which is where it fails and gives me the error.

On kernel 2.4.~ I can see the /dev/fifo.0 file, but it is not there on 2.6.~.

What is this file, and why would an application try to open it?
If it is not supported on kernel 2.6.~, what is the replacement?

Thanks
Santosh
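
The 'No such file or directory' text is the standard message for errno ENOENT, which is what open() reports for a path that does not exist. A minimal sketch follows, assuming the open() of /dev/fifo.0 is the call that fails (the strace line above is truncated, so this is a guess). It uses a made-up path in a temporary directory rather than /dev, and Python only for brevity; open_missing_fifo is a hypothetical helper name.

```python
import errno
import os
import tempfile

def open_missing_fifo():
    """Open a nonexistent path with the flags seen in the strace output."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for /dev/fifo.0; nothing is created here.
        path = os.path.join(tmp, "fifo.0")
        try:
            fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
        except OSError as exc:
            # Report the symbolic errno name and its message text.
            return errno.errorcode[exc.errno], os.strerror(exc.errno)
        os.close(fd)
        return None

print(open_missing_fifo())
```

On Linux this reports ENOENT with the same text as the second line of the startup message, which would be consistent with the library printing a stale errno after the missing device.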
jlevie Commented:
As far as I know, /dev/fifo.0 would not be a standard device for a 2.4 kernel. The very name of the device implies that it is a Linux FIFO, and that is something that would be created for an application to open (see 'man mkfifo').
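
A minimal sketch of what creating and opening such a FIFO looks like, assuming Linux (where opening a FIFO with O_RDWR succeeds without a separate reader; POSIX leaves that case undefined). The path is a made-up one in a temporary directory, not /dev/fifo.0, and fifo_round_trip is a hypothetical helper name. Python is used only for brevity; os.mkfifo wraps the mkfifo call described in the man page.

```python
import os
import stat
import tempfile

def fifo_round_trip(payload=b"ping"):
    """Create a FIFO, open it read/write non-blocking, and echo a payload."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fifo.0")  # hypothetical name
        os.mkfifo(path, 0o600)              # create the named pipe
        is_fifo = stat.S_ISFIFO(os.stat(path).st_mode)
        # Same flags as in the strace output from the question.
        fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
        try:
            os.write(fd, payload)
            data = os.read(fd, len(payload))
        finally:
            os.close(fd)
        return is_fifo, data

print(fifo_round_trip())
```

If the library really expects /dev/fifo.0 to exist, something (an installer or a startup script) would have had to create it this way on the 2.4 system.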
Question has a verified solution.