Really vague question about a Linux core

Hi All

I am on Red Hat Linux Advanced Server. My program receives and responds to multiple clients using sockets. The program has been built to handle concurrent connection requests.
(The program is huge, about 500K lines of code, and works on Linux, HP, Solaris and Windows.)


Now,

As long as only one client is sending requests (consider it load testing with one user), the program works fine. As soon as the load is increased to two simultaneous clients, the program crashes!

(Now you'll say: so? Solve it.)

Of course I am trying to solve it, but:

1. The backtrace of the core file is not consistent. I have at least 20 different backtraces.
2. If the code is built in debug mode, the crash does not happen.
3. All the backtraces have at least one function in common, say funca(). As soon as I start adding printfs to get a better idea of the crash location, the crash does not happen.

So the question is: how do I solve this issue?

I am on Linux Advanced Server 2.1. My code has been compiled using gcc 2.96.

Any kind of help/pointers would be appreciated.

Thanks
-Ajay

avi_india Asked:
CetusMOD Commented:
PAQed, with points refunded (500)

CetusMOD
Community Support Moderator
 
sunnycoder Commented:
The information you have provided so far is really too little to go on, but this is most likely a problem with signals or other IPC. Alternatively, you could have several bugs instead of just one.

Are you doing some sort of signalling? Try installing a signal handler for all catchable signals and printing the signal number and originator. You will have to use the sigaction interface, not signal, for this.
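A minimal sketch of such a handler, assuming a POSIX system. The names report_signal, install_reporters and last_signal are made up for illustration, and the list of signals is only a sample (SIGKILL and SIGSTOP can never be caught). The SA_SIGINFO flag is what makes the sender's pid available through si_pid, which is why sigaction is needed instead of signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Last signal number seen by the handler (hypothetical name). */
volatile sig_atomic_t last_signal = 0;

/* write(2) is async-signal-safe, unlike printf. */
static void emit(const char *text, size_t len)
{
    ssize_t rc = write(STDERR_FILENO, text, len);
    (void)rc;
}

/* Print a non-negative number without using stdio. */
static void emit_number(long value)
{
    char buf[24];
    size_t pos = sizeof buf;

    do {
        buf[--pos] = (char)('0' + value % 10);
        value /= 10;
    } while (value > 0 && pos > 0);
    emit(buf + pos, sizeof buf - pos);
}

/* Handler installed with SA_SIGINFO so that siginfo_t is filled in. */
static void report_signal(int signo, siginfo_t *info, void *context)
{
    (void)context;
    last_signal = signo;
    emit("caught signal ", 14);
    emit_number(signo);
    /* si_pid is meaningful for signals sent by a process (kill, raise,
       sigqueue); for faults such as SIGSEGV it is not. */
    emit(" from pid ", 10);
    emit_number((long)info->si_pid);
    emit("\n", 1);
}

/* Install the reporter for a sample set of signals; returns 0 on success. */
int install_reporters(void)
{
    static const int signals[] = {
        SIGHUP, SIGINT, SIGPIPE, SIGTERM, SIGUSR1, SIGUSR2
    };
    struct sigaction sa;
    size_t i;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = report_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);

    for (i = 0; i < sizeof signals / sizeof signals[0]; i++) {
        if (sigaction(signals[i], &sa, NULL) != 0)
            return -1;
    }
    return 0;
}
```

Calling install_reporters() early in main() and then watching stderr while two clients connect would show whether an unexpected signal arrives before the crash.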
 
brettmjohnson Commented:
In my experience, you are likely experiencing one of these three problems:

1) You have a multithreaded program with unprotected (non-threadsafe)
access to one or more shared data items.  This could even be the result
of using non-threadsafe library calls (like strtok).  Adding printf statements
injects console I/O into the flow of control.  The I/O triggers a thread context
switch, possibly rearranging the context switches such that you avoid concurrent
access.


2) You are corrupting the stack. Probably in funca() or one of the functions it
calls.  Again, adding printf() statements changes how the stack is utilized.


3) There is a compiler optimization problem.  If running in "debug mode" means
you compiled with -g -O0, rebuild with -g and your normal optimization level.
This will give you symbols into the offending code (although line numbers might
not match up).  Then see if the debugger traps.
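To illustrate point 1: strtok keeps its parsing position in hidden static state, so two threads tokenizing at the same time can trample each other. A minimal sketch of the reentrant alternative, strtok_r from POSIX, where the position lives in a caller-owned pointer. The function name split_fields and the space delimiter are assumptions for illustration, not anything from the original program.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <string.h>

/*
 * Split `line` in place on spaces and store up to `max_fields` token
 * pointers in `fields`. Returns the number of tokens stored.
 * The parsing state lives in the local `save`, so concurrent calls on
 * different buffers do not interfere with each other.
 */
size_t split_fields(char *line, char **fields, size_t max_fields)
{
    char *save = NULL;
    size_t count = 0;
    char *tok = strtok_r(line, " ", &save);

    while (tok != NULL && count < max_fields) {
        fields[count++] = tok;
        tok = strtok_r(NULL, " ", &save);
    }
    return count;
}
```

Searching the code reachable from funca() for strtok, localtime, gmtime and similar calls with hidden static state would be one cheap way to check this theory.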


 
Karl Heinz Kremer Commented:
Take a look at Valgrind (http://valgrind.kde.org/) to resolve any stack (and in general memory) related problems. The advantage of Valgrind is that you don't have to modify your code, so you can run it on the app that crashes.

Also, Helgrind (part of the Valgrind suite of tools) allows you to find potential race conditions in multithreaded apps.
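As a rough picture of what Helgrind looks for, assuming the program uses POSIX threads (the question does not say so): two threads updating the same variable with no lock held. In the sketch below, all names (struct shared, worker, run_two_workers) are hypothetical. Called with use_lock set to 0, the increment is an unprotected access of the kind Helgrind reports when the binary is run as `valgrind --tool=helgrind ./program`. Called with use_lock set to 1, the mutex serializes the updates.

```c
#include <pthread.h>
#include <stddef.h>

#define ITERATIONS 100000L

/* State shared by both worker threads. */
struct shared {
    long counter;
    pthread_mutex_t lock;
    int use_lock;   /* 0 = racy increments, 1 = mutex-protected */
};

static void *worker(void *arg)
{
    struct shared *s = arg;
    long i;

    for (i = 0; i < ITERATIONS; i++) {
        if (s->use_lock)
            pthread_mutex_lock(&s->lock);
        s->counter++;   /* read-modify-write on shared data */
        if (s->use_lock)
            pthread_mutex_unlock(&s->lock);
    }
    return NULL;
}

/* Run two workers and return the final counter, or -1 on failure. */
long run_two_workers(int use_lock)
{
    struct shared s;
    pthread_t a, b;

    s.counter = 0;
    s.use_lock = use_lock;
    pthread_mutex_init(&s.lock, NULL);

    if (pthread_create(&a, NULL, worker, &s) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, &s) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&s.lock);
    return s.counter;
}
```

With the lock, the result is always 2 * ITERATIONS. Without it, the result can vary from run to run, which matches the "works with one client, fails with two" pattern described in the question. Build with -pthread.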
 
ahoffmann Commented:
> If code is in debug mode, crash does not happen.
In 99.9xx% of cases this is a problem with allocated memory: a pointer problem, a memory leak, whatever ..
I'd go with Valgrind, mpatrol, Electric Fence ...
  http://www.cbmamiga.demon.co.uk/mpatrol/
  http://perens.com/FreeeSoftware
  http://valgrind.kde.org/
 
avi_india Author Commented:
Thanks guys. I'm trying to use Valgrind. I will update the status as soon as I find something.

(The only problem I can foresee is that my code also uses SmartHeap. Valgrind might clash with it.)
 
avi_india Author Commented:
Nothing helped here. The bug has been marked as deferred to the next release of the product.