Solved

C++ console dataloading application hangs randomly

Posted on 2007-08-09
Last Modified: 2013-11-25
I've created several data-loading applications in C++. They read data from a file or from a database and then load the data into a SQL Server 2000 database using ADO. They are running at several different sites. At one site, several of the programs hang randomly on the Windows 2003 server they run on. They will run fine all day (they are called every minute or so by a cron-style scheduling program), then in the afternoon or at some other time they'll hang. You can see the process in Task Manager, but it isn't doing anything. The job scheduler detects that it is still running and doesn't call it again, thus preventing further data from loading.

How can I determine the cause of the problem or determine where/why it is hanging?  Is there a way to set up a utility or something that will capture the state of the process when it hangs?  How can one approach debugging a hanging program in general?

Question by:sloney4141
13 Comments
 
LVL 86

Assisted Solution

by:jkr
jkr earned 668 total points
ID: 19663570
One thing you can do to tackle the issue is to launch your app under Dependency Walker's (www.dependencywalker.com) profile mode. It has a number of command-line options, so check those out. This will allow you to at least find out where it hangs.

As an interim solution, you could add a watchdog thread that terminates your app when it has not issued a heartbeat signal within a predefined period of time, which would at least allow you to restart it and continue.
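
The watchdog idea could be sketched roughly as follows. This is only an illustration under assumptions: it uses standard C++11 threads rather than the Win32 thread API, and the `Watchdog` class, its `heartbeat()` method and the 10 ms polling interval are hypothetical names and values, not part of the original application.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <utility>

// Hypothetical helper: runs on_timeout once if heartbeat() has not been
// called for longer than `timeout`.
class Watchdog {
public:
    using Clock = std::chrono::steady_clock;

    Watchdog(std::chrono::milliseconds timeout, std::function<void()> on_timeout)
        : timeout_(timeout),
          on_timeout_(std::move(on_timeout)),
          last_beat_(Clock::now().time_since_epoch().count()),
          stop_(false),
          worker_([this] { run(); }) {}

    ~Watchdog() {
        stop_ = true;      // ask the watchdog thread to finish
        worker_.join();
    }

    // The main work loop calls this whenever it makes progress.
    void heartbeat() { last_beat_ = Clock::now().time_since_epoch().count(); }

private:
    void run() {
        while (!stop_) {
            std::this_thread::sleep_for(std::chrono::milliseconds(10));
            Clock::duration silent(Clock::now().time_since_epoch().count() -
                                   last_beat_.load());
            if (silent > timeout_) {
                on_timeout_();  // e.g. log the event and end the process
                return;
            }
        }
    }

    std::chrono::milliseconds timeout_;
    std::function<void()> on_timeout_;
    std::atomic<Clock::rep> last_beat_;
    std::atomic<bool> stop_;
    std::thread worker_;  // declared last so it starts after the other members
};
```

In a real loader, the callback would probably log the event and then end the process (for example with `std::abort()`, or `TerminateProcess` on Windows) so that the scheduler can start a fresh run. Note that this only works around the hang; it does not explain it.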
0
 

Author Comment

by:sloney4141
ID: 19664464
Thanks jkr,

I will look into dependencywalker further.  There was talk of linking these processes into a monitoring tool that we have already and restart the apps if they hang.  In either case I would like to know why they're hanging so I can prevent such issues in the future.
0
 
LVL 14

Assisted Solution

by:wayside
wayside earned 664 total points
ID: 19670626
Install the Sysinternals Process Explorer (http://www.microsoft.com/technet/sysinternals/ProcessesAndThreads/ProcessExplorer.mspx) and when the process hangs, examine the properties of the process. The Threads tab will give you some idea of where the process is hung; you will probably need to generate a map file to look up the addresses to see what function you are in. You can also look at all the handles that are open; you might get a clue there.

The next best thing is to add logging to your program so you can trace its execution. By examining the log file you can see where it is getting stuck.
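
A minimal sketch of such trace logging, assuming a hypothetical helper `TraceLine` and a log path chosen by the caller (neither is from the original program). Each call opens the file, appends one timestamped line and closes it again, so the C runtime buffer is flushed and the last line in the file marks the last point the program reached before it hung.

```cpp
#include <cassert>
#include <cstdio>
#include <ctime>
#include <string>

// Hypothetical helper: append one timestamped line to the log file.
// Returns false if the file could not be opened.
bool TraceLine(const std::string& path, const std::string& message) {
    std::FILE* f = std::fopen(path.c_str(), "a");
    if (!f) return false;

    char stamp[32] = "";
    std::time_t now = std::time(nullptr);
    if (std::tm* t = std::localtime(&now)) {
        std::strftime(stamp, sizeof(stamp), "%Y-%m-%d %H:%M:%S", t);
    }

    std::fprintf(f, "%s %s\n", stamp, message.c_str());
    std::fclose(f);  // closing flushes the buffered line out of the process
    return true;
}
```

Calls such as `TraceLine(logPath, "before GetConfig")` and `TraceLine(logPath, "after GetConfig")` around each major step would narrow down where execution stops. Opening and closing the file on every call is slow, but for a loader that runs once a minute that is probably acceptable.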

Without seeing any code it is difficult, but if I had to speculate I would guess that a database call is hanging (perhaps the database server is unavailable or swamped at these times?) or that a database error isn't being handled properly and you wind up hung.
0

 

Author Comment

by:sloney4141
ID: 19671300
Thanks for the suggestion wayside, I do have Process Explorer on that machine but I didn't realize it would provide me with any good info on hanging processes so I'll try that.  Recently I have placed some logging into the program and I believe it's hanging when it tries to exit.  If this is the case then it may be hanging within the OS code that cleans up the process.  If that is the case I'm really going to be stumped on how to resolve this.

Here is the main function definition:

  VOID main( int argc, char * argv[ ], char * envp[ ] )
  {

I then have the following:

  try
  {
      if ( !GetConfig( sSite ) )
      {
           printf( "GetConfig( %s ) failed.\r\n", sSite );
           return;
      }


0
 

Author Comment

by:sloney4141
ID: 19671356
Thanks for the suggestions guys, I know it can't be fun trying to help someone with such a difficult to diagnose problem without being hands-on.  

wayside:
I do have Process Explorer on that machine but I didn't realize it would provide me with any good info on hanging processes so I'll try that.  Recently I have placed some logging into the program and I believe it's hanging when it tries to exit.  If this is the case then it may be hanging within the OS code that cleans up the process.  If that is the case I'm really going to be stumped on how to resolve this.  Anyway, I'm not sure it will help but here is some of the code:

The main function definition:

  VOID main( int argc, char * argv[ ], char * envp[ ] )
  {

I then have some initialization and then the following:

  try
  {
      if ( !GetConfig( sSite ) )
      {
           printf( "GetConfig( %s ) failed.\r\n", sSite );
           return;
      }      
      
      if ( !LoadMapFile( sMapFilePath ) )
      {
           printf( "LoadMapFile(%s) failed.\r\n", sMapFilePath );
           return;
      }      

Then several other calls and then at the end:

}  //end of try

catch (...)
{
      printf("An Unknown Exception occurred.  Please contact support.\r\n");
}
if ( pLine )
      free( pLine );
if ( pLineTemp )
      free( pLineTemp );
return;
}

If I create a map file and use a utility to get a dump of the process, how would I work through the dump? Are there tutorials or something on this? My assembler is rather rusty. Thanks again for the help.
0
 

Author Comment

by:sloney4141
ID: 19671378
Ug, sorry for the double post =(, ignore the first one.
0
 
LVL 86

Expert Comment

by:jkr
ID: 19671427
Using a map file works as follows: Consider e.g.

#include <windows.h>
#include <stddef.h>
#include <stdio.h>

LONG
WINAPI
ExceptionHandler(LPEXCEPTION_POINTERS pe) {

    char acModule[MAX_PATH];

    MEMORY_BASIC_INFORMATION mbi;
    HMODULE hMod;

    VirtualQuery (pe->ExceptionRecord->ExceptionAddress,&mbi,sizeof(mbi));

    ptrdiff_t RVA = (char*)pe->ExceptionRecord->ExceptionAddress - (char*)mbi.AllocationBase;

    hMod = (HMODULE) mbi.AllocationBase;

    GetModuleFileName(hMod,acModule,sizeof (acModule));

    printf( "Detected Exception in %s at RVA 0x%08X\n", acModule, (unsigned int)RVA);

    return EXCEPTION_EXECUTE_HANDLER;
}

void FaultingFunction () {

    LONG* p = NULL;
    *p = 42;
}

void main(){

    SetUnhandledExceptionFilter (ExceptionHandler);
    FaultingFunction ();
}

(compiled with "cl rvaxcept.cpp /link /map")

which prints

Detected Exception in C:\tmp\cc\rvaxcept.exe at RVA 0x00001088

The map file is

rvaxcept

Timestamp is 44660958 (Sat May 13 18:29:12 2006)

Preferred load address is 00400000

Start         Length     Name                   Class
0001:00000000 00004938H .text                   CODE
[...]

 Address         Publics by Value              Rva+Base     Lib:Object

0001:00000000       _ExceptionHandler@4        00401000 f   rvaxcept.obj
0001:0000007a       _FaultingFunction          0040107a f   rvaxcept.obj
0001:00000092       _main                      00401092 f   rvaxcept.obj
0001:000000a7       _printf                    004010a7 f   LIBC:printf.obj
0001:000000d8       _mainCRTStartup            004010d8 f   LIBC:crt0.obj
0001:000001b7       __amsg_exit                004011b7 f   LIBC:crt0.obj
[...]

You can see the 'Rva+Base' column; the base is given by the line 'Preferred load address is 00400000'.

Add that to the faulting module RVA of 0x00001088 and you get 0x00401088. Then look that up in the list above (i.e. find the 'nearest symbol') and you can see that it is 'FaultingFunction':

0001:0000007a       _FaultingFunction          0040107a f   rvaxcept.obj
0001:00000092       _main                      00401092 f   rvaxcept.obj

The address 0x00401088 lies between FaultingFunction, which starts at 0x0040107a, and main, the next function, which starts at 0x00401092. The fault is therefore inside FaultingFunction.
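
The same lookup can be sketched in code. This is an illustration only: `MapSymbol`, `NearestSymbol` and `ExampleSymbols` are hypothetical names, and the addresses are copied by hand from the 'Rva+Base' column of the map file excerpt above rather than parsed from the file.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct MapSymbol {
    std::uint32_t address;  // value from the 'Rva+Base' column
    std::string name;
};

// Returns the name of the last symbol whose address is <= target,
// or an empty string if target lies below every symbol.
std::string NearestSymbol(std::vector<MapSymbol> symbols, std::uint32_t target) {
    std::sort(symbols.begin(), symbols.end(),
              [](const MapSymbol& a, const MapSymbol& b) {
                  return a.address < b.address;
              });
    // First symbol that starts strictly after the target address.
    auto next = std::upper_bound(
        symbols.begin(), symbols.end(), target,
        [](std::uint32_t value, const MapSymbol& s) { return value < s.address; });
    if (next == symbols.begin()) return std::string();
    return (next - 1)->name;
}

// Symbols taken from the map file excerpt in this thread.
std::vector<MapSymbol> ExampleSymbols() {
    return {
        {0x00401000, "_ExceptionHandler@4"},
        {0x0040107a, "_FaultingFunction"},
        {0x00401092, "_main"},
        {0x004010a7, "_printf"},
    };
}
```

Calling `NearestSymbol(ExampleSymbols(), 0x00400000 + 0x00001088)` performs the addition and lookup described above and yields `_FaultingFunction`.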
0
 

Author Comment

by:sloney4141
ID: 19709066
Ok, I'm working on trying to get support to install the program running under the dependency walker.  I'm going to put some debugging stuff in as well.  I'm also going to try to find a tutorial or something on how to get a memory dump and use it to determine what's going on in a process.  
0
 
LVL 86

Expert Comment

by:jkr
ID: 19718942
0
 
LVL 86

Expert Comment

by:jkr
ID: 19718950
0
 

Author Comment

by:sloney4141
ID: 19719924
Ok I'll check those out as well.  Thanks jkr
0
 
LVL 49

Accepted Solution

by:DanRollins
DanRollins earned 668 total points
ID: 19766084
This is crude, but it seems possible to use the VC++ debugger:  When the app has hung, you can bring up VC++ IDE, use

  Build / Start Debugging / Attach to process

Then click the Pause button (Debug/Break).  The call stack may contain some useful info.  Also the Debug/Modules list might provide a clue or two.  You could also try stepping through the code in hopes of getting to something you recognize.

If you run a Debug build when you try this, you'll get a lot more useful info.

-- Dan

0
 
LVL 86

Expert Comment

by:jkr
ID: 23214799
I'd like to object to a deletion, since several solutions were given to address issues like this.
0
