What tool should I use to analyze a crash on Win8?

Hi,

I have an old-style MFC C++ application (created around 2003) that has been upgraded: a few DLLs compile and build in VS 2005, and the majority is compiled and built in VS 2012.

This application works fine on Win7 but crashes on Win8. I am trying to track this down, but I do not want to install VS 2012 on the Win8 machine because it adds extra libraries that might change the symptoms.

I used to use a tool called "Dr Watson" (yeah, I know-old school :-) ).

Is there some newer equivalent for Win8?

Thanks,

Chris Schene
Christopher Schene (System Engineer/Software Engineer) asked:

Qlemo ("Batchelor", Developer and EE Topic Advisor) commented:
Since this is your own application, my first step would be to install WinDbg. It is a lightweight Microsoft debugger, often used to analyze kernel crashes (BSODs). It does not install additional DLLs, and I use it to debug my C++ MFC applications at customer sites. The only prerequisites for seeing your own code are to have the source available remotely and to have built the executable with a PDB to keep some debugging info.
John (Business Consultant, Owner) commented:
In addition to the above, look at Action Center, Reliability Monitor. It usually tracks crashes and "stopped working" errors. Look at the Monitor after a crash and let us know what the error is.
Christopher Schene (System Engineer/Software Engineer), author, commented:
I downloaded a large number of tools in the package from Microsoft.

Is there a way to force a dump rather than a crash, and then write the state of the application into a file (essentially intercept the crash and dump instead)?

There is a dump analyzer in the software I downloaded.

I had another "mystery crash" with this application and I tracked it down to an uninitialized variable. It turned out that for some reason the uninitialized variable was always being set to a value of "0" when I was running on XP or Win7 but not on Win8. The bug had been there for 12 years and had never been identified until we ran on Win8.

There seem to be some differences in the memory structure in win8 or maybe it is vs2012.
Qlemo ("Batchelor", Developer and EE Topic Advisor) commented:
You mean something like SysInternals ProcDump at https://technet.microsoft.com/de-de/sysinternals/dd996900, which lets you capture a "live" dump?
evilrix (Senior Software Engineer, Avast) commented:
Why not just remote debug with Visual Studio? All the better if you have access to VMware or VirtualBox and so can reproduce this on your dev machine in a VM.

https://msdn.microsoft.com/en-us/library/y7f5zaaa.aspx
gheist commented:
NirSoft BlueScreenView is quickest.
Christopher Schene (System Engineer/Software Engineer), author, commented:
"Why not just remote debug with Visual Studio?" I was looking through the documentation and it says this can only be done with managed code, which I assume means .NET. This application is native C++ code.
Qlemo ("Batchelor", Developer and EE Topic Advisor) commented:
I've performed VS Remote Debugging, but not with VS 2012 yet. But it should still be available for unmanaged code.
Christopher Schene (System Engineer/Software Engineer), author, commented:
This may take me till the next weekend to figure out so if you don't see any updates for a few days I am not abandoning the question.

I want to capture the dump when it crashes. It looks like WinDbg is the easiest and I have used that before to debug browser code. I may need some help getting it working.
evilrix (Senior Software Engineer, Avast) commented:
>> it says this can only be done with managed code
I've used it to debug unmanaged code before. You just choose the remote debugger from the menu. You just need the remote monitoring agent installed and a copy of Visual Studio Pro (you can't do it with Express or Standard).

https://msdn.microsoft.com/en-us/library/8x6by8d2.aspx
evilrix (Senior Software Engineer, Avast) commented:
>> It looks like WinDbg is the easiest
You're joking, right? Don't get me wrong, I love and use WinDbg a lot, but one adjective I'd never use to describe it is easy! :)
Qlemo ("Batchelor", Developer and EE Topic Advisor) commented:
"In the land of the blind, the one-eyed man is king." If you have to choose between nothing and installing WinDbg, the latter is "easy" :D. Besides - how do you use Remote Debugging via TeamViewer? :p
evilrix (Senior Software Engineer, Avast) commented:
As I said before, the best way to deal with this is to try to reproduce it in a dev environment via a virtual machine.
sarabande commented:
as the sources are yours you could add exception handling and logging to narrow down the statement which causes the crash:

- switch on 'enable C++ Exceptions with SEH Exceptions (/EHa)' for all projects and configurations
  in older projects you may add /EHa directly to the command line if you don't find the option in the settings
  exception handling with SEH allows you to catch exceptions like 'access violation', which most probably is the case here
- add a try-catch block to all top-level functions like main, YourApp::InitInstance, major dll functions, ...
- use catch(...) and add a log message to the catch block.
   you may rethrow the exception after that since the program is not able to recover from an access violation (in almost all cases).
- log messages should open a logfile in append mode and close the file after each log.
- add log messages to the beginning of a function and to exit paths and catch blocks.
- if the program doesn't crash at start but after some use, you should make the logging switchable on and off
  you could do that in the gui by implementing a short-cut like ctrl-shift-l and a class like
class EXP_IMP_SPECIFIER Logging
{
      // private constructor makes it a singleton
      Logging() : m_bLogOnOff(false) {}
      bool m_bLogOnOff;
      static Logging & GetInst() { static Logging theLogging; return theLogging; }
public:
      static bool IsLogOn() {   return GetInst().m_bLogOnOff; }
      static void LogOnOff(bool logOnOff) {  GetInst().m_bLogOnOff = logOnOff; }
};

you would put this into a header file of a low-level dll which can be called by all functions of your application and define EXP_IMP_SPECIFIER as '__declspec(dllexport)' for the dll project and as '__declspec(dllimport)' for all other projects.

of course you can also add the log function to the class.
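
A minimal sketch of what adding the log function to the class could look like, assuming a plain text log file. The `Log` and `SetPath` members and the `app_trace.log` default are hypothetical names that are not part of the class above, and the export specifier is left out to keep the example self-contained. Each call opens the file in append mode, writes one line, and closes it again, so the last line written before a crash is not left behind in a user-space buffer.

```cpp
#include <cassert>
#include <fstream>
#include <string>

class Logging
{
    // private constructor makes it a singleton
    Logging() : m_bLogOnOff(false), m_path("app_trace.log") {}
    bool m_bLogOnOff;
    std::string m_path;   // hypothetical default, change it via SetPath
    static Logging& GetInst() { static Logging theLogging; return theLogging; }
public:
    static bool IsLogOn() { return GetInst().m_bLogOnOff; }
    static void LogOnOff(bool logOnOff) { GetInst().m_bLogOnOff = logOnOff; }
    static void SetPath(const std::string& path) { GetInst().m_path = path; }

    // Appends one line to the log file and closes the file again.
    // Returns false if logging is off or the file could not be written.
    static bool Log(const std::string& msg)
    {
        if (!IsLogOn())
            return false;
        assert(!GetInst().m_path.empty());
        std::ofstream out(GetInst().m_path.c_str(), std::ios::app);
        if (!out)
            return false;
        out << msg << '\n';
        out.flush();
        return out.good();
    }   // the ofstream destructor closes the file here
};
```

A call site would then look like `Logging::Log("enter InitInstance");`. Opening and closing the file for every message is slow, which is one more reason to keep the on/off switch.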

in your dialog or view class handle the short-cut (for example in PreTranslateMessage member function) and call
Logging::LogOnOff(!Logging::IsLogOn());

to toggle logging.

after a crash you would check the log file to see which function was entered but not left. then you can add more log statements to narrow it down.
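
The "entered and not left" check can be sketched as follows. `ScopeTrace` and `FindUnfinished` are hypothetical helpers, and the "enter name" / "leave name" line format is an assumption (function names must not contain spaces). The sketch writes to any `std::ostream`, so it can be tried with a string stream before it is pointed at a real log file.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes "enter <name>" on construction and "leave <name>" on destruction,
// so every exit path of a function is covered by one local variable.
class ScopeTrace
{
    std::ostream& m_out;
    std::string m_name;
    ScopeTrace(const ScopeTrace&);             // non-copyable
    ScopeTrace& operator=(const ScopeTrace&);
public:
    ScopeTrace(std::ostream& out, const std::string& name)
        : m_out(out), m_name(name)
    {
        assert(!m_name.empty());
        m_out << "enter " << m_name << std::endl;   // endl flushes the stream
    }
    ~ScopeTrace() { m_out << "leave " << m_name << std::endl; }
};

// Reads a trace and returns the functions that were entered but never left,
// outermost first. The last element is the best candidate for the crash site.
std::vector<std::string> FindUnfinished(std::istream& log)
{
    std::vector<std::string> open;
    std::string word, name;
    while (log >> word >> name)
    {
        if (word == "enter")
            open.push_back(name);
        else if (word == "leave" && !open.empty() && open.back() == name)
            open.pop_back();
    }
    return open;
}
```

In a function body this would be a single line such as `ScopeTrace trace(logStream, "ShowSplash");`, where `logStream` stands for whatever stream the application logs to.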

note, though the above is some work, it would help you with all issues that cannot be reproduced in debug mode or on all platforms.
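
The try-catch advice for top-level functions can be sketched like this. `GuardedInit`, `DoStartupWork` and `LogLine` are hypothetical names, and a thrown C++ exception stands in for the real fault. With MSVC, catch(...) only sees structured exceptions such as an access violation when the project is compiled with /EHa, so that part is an assumption about the build settings, not something this portable sketch demonstrates.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Appends one line to the log file and closes it again.
static void LogLine(const std::string& path, const std::string& msg)
{
    std::ofstream out(path.c_str(), std::ios::app);
    out << msg << '\n';
}

// Stand-in for the real startup code that may fail.
static void DoStartupWork(bool fail)
{
    if (fail)
        throw std::runtime_error("simulated failure");
}

// Pattern for a top-level function such as main or YourApp::InitInstance.
bool GuardedInit(const std::string& logPath, bool fail)
{
    LogLine(logPath, "enter GuardedInit");
    try
    {
        DoStartupWork(fail);
    }
    catch (...)
    {
        LogLine(logPath, "exception in GuardedInit");
        throw;   // rethrow: the program usually cannot recover from this
    }
    LogLine(logPath, "leave GuardedInit");
    return true;
}
```

The log line is written before the rethrow on purpose, so the information is in the file even if the process terminates immediately afterwards.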

Sara

Christopher Schene (System Engineer/Software Engineer), author, commented:
I just want to assure you experts I did not abandon this question. I only work on this project part-time, so I mostly work on weekends.

I am trying this link now: https://msdn.microsoft.com/en-us/library/8x6by8d2.aspx

To attempt remote debugging, I am using MS VS 2012 Professional.
Christopher Schene (System Engineer/Software Engineer), author, commented:
My program is crashing ONLY on win8 and it seems to always be in ntdll.dll
Qlemo ("Batchelor", Developer and EE Topic Advisor) commented:
If you perform remote debugging via VS, you usually do that with the debug build, and have all debugging libs with their symbol info available. Ergo you should see the exact function call the crash happens in. Though, it might be necessary to use the Microsoft Symbol Server to get symbol info for W8 on your dev machine.
Christopher Schene (System Engineer/Software Engineer), author, commented:
"If you perform remote debugging via VS, you usually do that with the debug build"   - Unfortunately the symptoms are different when I use the release vs the debug build.
Christopher Schene (System Engineer/Software Engineer), author, commented:
I actually tracked down a crash using the debugger to this line of code. The critical section referenced is null (0), but this code is in a system file named afxmt.inl.

Since my code does not use a critical section explicitly, this must be something that Windows code is creating or expecting to be created.

_AFXMT_INLINE BOOL (::CCriticalSection::Lock())
{      
      ::EnterCriticalSection(&m_sect);

      return TRUE;
}

and this is called from wincore.cpp line below

if (g_RenderTargets.Lookup(this, pRenderTarget))
      {
            ASSERT_VALID(pRenderTarget);
            g_RenderTargets.RemoveKey(this);
            delete pRenderTarget;
      }
Christopher Schene (System Engineer/Software Engineer), author, commented:
I do believe this is a memory corruption issue.  It is so nice that I don't have to worry about the memory corruption issues as much in java or .net.
sarabande commented:
the EnterCriticalSection only crashes if the member CCriticalSection::m_sect is invalid, not initialized, or already deleted.

in the debugger you should look at the stack and analyze the m_sect. the stack will show you whether there is one of your own functions involved before which - for example - makes a call with a NULL pointer. looking at the m_sect you also might see whether it looks like corrupt data or not.

>> I do believe this is a memory corruption issue.
evaluate all the functions on the stack before the crash and look at the arguments and 'this' pointers of class member functions. it is very likely that you encounter a null pointer or a pointer with a value between 10 and 1000, which is an invalid pointer value caused by using uninitialized variables or by writing beyond memory boundaries.
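
A small sketch of why such small pointer values show up and how they could be flagged in a log or an assertion. `Parent`, `Child`, `ChildOffset` and `LooksLikeBadPointer` are hypothetical, and the 0x10000 threshold is an assumption based on Windows keeping the lowest 64 KB of the user address space unmapped. Taking the address of a member through a null parent pointer yields the member's offset, which is exactly the kind of small number described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical structures, only for illustration.
struct Child  { int id; };
struct Parent { char name[64]; Child child; };

// The address a debugger shows for 'child' when the Parent pointer is null
// is simply the member's offset inside Parent.
inline std::size_t ChildOffset() { return offsetof(Parent, child); }

// Heuristic: null and very small addresses cannot be valid object addresses
// in a normal user-mode process. The threshold is an assumption.
inline bool LooksLikeBadPointer(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) < 0x10000;
}
```

A check like `if (LooksLikeBadPointer(pWnd)) { /* log and bail out */ }` before a call only catches this one class of bad pointer. A pointer into freed or overwritten memory passes the test.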

Sara
Christopher Schene (System Engineer/Software Engineer), author, commented:
Thanks Sara. There seems to be more than one issue, and I was able to track one of the issues down with the debugger to a single area of code that is quite large and that I still must analyze. The particular code in question displays a splash screen, and I have not determined why it is related to the crash, but not calling it eliminates my critical section crash. The splash screen is called very early in the program life cycle, and I assume it must be corrupting something early that is not symptomatic until I see the critical section crash.

The other crash is more elusive:  The crash does not happen in the debug version of the code but only in the release version.

As a way of checking the "this" pointers I was thinking of adding an integer as a member of each class and simply doing a check  in each method to validate the this pointer.  Or, is there perhaps a built in way to do this in VS 2012? I think this would be less of a problem in .net code.
gheist commented:
Yes, it is a memory corruption issue. If the crash were in VMware components, the means you used to trace it would point the finger in the right direction.
sarabande commented:
>> The crash does not happen in the debug version of the code but only in the release version.
that's normal. the debug build fills variables and heap allocations with known patterns while the release build does not. moreover, the debug heap adds extra space around allocations, which leads to a much different allocation pattern of the heap manager. in case of memory corruption this often is crucial.
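
One way to make "happened to be zero" bugs reproducible outside the debug build is to poison memory before an object is constructed in it. This is only a sketch: `SplashSettings` and `ConstructPoisoned` are hypothetical names, and 0xCD is chosen because the MSVC debug heap uses that value for fresh allocations. A member that the constructor forgets to set will then typically show up as 0xCDCDCDCD instead of a convenient zero.

```cpp
#include <cassert>
#include <cstring>
#include <new>

// Hypothetical settings struct. Every member gets an explicit initial value,
// so its state does not depend on what the memory held before.
struct SplashSettings
{
    int  timeoutMs;
    bool showVersion;
    SplashSettings() : timeoutMs(0), showVersion(false) {}
};

// Fills the caller-provided storage with a non-zero pattern and then
// constructs a T in it. The storage must be large enough and suitably
// aligned for T, and the caller must run the destructor explicitly.
template <typename T>
T* ConstructPoisoned(void* storage)
{
    assert(storage != 0);
    std::memset(storage, 0xCD, sizeof(T));
    return new (storage) T;
}
```

Because `SplashSettings` initializes both members in its constructor, it comes out of `ConstructPoisoned` with the same values no matter what the pattern was. A class that relies on the memory being zero would not.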

>> As a way of checking the "this" pointers I was thinking of adding an integer as a member of each class and simply doing a check  in each method to validate the this pointer.

if 'this' is null, the wrong call already has happened. you better check each pointer before calling or try to avoid pointers wherever possible. also switching-on SEH exceptions and adding a try-catch block to each of your functions will help to locate the wrong function. if 'this' has a small integer value this also mostly is due to null pointers of a parent structure where then member pointers were corrupted with non-pointer values.
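
For reference, the "integer member" idea could look like the sketch below. `SplashWindow`, the magic values and `SimulateOverrun` are hypothetical and only there to make the example testable. The check detects an object whose guard values were overwritten, for example by a write past the end of `m_title`. It does not help when 'this' itself is null or garbage, because reading the guard already dereferences the bad pointer.

```cpp
#include <cassert>
#include <cstdint>

class SplashWindow
{
    static const std::uint32_t kAlive = 0x53504C53u;   // arbitrary magic value
    static const std::uint32_t kDead  = 0xDEADDEADu;
    std::uint32_t m_head;      // guard in front of the payload
    char          m_title[32];
    std::uint32_t m_tail;      // guard behind the payload
public:
    SplashWindow() : m_head(kAlive), m_tail(kAlive) { m_title[0] = '\0'; }
    ~SplashWindow() { m_head = m_tail = kDead; }

    // True while both guard values are untouched.
    bool IsIntact() const { return m_head == kAlive && m_tail == kAlive; }

    // Test hook: mimics the effect of a write past the end of m_title.
    void SimulateOverrun() { m_tail = 0; }
};
```

Each member function would start with something like `assert(IsIntact());` or a log call. The values written by the destructor are a best effort only, since an optimizing compiler may drop stores to an object that is about to go away.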

note, the critical section call is not the cause but the consequence of a memory corruption. in a multi-threaded environment critical sections are necessary to make some code exclusive. when the resource handle used for the critical section was corrupted this call fails with a crash.

Sara
Christopher Schene (System Engineer/Software Engineer), author, commented:
You are correct about the try-catch blocks. this is very old code and those are not used in all places. I am going to add them.

Thanks,

Chris
Christopher Schene (System Engineer/Software Engineer), author, commented:
I am going to "bite the bullet" and just put try-catch blocks around the code in this application.

This is old code and most of it does not use try-catch.

Sigh.

Thanks for the help all.

I ended up using the debugger installed directly on the Win8 VM.