Trapping a Delphi app hang

brenlex asked
Medium Priority, 724 Views, Last Modified: 2013-12-04
I have a multithreaded Winform App written in Delphi 5, which randomly (under load) is hanging.

I want to narrow down my search to the last procedure called (whether on the main form or in one of the threads) before the hang, while running through the IDE debugger, but have been unable to do so: the app hangs, but nothing is reported in the IDE. The IDE then reports its own error, requiring a restart.

I have tried EurekaLog too, with no success. Ultimately I have to kill the app using Task Manager, with nothing reported.

Help. Any advice?

Geert G, Oracle dba
CERTIFIED EXPERT
Top Expert 2009

Commented:
what does the app do ?
how many threads ?
interaction between threads ?
data exchange between threads ? locked access to the data (critical section) ?

it's a bit needle in haystack

or add logging to what happens
developmentguru, President

Commented:
I have approached this problem a number of different ways.

One way is with logging. Since you are dealing with multiple threads, you would need to be sure either to log the time (processor cycle count would be a good way) or to make sure the logging is done in a way that is serialized (such as creating a separate service for the logging). You then put a log entry on the entry and exit of each and every method on the form and in the thread(s). This can be a time-consuming exercise. If you are able to look at your log and see that the last method was entered but never left, then you know where to look.

Another approach is one I used when a random access violation would occur on exit of the program. I called it "binary deconstruction". You remove a good-sized portion of the forms in the application (about half), compile (making sure to remove references as needed), and run. If you remove half the forms but leave the form that is "causing" the issue, and it no longer happens, then you know that it was an issue between that form and another. If you remove half the forms and the issue continues, then you have removed those forms from consideration as to what is causing the problem. If you can get it down to the last form, then you continue to remove half the code until the error no longer occurs. This process makes the application point out to you where the problem is, but it is also time consuming.

  From personal experience it sounds like you are stuck in a loop that is eating up memory until you get an out of memory error.  You should start to look at the code that was changed before the error started.  Hopefully this only started happening recently.

  Experiment a bit and see what more you can learn.  Let me know if you need more.
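
As an illustration of the serialized entry/exit logging described above, here is a minimal sketch, not a definitive implementation. The unit name ThreadLog, the procedure LogLine and the file name apptrace.log are all hypothetical. It uses GetTickCount as a simpler stand-in for a processor cycle count, and it opens and closes the file on every call so that each entry reaches disk even if the process is later killed from Task Manager.

```delphi
unit ThreadLog; // hypothetical unit name

interface

// Appends one line to the log file and closes the file again, so the
// entry is on disk before the caller continues.
procedure LogLine(const Msg: string);

implementation

uses
  Windows, SysUtils;

const
  LogFileName = 'apptrace.log'; // hypothetical file name

var
  LogLock: TRTLCriticalSection;

procedure LogLine(const Msg: string);
var
  F: TextFile;
begin
  // Only one thread at a time may write, which serializes the entries.
  EnterCriticalSection(LogLock);
  try
    AssignFile(F, LogFileName);
    if FileExists(LogFileName) then
      Append(F)
    else
      Rewrite(F);
    try
      // Tick count and thread ID show when, and on which thread,
      // each entry was written.
      WriteLn(F, Format('%d [%d] %s',
        [GetTickCount, GetCurrentThreadId, Msg]));
    finally
      CloseFile(F);
    end;
  finally
    LeaveCriticalSection(LogLock);
  end;
end;

initialization
  InitializeCriticalSection(LogLock);

finalization
  DeleteCriticalSection(LogLock);

end.
```

A method would then call LogLine('Enter TMyThread.Execute') on its first line and LogLine('Leave TMyThread.Execute') in a finally block (TMyThread being a placeholder name). Opening the file on every call is slow, but for this kind of hunt durability matters more than speed.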
Geert G, Oracle dba

Commented:
One way of adding logging is to use a profiler like ProDelphi:
http://www.torry.net/pages.php?id=1525

You don't need to do much, just install it. It puts start and end code in every proc/func
and even gives the time spent in each.
Needle in a haystack was my thought exactly... I don't have a solution either, but maybe a few useful tricks.

If by "requiring a restart" you mean you need to restart the computer, try killing Delphi32.exe using Task Manager instead of killing your application. This will inherently kill your application as well, and you probably will not need to restart the computer.

What is the error message you get from Delphi? I assume you have looked at the call stack (Ctrl+Alt+S) at this point. If one of the threads is using a lot of CPU, you can probably catch what it is doing by breaking the program using F2 from the IDE. The debugger will pause the program immediately and display the CPU window (which for the most part is not very useful). From here, close the CPU window and press F7; program execution will resume and the debugger will break again on the next line of Pascal/Delphi code that is executed. By randomly doing this you can get a good idea of where the program is spending a lot of time. You can of course also use a profiler, but I usually try this first because most profilers will mess around with the code more than I like.

Author

Commented:
The problem I have with tools like ProDelphi and EurekaLog is that they do not dump their findings to file unless the app is closed down in a controlled fashion, which I cannot achieve as the app is locking up.

When attempting to debug from the IDE and the hang occurs, I get inconsistent behaviour from the IDE with regard to being able to debug (sometimes an access violation in comctl32.dll, sometimes non-responsive, requiring an IDE restart).

I decided to sidestep this inconsistency and opted to log to file on each processing thread, and have found something very strange. If I have "optimisation" enabled in my compiler options, then the app truly just hangs (all threads stop): game over. However, with "optimisation" disabled, my threads continue to process in the background (polling db tables for the queue), but the app gives the appearance that it has hung (the form is unresponsive), though ONLY for a period of time. In my main processing thread, I write to a log file on each iteration of my Execute method to let me know it is still going. On occasion, however, it stops writing altogether for a random amount of time (sometimes up to 5 minutes), and then suddenly comes alive again!

Is there something I should be aware of with regard to my threads when optimisation is ON?

How can a thread stop processing for a random amount of time when it is never implicitly suspended?
developmentguru, President

Commented:
In your logging check your memory availability.  If Windows believes it is running low on memory it will pause to swap memory content out to disk, freeing up memory.  This process is not only slow, but the juggling of memory to disk afterwards can also be very time consuming.  Memory may not be the issue but it is a good resource to target first.  Have you been able to determine which method it stops in during these lapses?

An important thing to keep in mind is this: You are seeing multiple issues.  For now I would continue to run the program without optimization until the other problems have been addressed.  Only when the rest of it is running like you believe it should would I go back and try to re-enable optimization and see what the effects are.
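
The memory check suggested above could be added to each log entry along these lines. This is only a sketch: the unit name MemLog and the function MemoryStatusLine are hypothetical, while GlobalMemoryStatus and TMemoryStatus are the Win32 declarations from the Windows unit.

```delphi
unit MemLog; // hypothetical unit name

interface

// Returns a one-line summary of current memory availability,
// suitable for appending to each heartbeat log entry.
function MemoryStatusLine: string;

implementation

uses
  Windows, SysUtils;

function MemoryStatusLine: string;
var
  Status: TMemoryStatus;
begin
  // The API expects the structure size to be filled in first.
  Status.dwLength := SizeOf(Status);
  GlobalMemoryStatus(Status);
  Result := Format(
    'load=%d%% availPhys=%dKB availPageFile=%dKB availVirtual=%dKB',
    [Status.dwMemoryLoad,
     Status.dwAvailPhys div 1024,
     Status.dwAvailPageFile div 1024,
     Status.dwAvailVirtual div 1024]);
end;

end.
```

If the available physical memory or page file figures fall steadily in the entries leading up to a pause, that would support the swapping explanation; if they stay flat, memory is probably not the cause.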

Author

Commented:
Correction: the background threads only continue to write to their dedicated log files when I run the app from the IDE debugger. The main processing thread still appears to pause for an indeterminate period of time before jumping back into life. If I run the app as a standalone exe, all threads stop reporting to files and the app has officially hung.

Unfortunately there is just too much code to post, but I believe I have found the culprit, though I am not sure WHY it is occurring...

Essentially any of my threads can call a globally declared MyDBService object. This unit contains methods to pull specific info back from the MySQL DB. ALL of MyDBService's methods contain critical sections (see code snippet).

fMyTable represents an instance of a class which encapsulates access to a particular table, and is created/destroyed in the TMyDBService constructor/destructor respectively. When I put an exception handler into the MyTable class's GetValByMyCode, I discovered that I randomly experience "Access violation at address 10011F04 in module 'libmySQL.DLL'. Write of address FFFFFF04". From there on in, the app starts to disintegrate.

Methods within the aforementioned global instance of MyClass are hit from all threads, and valid values are always passed in (prmCode).

Are there other rules I should be aware of when implementing critical sections?

function TMyDBService.ValFromMyCode(prmCode: string): string;
begin
  EnterCriticalSection(fLockSection);
  try
    Result := fMyTable.GetValByMyCode(prmCode);
  finally
    LeaveCriticalSection(fLockSection);
  end; {try..finally}
end;
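
On the question of other rules for critical sections, the sketch below shows the general lifecycle that the snippet above depends on. It is an illustration under assumptions, not the author's actual class: the unit name is hypothetical and the table lookup is replaced by a placeholder. The critical section is initialized once before any thread can call in, every method that touches the shared object enters the same section, and the section is deleted only after all threads have finished. Whether libmySQL needs anything further per thread is not established in this thread.

```delphi
unit MyDBServiceSketch; // hypothetical unit name

interface

uses
  Windows;

type
  TMyDBService = class
  private
    fLockSection: TRTLCriticalSection;
  public
    constructor Create;
    destructor Destroy; override;
    function ValFromMyCode(const prmCode: string): string;
  end;

implementation

constructor TMyDBService.Create;
begin
  inherited Create;
  // Must run before any worker thread is started.
  InitializeCriticalSection(fLockSection);
end;

destructor TMyDBService.Destroy;
begin
  // Only safe once every thread that uses this object has terminated.
  DeleteCriticalSection(fLockSection);
  inherited Destroy;
end;

function TMyDBService.ValFromMyCode(const prmCode: string): string;
begin
  EnterCriticalSection(fLockSection);
  try
    // Placeholder for the real lookup (fMyTable.GetValByMyCode).
    Result := 'value for ' + prmCode;
  finally
    LeaveCriticalSection(fLockSection);
  end;
end;

end.
```

The lock only protects code paths that actually enter it, so any other route to the same table object or database connection that bypasses fLockSection would still race with these methods.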

developmentguru, President
Commented:
[Comment text not available.]
Geert G, Oracle dba

Commented:
do your threads all use this database object at the same time ?
and is this prmCode always protected against writes too ?

Author

Commented:
developmentguru -- yes, the Synchronize method is used in all instances of display update.

Geert -- yes, the threads will call methods contained in this global db object at the same time, but probably not the same individual method. I thought the use of a critical section was appropriate in itself, unless there is something else I need to consider? How do I protect prmCode? I thought each call to [ValFromMyCode] would create its own instance of prmCode on the stack anyway... no?
Geert G, Oracle dba
Commented:
[Comment text not available.]

Author

Commented:
I have put this down to a fault in third-party components.