

Trapping delphi app hang

Posted on 2009-07-06
Medium Priority
Last Modified: 2013-12-04
I have a multithreaded forms application written in Delphi 5 that hangs randomly under load.

I want to narrow down my search to the last procedure called (whether on the main form or in one of the threads) before the hang, while running through the IDE debugger, but have been unable to do so -- the app hangs, but nothing is reported in the IDE. The IDE then reports its own error, requiring a restart.

I have tried EurekaLog too, with no success.  Ultimately I have to kill the app using Task Manager, with nothing reported.

Help. Any advice?
Question by:brenlex
LVL 38

Expert Comment

by:Geert Gruwez
ID: 24785020
what does the app do ?
how many threads ?
interaction between threads ?
data exchange between threads ? locked access to the data (critical section) ?

it's a bit of a needle in a haystack

or add logging to see what happens
LVL 21

Expert Comment

ID: 24785079
 I have approached this problem a number of different ways.

  One way is with logging.  Since you are dealing with multiple threads, you would need to either log the time (processor cycle count would be a good way) or make sure the logging is serialized (for example by creating a separate service for the logging).  You then put a log entry on the entry and exit of each and every method on the form and in the thread(s).  This can be a time-consuming exercise.  If you look at your log and see that the last method was entered but never left, then you know where to look.
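The entry/exit logging described above could be sketched roughly as follows. This is a minimal sketch, not a tested Delphi 5 unit: the unit name `TraceLog`, the procedure `TraceLine`, and the default file name `trace.log` are all made up for illustration. Writers are serialized with a critical section, and the file is closed after every line so that entries are handed to the operating system even if the process is later killed from Task Manager.

```pascal
unit TraceLog;

interface

var
  TraceFileName: string;  // hypothetical default is set below; change before first use

procedure TraceLine(const Msg: string);

implementation

uses
  Windows, SysUtils, SyncObjs;

var
  TraceLock: TCriticalSection;

procedure TraceLine(const Msg: string);
var
  F: TextFile;
begin
  TraceLock.Enter;  // only one thread writes at a time
  try
    AssignFile(F, TraceFileName);
    if FileExists(TraceFileName) then Append(F) else Rewrite(F);
    try
      // wall-clock time, millisecond tick count and calling thread id on every line
      WriteLn(F, Format('%s tick=%d tid=%d %s',
        [TimeToStr(Now), GetTickCount, GetCurrentThreadId, Msg]));
    finally
      CloseFile(F);  // close per line so nothing is left in a buffer if the app hangs
    end;
  finally
    TraceLock.Leave;
  end;
end;

initialization
  TraceFileName := 'trace.log';
  TraceLock := TCriticalSection.Create;

finalization
  TraceLock.Free;

end.
```

Each method would then call `TraceLine('SomeMethod enter')` as its first statement and `TraceLine('SomeMethod exit')` in a `finally` block. An "enter" line without a matching "exit" for the same thread id points at the method that never returned. Reopening the file for every line is slow, which is the price paid here for not losing entries.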

  I used another approach when a random access violation occurred on exit of the program.  I called it "binary deconstruction".  You remove a good-sized portion of the forms in the application (about half), compile (making sure to remove references as needed), and run.  If you remove half the forms but leave the form that is "causing" the issue and it no longer happens, then you know it was an issue between that form and another.  If you remove half the forms and the issue continues, then you have eliminated those forms from consideration.  If you can get it down to the last form, then you continue to remove half the code until the error no longer occurs.  This process makes the application point out to you where the problem is, but it is also time-consuming.

  From personal experience it sounds like you are stuck in a loop that is eating up memory until you get an out of memory error.  You should start to look at the code that was changed before the error started.  Hopefully this only started happening recently.

  Experiment a bit and see what more you can learn.  Let me know if you need more.
LVL 38

Expert Comment

by:Geert Gruwez
ID: 24785169
One way of adding logging is to use a profiler like ProDelphi.

You don't need to do much, just install it; it puts start and end code in every proc/func
and even gives the time spent in each.

Expert Comment

ID: 24785319
Needle in a haystack was my thought exactly... I don't have a solution either, but maybe a few useful tricks.

If by "requiring a restart" you mean you need to restart the computer, try killing Delphi32.exe using Task Manager instead of killing your application. This will kill your application as well, and you probably will not need to restart the computer.

What is the error message you get from Delphi? I assume you have looked at the stack trace (Ctrl+Alt+S) at this point. If one of the threads is using a lot of CPU you can probably catch what it is doing by breaking the program using F2 from the IDE. The debugger will pause the program immediately and display the CPU window (which for the most part is not very useful). From here, close the CPU window and press F7; program execution will resume and the debugger will break again on the next line of Pascal/Delphi code that is executed. By doing this at random moments you can get a good idea of where the program is spending a lot of time. You can of course also use a profiler, but I usually try this first because most profilers will mess around with the code more than I like.

Author Comment

ID: 24794623
The problem I have with tools like Prodelphi and EurekaLog, is that they do not dump their findings to file unless the app is closed down in a controlled fashion, which I cannot achieve as the app is locking up.

When attempting to debug from the IDE and the hang occurs, I get inconsistent behaviour from the IDE with regard to being able to debug (sometimes an access violation in comctl32.dll, sometimes the IDE is non-responsive and needs a restart).

I decided to sidestep this inconsistency and opted to log to file on each processing thread -- and have found something very strange... if I have "optimisation" enabled in my compiler options, then the app truly just hangs (all threads stop) ... game over.  However, with "optimisation" disabled, my threads continue to process in the background (polling db tables for the queue), except that the app gives the appearance that it has hung (form is unresponsive), but ONLY for a period of time.  In my main processing thread, I write to a log file on each iteration of my Execute method to let me know it is still going -- on occasion, however, it stops writing altogether for a random amount of time (sometimes up to 5 mins), and then suddenly comes alive again !?!

Is there something I should be aware of with regard to my threads when optimisation is ON?

How can a thread stop processing for a random amount of time when they are never implicitly suspended?
LVL 21

Expert Comment

ID: 24795714
In your logging check your memory availability.  If Windows believes it is running low on memory it will pause to swap memory content out to disk, freeing up memory.  This process is not only slow, but the juggling of memory to disk afterwards can also be very time consuming.  Memory may not be the issue but it is a good resource to target first.  Have you been able to determine which method it stops in during these lapses?
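Checking memory availability from the log could look roughly like this. It is a sketch under assumptions: the unit and function names (`MemSnapshot`, `MemorySnapshot`) are invented here, and it relies on the Win32 `GlobalMemoryStatus` call and the `TMemoryStatus` record from the `Windows` unit. On machines with more than 4 GB of memory this older API reports capped values, so treat the numbers as a trend indicator only.

```pascal
unit MemSnapshot;

interface

// Returns a one-line summary suitable for appending to each log entry.
function MemorySnapshot: string;

implementation

uses
  Windows, SysUtils;

function MemorySnapshot: string;
var
  Status: TMemoryStatus;
begin
  Status.dwLength := SizeOf(Status);
  GlobalMemoryStatus(Status);  // fills in system-wide memory figures
  Result := Format('load=%d%% availPhysKB=%d availPageFileKB=%d',
    [Status.dwMemoryLoad, Status.dwAvailPhys div 1024,
     Status.dwAvailPageFile div 1024]);
end;

end.
```

Writing this string on every iteration of the worker's Execute loop would show whether available physical memory collapses just before one of the pauses.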

An important thing to keep in mind is this: You are seeing multiple issues.  For now I would continue to run the program without optimization until the other problems have been addressed.  Only when the rest of it is running like you believe it should would I go back and try to re-enable optimization and see what the effects are.

Accepted Solution

JonasMalmsten earned 600 total points
ID: 24796261
To my knowledge optimization should not affect threads other than that they may execute slightly faster, which I suppose can change things a bit when multithreading and there is a problem with concurrency or synchronization. Another thing optimization does is remove variables that are never used so for example if there is a bug writing data to memory where it shouldn't write data, optimization may change things (think working with pointers or arrays where there is no bounds checking).

I think what you said about the form being unresponsive "but ONLY for a period of time" could be a key. This means your main thread is busy (the main thread is easier to debug). By main thread I mean the one thread that is processing the Windows message queue for your main form. What method are you using to synchronize your threads, I mean to make sure no two threads are writing to the same logfile at the same time, etc.? If there is only one thread writing to the logfile, how do you fetch the information from the worker threads in order to write the logfile, and how do you make sure a worker thread is not updating that same information while you are retrieving it?

When you use, for example, the Synchronize method, the calling thread is implicitly suspended while the main thread executes the method you want to synchronize. If a call to Synchronize takes a long time then your main form will appear to freeze. If, for example, two threads call Synchronize at the same time and in the synchronized method you are waiting for the other thread to do something, you will have a deadlock (your main form can be frozen for a very long time).
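To illustrate the point about Synchronize, here is a minimal sketch of a worker that keeps its synchronized method short. The class `TQueueWorker`, its fields, and the use of a `TLabel` as the display target are assumptions made for the example; the poster's actual thread classes are not shown in this thread.

```pascal
unit WorkerSketch;

interface

uses
  Classes, StdCtrls;

type
  TQueueWorker = class(TThread)
  private
    FStatus: string;   // prepared by the worker, read only inside ShowStatus
    FTarget: TLabel;   // VCL control, touched only from the main thread
    procedure ShowStatus;
  protected
    procedure Execute; override;
  public
    constructor Create(ATarget: TLabel);
  end;

implementation

uses
  Windows, SysUtils;

constructor TQueueWorker.Create(ATarget: TLabel);
begin
  inherited Create(True);  // created suspended; the caller starts it with Resume
  FTarget := ATarget;
end;

procedure TQueueWorker.ShowStatus;
begin
  // Runs in the main thread. Keep it short: no database calls and no
  // waiting on other threads or locks in here.
  FTarget.Caption := FStatus;
end;

procedure TQueueWorker.Execute;
var
  Iteration: Integer;
begin
  Iteration := 0;
  while not Terminated do
  begin
    Inc(Iteration);
    // ... slow work such as polling the queue table goes here ...
    FStatus := 'iteration ' + IntToStr(Iteration);
    Synchronize(ShowStatus);  // worker blocks until the main thread has run ShowStatus
    Sleep(500);
  end;
end;

end.
```

The design choice is that everything slow happens in Execute, and the synchronized method only copies an already prepared value into the control. If ShowStatus itself queried the database or waited on another worker, the main form would freeze for as long as that took.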

You are using threads to pull data from tables (I assume from some database). Which components are you using to access the tables, and are they thread safe? For example, the component library I use for accessing Oracle databases has a boolean property called "ThreadSafe" that needs to be set to True (the default is False). The property is on the TOracleSession component (corresponding to TDatabase), which is usually shared by several threads. This may vary depending on which set of components you are using.

If it is not a huge amount of code, it would be easier to help you if we had a few snippets to look at, especially any snippets dealing with synchronization between threads.

Author Comment

ID: 24815394
Correction -- the background threads only continue to write to their dedicated log files when I run the app from the IDE debugger.  The main processing thread still appears to pause for an indeterminate period of time before jumping back into life.  If I run the app as a standalone exe, all threads stop reporting to files and the app has officially hung.

Unfortunately there is just too much code to post, but I believe I have found the culprit, though I am not sure WHY it is occurring...

Essentially, any of my threads can call a globally declared MyDBService object.  This unit contains methods to pull specific info back from the MySQL DB.  ALL of MyDBService's methods contain critical sections (see code snippet).

fMyTable represents an instance of a class which encapsulates access to a particular table, and is created/destroyed in the TMyDBService constructor/destructor respectively.  After putting an exception handler into the MyTable class's GetValByMyCode, I discovered that I randomly get "Access violation at address 10011F04 in module 'libmySQL.DLL'. Write of address FFFFFF04" -- from there on in, the app starts to disintegrate.

Methods within the aforementioned global instance of MyClass are hit from all threads, and valid values are always passed in (prmCode).

Are there other rules I should be aware of when implementing critical sections?

function TMyDBService.ValFromMyCode(prmCode: string): string;
    Result := fMyTable.GetValByMyCode(prmCode);
  end; {try..finally}
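The snippet above appears to have lost its opening lines. A complete version might look like the sketch below, under stated assumptions: the lock field name `fLock` is hypothetical, and `TMyTable` is a placeholder that returns a dummy string instead of querying MySQL. The point it illustrates is that a critical section only serializes callers when every method that touches the same table or connection enters the same `TCriticalSection` instance; a separate lock per method would still let two threads into the client library at once.

```pascal
unit MyDBServiceSketch;

interface

uses
  SyncObjs;

type
  // Placeholder for the poster's table wrapper; the real class is not shown.
  TMyTable = class
  public
    function GetValByMyCode(const prmCode: string): string;
  end;

  TMyDBService = class
  private
    fLock: TCriticalSection;  // hypothetical name: ONE lock shared by all methods
    fMyTable: TMyTable;
  public
    constructor Create;
    destructor Destroy; override;
    function ValFromMyCode(prmCode: string): string;
  end;

implementation

function TMyTable.GetValByMyCode(const prmCode: string): string;
begin
  Result := 'value-for-' + prmCode;  // dummy result instead of a real query
end;

constructor TMyDBService.Create;
begin
  inherited Create;
  fLock := TCriticalSection.Create;
  fMyTable := TMyTable.Create;
end;

destructor TMyDBService.Destroy;
begin
  fMyTable.Free;
  fLock.Free;
  inherited Destroy;
end;

function TMyDBService.ValFromMyCode(prmCode: string): string;
begin
  fLock.Enter;  // every other method of the service must use this same lock
  try
    Result := fMyTable.GetValByMyCode(prmCode);
  finally
    fLock.Leave;
  end; {try..finally}
end;

end.
```

Whether the underlying MySQL components also need per-thread setup is not something this thread confirms; the MySQL C client documentation describes `mysql_thread_init` for threaded clients, so that may be worth checking with the component vendor.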

LVL 21

Assisted Solution

developmentguru earned 600 total points
ID: 24895117
I was just checking on the status of the question and had another thought.  You should include a timestamp on each line of your log.  This could show you the spot where it is taking all this time.  If you want to do this approach, be prepared to wait a very long time.

Another thought regards the use of the synchronize method of TThread.  When your threads update the display are they always using a Synchronize(MyUpdateMethod) approach?  The behavior you are seeing may indicate a thread trying to update the display without running through Synchronize.
LVL 38

Expert Comment

by:Geert Gruwez
ID: 24897702
Do your threads all use this database object at the same time?
And is this prmCode always protected against being written to as well?

Author Comment

ID: 24949954
developmentguru -- yes, the Synchronize method is used in all instances of display update.

Geert -- yes, the threads will call methods contained in this global db object at the same time, but probably not the same individual method.  I thought the use of a critical section was appropriate in itself, unless there is something else I need to consider?  How do I protect prmCode? I thought each call to [ValFromMyCode] would create its own instance of prmCode on the stack.
LVL 38

Assisted Solution

by:Geert Gruwez
Geert Gruwez earned 300 total points
ID: 25010979
Yes, that's right, that doesn't seem to be the problem.
Hmmm, I can't really come up with anything more.
We had this too; it was a SendMessage sent at closure.
Sometimes it didn't work, and it took a while to find, too.

Author Closing Comment

ID: 31600119
Put down to fault in third party components.
