  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 222

Discussion: Coding and debugging Multithreaded code.

All,

I'm putting together a tool right now that uses a multithreaded architecture. It's actually in C++ but I'm asking it here because I don't think that is relevant.

As this is probably the first time I actually have an input on the overall design and architecture of a multithreaded app I thought I would canvass some experts! :-)

I am looking more for architectural design and general methods rather than specific techniques for specific problems.

The areas I want to cover are:

1. Design.

What do you avoid or encourage in the overall design? If one thread reads from a structure and one writes to it, how do you ensure that they can both work together in harmony without losing the benefits of multithreading, especially with complex structures such as lists and trees?

2. Debugging.

There are loads of both obvious and weird race conditions that can sneak up and grab you. Many can take hours of work to track down, from the obvious log file sharing issues to the complex 'walking a tree while it's being updated'. How do YOU go about tracking down the actual problem and avoiding the old 'well it must be somewhere in here so I'll just close the door on the whole structure with a mutex'?

It's a bit weird to hit a breakpoint and see the other threads carry on running! How does one freeze the whole system at a breakpoint so you can investigate the state and history of all threads at the exact moment of the problem?

What logging methods do you use to ensure that traceback is possible without losing the bug by changing the thread timings?

3. Testing.

As most threading issues arise through rare occurrences, what techniques are good for exposing the issues early in the development cycle?

Paul
Asked by: PaulCaswell
5 Solutions
 
cwwkie Commented:
> If one thread reads from a structure and one writes to it, how do you ensure that
> they can both work together in harmony without losing the benefits of multithreading

Independent of the kind of synchronisation, I think you definitely need a single function to access shared data (or two, one for reading, one for writing). It is easier to debug, and if there is something wrong with the performance, there is only a single spot to improve.
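A minimal sketch of the "one function for reading, one for writing" idea in C++17. The `SharedTable` class and its members are hypothetical names chosen for illustration, and `std::shared_mutex` is only one possible choice of synchronisation.

```cpp
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <string>

// Hypothetical shared table: the map is private, so the only way in is
// through read() and write(), which do the locking themselves.
class SharedTable {
public:
    // Readers take a shared lock, so several can look things up at once.
    std::optional<int> read(const std::string& key) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;  // returned by value, never by reference
    }

    // Writers take an exclusive lock, shutting out readers and other writers.
    void write(const std::string& key, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        data_[key] = value;
    }

private:
    mutable std::shared_mutex mutex_;
    std::map<std::string, int> data_;
};
```

Because every access passes through these two functions, a breakpoint or a timing probe placed in them sees all traffic to the shared data.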
 
grg99 Commented:
Just write clear code, with one exit point per function, where you can bottleneck all the locks and unlocks.
Maybe run a thread info window, that displays some interesting info about each thread.
 
Kent Olsen (Data Warehouse Architect / DBA) Commented:
Hi Paul,

When I write multitasking applications, I solve a LOT of headaches by writing them one at a time, and writing them so that nearly all of the data accesses are to local variables.  This might sound silly, but it's so easy to get caught up in the "I need a variable to keep track of this count so I'll just put it in the globals block" mentality.  The next thing you know, you're updating "Count" from several threads, with disastrous results.

And I've never been tasked with writing a multi-threaded application in C.  (I know several people that have, but not me.)  I'll move on to C++ for the project and let the controls and containment of the C++ wrappers do the heavy lifting.  My own rule is that every thread object is defined as a class.  This way the variables ARE local.  If I want to write to other threads I have to explicitly reference a public variable or one that considers this task a "friend".

Something that might help in C coding is to write every thread as a separate source file.  This way, liberal use of the STATIC modifier lets you create thread variables much like the C++ class.
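The "every thread is a class" rule could be sketched like this in C++11. `CountingWorker` and its members are hypothetical; the point is only that the working state is private to the object and anything visible to other threads goes through an explicit accessor.

```cpp
#include <atomic>
#include <thread>

// Hypothetical worker: its working state is private, so no other thread
// can touch it by accident.
class CountingWorker {
public:
    explicit CountingWorker(int passes) : passes_(passes) {}
    ~CountingWorker() { join(); }

    CountingWorker(const CountingWorker&) = delete;
    CountingWorker& operator=(const CountingWorker&) = delete;

    // Sketch only: start() is meant to be called once per object.
    void start() { thread_ = std::thread([this] { run(); }); }
    void join() { if (thread_.joinable()) thread_.join(); }

    // The one deliberate access point for other threads.
    int completed() const { return completed_.load(); }

private:
    void run() {
        for (int i = 0; i < passes_; ++i) {
            running_total_ += i;      // private: only this thread touches it
            completed_.fetch_add(1);  // shared on purpose, hence atomic
        }
    }

    const int passes_;
    long running_total_ = 0;
    std::atomic<int> completed_{0};
    std::thread thread_;
};
```

A caller would create a worker, call `start()`, and later `join()`; in between, `completed()` is the only thing it can observe.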

Debugging is a huge issue, too.  I use the Borland C++ IDE which makes stepping through the code easy, but you do have the issue of the other threads.  If they can run while I'm debugging a thread, I let them.  Otherwise, I build in a mechanism to sequence the threads.  When the thread that I'm debugging completes a pass, it increments a counter and returns control.  The next task picks up the count change, runs a pass, and repeats the process.
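One way the sequencing mechanism described above might look in C++11, as a sketch under stated assumptions: `TurnSequencer` and `run_sequenced` are hypothetical names, and a condition variable stands in for whatever "picks up the count change" in the original design.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical turn counter: thread `id` may run a pass only when
// turn % thread_count == id.
class TurnSequencer {
public:
    explicit TurnSequencer(int thread_count) : thread_count_(thread_count) {}

    // Block until it is this thread's turn.
    void wait_for_turn(int id) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return turn_ % thread_count_ == id; });
    }

    // Increment the counter and hand control to the next thread.
    void pass_done() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++turn_;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const int thread_count_;
    long turn_ = 0;
};

// Runs `threads` workers for `passes` passes each and returns the order
// in which the passes actually ran.
std::vector<int> run_sequenced(int threads, int passes) {
    TurnSequencer seq(threads);
    std::vector<int> order;  // only touched by the thread holding the turn
    std::vector<std::thread> pool;
    for (int id = 0; id < threads; ++id) {
        pool.emplace_back([&, id] {
            for (int p = 0; p < passes; ++p) {
                seq.wait_for_turn(id);
                order.push_back(id);  // stands in for one pass of real work
                seq.pass_done();
            }
        });
    }
    for (auto& t : pool) t.join();
    return order;
}
```

With three threads and two passes each, the recorded order is 0, 1, 2, 0, 1, 2 on every run, which makes stepping through one thread in a debugger repeatable at the cost of all parallelism.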


Most issues are due to poor design, not poor implementation.  Sit and write the specifications and controls before you start coding and you're a long way ahead.  Writing the code without designing it first (or without a LOT of experience at this) is a sure way to shoot yourself in the foot.


Kent

 
PaulCaswell (Author) Commented:
Kent,

Thanks for that. Lots of good stuff.

May I summarise?

1. If it CAN be local, make it local.

This is good practice in any language and environment and having this habit ingrained in a multithreaded environment is a must. Totally agree.

2. Use the highest level language you can.

I don't object to this, in its place. However, in my opinion, one of the powers of multithreading is the ability to tune for speed and efficiency through the interconnectedness of the threads. If you take the language too high-level you may lose some of these benefits. Obviously, if the process itself is innately parallel then you are right, but when threading is being used primarily for a speed advantage then this can be an architectural mistake.

The tool I am writing is a parser at its center. I will be using multiple threads to run multiple state machines over the data list, some creating new lists as they go for others to follow. I am using STL for the data structures and currently raw C for the threads. I may move the threads into objects in time but it is, as always, the access to the shared objects that causes the most problems.

3. Make it a deliberate act to access shared data.

This is exactly the kind of stuff I was looking for. An architectural guideline.

4. Most issues are due to poor design, not poor implementation.

Agreed! Can you think of some examples of bad design? Or good design?

Great feedback! :-)

Only 10,000 more points !!! Not long now !!! :-)

Paul
 
Kent Olsen (Data Warehouse Architect / DBA) Commented:

>> Can you think of some examples of bad design?


The 1971 Pinto comes to mind.
 
PaulCaswell (Author) Commented:
*Chuckle*
 
Nopius Commented:
PaulCaswell, I know you are a guru in C, so this looks like a questionnaire for others :)
But I have a few words to add as well.
My recommendations are based on Unix kernel design and on real kernel code.
There are good links where you may read more about multithreaded programming.
I prefer Sun Microsystems docs: http://docs.sun.com/app/docs/doc/806-6867?q=Multithreaded

There is also Maurice J. Bach's book 'The Design of the UNIX Operating System', which discusses concurrency issues and suggests workarounds (for example, how to avoid deadlocks).

For list structures there is a common recommendation for avoiding data corruption (of course it doesn't prevent race conditions):
- use one lock for the list head (the entire list)
- use one lock for each element
This technique is used in the kernel's disk/network cache, where the cache is a bundle of hashed lists.
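A rough user-space sketch of the two-level scheme in C++17, not taken from any kernel: `Entry` and `LockedList` are hypothetical names, and holding elements by `std::shared_ptr` is an assumption made here so that a removed element stays valid while another thread still has it.

```cpp
#include <algorithm>
#include <list>
#include <memory>
#include <mutex>
#include <optional>
#include <string>
#include <utility>

struct Entry {
    Entry(int k, std::string v) : key(k), value(std::move(v)) {}
    const int key;      // never changes, so it can be read without a lock
    std::mutex lock;    // guards `value` only
    std::string value;
};

class LockedList {
public:
    void insert(int key, std::string value) {
        std::lock_guard<std::mutex> guard(list_lock_);
        entries_.push_back(std::make_shared<Entry>(key, std::move(value)));
    }

    bool remove(int key) {
        std::lock_guard<std::mutex> guard(list_lock_);
        auto it = locate(key);
        if (it == entries_.end()) return false;
        entries_.erase(it);
        return true;
    }

    // The list lock is held only while searching; the element lock is
    // held only while the element's contents are changed.
    bool update(int key, const std::string& value) {
        std::shared_ptr<Entry> entry = find(key);
        if (!entry) return false;
        std::lock_guard<std::mutex> guard(entry->lock);
        entry->value = value;
        return true;
    }

    std::optional<std::string> read(int key) {
        std::shared_ptr<Entry> entry = find(key);
        if (!entry) return std::nullopt;
        std::lock_guard<std::mutex> guard(entry->lock);
        return entry->value;
    }

private:
    using Entries = std::list<std::shared_ptr<Entry>>;

    // Caller must hold list_lock_.
    Entries::iterator locate(int key) {
        return std::find_if(entries_.begin(), entries_.end(),
            [key](const std::shared_ptr<Entry>& e) { return e->key == key; });
    }

    std::shared_ptr<Entry> find(int key) {
        std::lock_guard<std::mutex> guard(list_lock_);
        auto it = locate(key);
        if (it == entries_.end()) return nullptr;
        return *it;
    }

    std::mutex list_lock_;  // guards the links, i.e. the list structure
    Entries entries_;
};
```

As the comment above warns, this protects against corruption only: an `update` can still land on an element that another thread removed a moment earlier.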

If your thread requires more than one lock:
- acquire the locks in the same order in all threads (in all control flows) to avoid deadlocks
- release the locks in the opposite order to acquisition
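The two rules above can be sketched in C++11 as follows. `Account`, `transfer` and `run_opposing_transfers` are hypothetical, and ordering the locks by object address is just one way of picking a global order.

```cpp
#include <functional>
#include <mutex>
#include <thread>

struct Account {
    std::mutex lock;
    long balance = 0;
};

// Every caller locks the account at the lower address first, so two
// threads moving money in opposite directions cannot deadlock.
void transfer(Account& from, Account& to, long amount) {
    if (&from == &to) return;
    const bool from_first = std::less<Account*>{}(&from, &to);
    Account& first = from_first ? from : to;
    Account& second = from_first ? to : from;

    std::lock_guard<std::mutex> outer(first.lock);
    std::lock_guard<std::mutex> inner(second.lock);
    from.balance -= amount;
    to.balance += amount;
    // Destructors run in reverse order of construction, so `inner` is
    // released before `outer`: the opposite order to acquisition.
}

// Two threads transfer in opposite directions; returns true if both
// balances end where they started.
bool run_opposing_transfers(int rounds) {
    Account a;
    Account b;
    a.balance = 100;
    b.balance = 100;
    std::thread t1([&] { for (int i = 0; i < rounds; ++i) transfer(a, b, 1); });
    std::thread t2([&] { for (int i = 0; i < rounds; ++i) transfer(b, a, 1); });
    t1.join();
    t2.join();
    return a.balance == 100 && b.balance == 100;
}
```

In C++17, `std::scoped_lock` can take several mutexes at once and applies its own deadlock-avoidance algorithm, which is an alternative to choosing the order by hand.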

 
Kent Olsen (Data Warehouse Architect / DBA) Commented:

Just a thought, but Oracle defines two levels of locking.  User data (rows, pages, tables, etc) use a lock at the appropriate level.

Processes interlock internal structures by a process called latching.  A latch is a short-term lock (held for a few microseconds or milliseconds).

For the purposes of this discussion, we could do the same.  Intertask communication is achieved with a latch, user data is controlled with a lock.


Kent
 
efn Commented:
I advise you to treat multithreading as heavy, scary, dangerous voodoo magic.  (Your question indicates you may already see it that way.)  As you are aware, threading bugs can be very difficult to find.

Therefore, minimize the distribution of complexity among threads.  It's better to have one thread that has most of the complex logic and let the other threads do simple things than to have several threads doing complex things concurrently.

Similarly, make the interfaces between threads as minimal and simple as you can.

Serialize processing with event or message queues.  If you have an object where multiple threads may be calling different functions at unpredictable times, don't try to have those functions synchronize their operations on the object if you can avoid it.  Instead, if you can, have those functions queue jobs to be processed by the object one at a time serially.
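A minimal sketch of that queueing approach in C++11, assuming a hypothetical `SerialQueue` class: callers on any thread post jobs, and a single worker thread runs them one at a time, so the jobs themselves need no locking against each other.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

class SerialQueue {
public:
    SerialQueue() : worker_([this] { run(); }) {}

    // Drains any jobs still queued, then stops the worker.
    ~SerialQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();
    }

    // Safe to call from any thread.
    void post(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        cv_.notify_one();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // runs outside the lock, one job at a time
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

// Several producer threads post increments; only the worker touches
// `counter`, so it needs no lock of its own.
int run_counter_demo(int producers, int jobs_each) {
    int counter = 0;
    {
        SerialQueue queue;
        std::vector<std::thread> pool;
        for (int p = 0; p < producers; ++p) {
            pool.emplace_back([&] {
                for (int j = 0; j < jobs_each; ++j) {
                    queue.post([&counter] { ++counter; });
                }
            });
        }
        for (auto& t : pool) t.join();
    }  // the queue's destructor finishes the remaining jobs here
    return counter;
}
```

The only synchronised code is the queue itself; the object being operated on is touched from one thread only.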

As Kent suggested, encapsulate synchronization.  Instead of having an unwritten design rule, "thou shalt lock this object before accessing it," design a function with locking built in, that is the only possible way to access it.

Reviews or inspections are good for catching bugs early in the development cycle.