magnussonms

asked on

Delphi multi-thread program apparently not using a multi-core processor

After decades of programming I am taking my first steps in parallel process programming using as always Delphi Professional, now XE3 on a Windows 8 professional Toshiba Satellite A665 laptop.
System.cpucount returns 4, as it should.
I have now programmed some minimal test programs and copied and tried some from the web running anywhere from 1 to 16 threads simultaneously. To my surprise the processing time is, however, just the same or somewhat greater than if I simply run the same processing sequentially. I wonder if the program is using all the cores or just one, whether some special switches have to be set, etc. (I have also read somewhere that Windows does not really allow parallel use of cores?)
ASKER CERTIFIED SOLUTION
Sinisa Vuk
This solution is only available to members.
Consider this:
Multi-threading will not necessarily increase your processing speed.
It all depends on what type of processing you are doing. If one thread has to wait for resources being used by another thread, you don't get any speed advantage.
Threading has a lot of pitfalls.
The most important items to pay attention to:
1: protecting resources from simultaneous access
2: synchronisation of threads, e.g. one thread can only start when another has finished

Consider your program as a bath tub and the threads as the people who want to use it.
Everybody has a task to do: take a bath.
Synchronisation is very important here:
>> you probably don't want everybody to take a bath together: the six kids + parents + grandparents
>> you need to set certain rules, like who can actually take a bath together

The data would be the items that go in the bath with the person.
Rules need to be set here too: having a load of ducks in with the kids is nice,
but granddad will probably like it more if they are removed before he goes in.
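
To make the two points listed earlier (protecting shared resources and synchronising threads) concrete, here is a minimal Delphi sketch. It is not from the original answer: it assumes a plain console program, and the names TCounterThread, SharedTotal and Lock are made up for illustration. A TCriticalSection protects a shared counter, and WaitFor is the simplest form of synchronisation: the main program continues only after each worker has finished.

```delphi
program SharedCounterDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.SyncObjs;

const
  ThreadCount = 4;               // assumption: one worker per core
  IncrementsPerThread = 100000;

var
  Lock: TCriticalSection;        // guards SharedTotal
  SharedTotal: Integer = 0;      // the shared resource

type
  // Hypothetical worker that increments the shared counter
  TCounterThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TCounterThread.Execute;
var
  I: Integer;
begin
  for I := 1 to IncrementsPerThread do
  begin
    Lock.Acquire;                // only one thread at a time gets past here
    try
      Inc(SharedTotal);
    finally
      Lock.Release;
    end;
  end;
end;

var
  Workers: array[0..ThreadCount - 1] of TCounterThread;
  I: Integer;
begin
  Lock := TCriticalSection.Create;
  try
    for I := Low(Workers) to High(Workers) do
      Workers[I] := TCounterThread.Create(False);  // False = start right away
    for I := Low(Workers) to High(Workers) do
    begin
      Workers[I].WaitFor;        // wait until this worker has finished
      Workers[I].Free;
    end;
    Writeln('Total: ', SharedTotal);
  finally
    Lock.Free;
  end;
end.
```

With the lock in place the program should print Total: 400000. Note that the workers spend most of their time queuing for the lock, which is exactly the situation described above where multi-threading brings no speed advantage.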
It would help if we knew more about what you are attempting.

  Ray tracing is a great example of multiple processor cores being of great value.  In ray tracing a great deal of math is done to render every pixel against an in-memory list of 3D objects.  Since each pixel can be rendered independently, it would help to have thousands of cores.

  Other types of work are not easily threaded.  If you were doing database updates in a set of database tables that use foreign keys, then each update would need to wait until the previous one had finished.  Without that wait, SQL updates that run fine in sequence would fail, because the rows they depend on might not exist yet.

  One of the biggest mistakes made by people new to multi-threading is trying to have all of the threads cause live screen updates.  Each update requires a Synchronize call.  This, in effect, makes all of the threads wait on each other.  If you need to show progress while threads are running, limit their screen updates to something like once a second.
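
As a rough illustration of the once-a-second advice, the sketch below throttles progress reports from a worker thread. It is an assumption-laden console stand-in, not code from this thread: TWorker and UpdateCount are hypothetical names, Sleep replaces real computation, and Writeln stands in for a screen update. It uses TThread.Queue, which hands the update to the main thread without making the worker wait, and relies on WaitFor processing queued calls when it is called from the main thread.

```delphi
program ThrottledProgressDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Diagnostics;

const
  StepCount = 300;

var
  UpdateCount: Integer = 0;   // only touched on the main thread, so no lock

type
  // Hypothetical worker that reports progress at most once per second
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TWorker.Execute;
var
  Timer: TStopwatch;
  I, Done: Integer;
begin
  Timer := TStopwatch.StartNew;
  for I := 1 to StepCount do
  begin
    Sleep(10);                               // stand-in for real computation
    if Timer.ElapsedMilliseconds >= 1000 then
    begin
      Done := I;                             // copy; do not capture the loop variable
      TThread.Queue(nil,
        procedure
        begin
          // Runs later on the main thread; the worker does not wait for it
          Inc(UpdateCount);
          Writeln('Progress: ', Done, '/', StepCount);
        end);
      Timer := TStopwatch.StartNew;          // restart the one-second window
    end;
  end;
end;

var
  Worker: TWorker;
begin
  Worker := TWorker.Create(False);
  try
    Worker.WaitFor;      // on the main thread this also runs queued updates
    CheckSynchronize;    // run any update queued just before the thread ended
    Writeln('Screen updates: ', UpdateCount);
  finally
    Worker.Free;
  end;
end.
```

In a VCL application the message loop takes the place of the WaitFor and CheckSynchronize calls, and the queued procedure would update a label or progress bar instead of calling Writeln.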
magnussonms

ASKER

The task I am dealing with seems well suited for multi-CPU processing. It is basically the processing of independent samples, where all the data and work arrays for each sample can be stored in a separate record, object etc. Each processor could thus work on a sample without any interaction with the others. (Actually, tasks within each sample also have this character, providing another parallel processing approach.) The job would be finished when all the samples had been analyzed (using computation-intensive pattern detection in each case, which should lend itself well to truly parallel processing). Moreover, there is no need for UI updating before the longest task has been completed. So, as far as I can see, this situation is particularly well suited, rather like the ray tracing mentioned above (even more so).
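
The setup described here (one self-contained record per sample, no interaction between workers, no UI updates) could be sketched roughly as follows. This is only an illustration under assumptions: TSample, TSampleThread and Analyze are hypothetical names, and the Sin/Cos loop is a stand-in for the real pattern detection. The program times the same work sequentially and with one thread per sample.

```delphi
program IndependentSamplesDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Diagnostics;

const
  SampleCount = 4;

type
  // Hypothetical sample: input and result live in one record,
  // so no worker ever touches another worker's data.
  TSample = record
    Seed: Integer;
    Score: Double;
  end;
  PSample = ^TSample;

  TSampleThread = class(TThread)
  private
    FSample: PSample;
  protected
    procedure Execute; override;
  public
    constructor Create(ASample: PSample);
  end;

// CPU-bound stand-in for the real analysis; it allocates no memory
procedure Analyze(var S: TSample);
var
  I: Integer;
  Acc: Double;
begin
  Acc := S.Seed;
  for I := 1 to 5000000 do
    Acc := Acc + Sin(I) * Cos(Acc);
  S.Score := Acc;
end;

constructor TSampleThread.Create(ASample: PSample);
begin
  FSample := ASample;
  inherited Create(False);       // False = start right away
end;

procedure TSampleThread.Execute;
begin
  Analyze(FSample^);
end;

var
  Samples: array[0..SampleCount - 1] of TSample;
  Workers: array[0..SampleCount - 1] of TSampleThread;
  Timer: TStopwatch;
  I: Integer;
begin
  for I := 0 to SampleCount - 1 do
    Samples[I].Seed := I + 1;

  // Sequential run
  Timer := TStopwatch.StartNew;
  for I := 0 to SampleCount - 1 do
    Analyze(Samples[I]);
  Writeln('Sequential: ', Timer.ElapsedMilliseconds, ' ms');

  // One thread per sample
  Timer := TStopwatch.StartNew;
  for I := 0 to SampleCount - 1 do
    Workers[I] := TSampleThread.Create(@Samples[I]);
  for I := 0 to SampleCount - 1 do
  begin
    Workers[I].WaitFor;
    Workers[I].Free;
  end;
  Writeln('Threaded:   ', Timer.ElapsedMilliseconds, ' ms');
end.
```

If the threaded time is not clearly shorter on a multi-core machine, it is worth checking whether the work per thread is long enough to outweigh thread start-up, and whether the real analysis code shares anything between threads (for example frequent heap allocations or global variables).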

While I have not done any parallel process programming until now, I have read about it for some time. As a matter of fact, my main research area during the last few decades is the analysis/modeling of real-time (social) interaction in humans, animals and brain cells, developing and using highly optimized, but still computation-hungry algorithms, so I do not underestimate the complexities of interactive parallelism, but the present case seems particularly simple.

It has come as a surprise to me that in Delphi (Professional XE3), assigning tasks to particular CPU cores appears quite difficult, at least partly due to the way Windows operates. Simply using parallel threads does not provide any speed advantage: as I have now found out using sinisav's program, sequential processing is easily faster while using much less CPU. What I really need is to assign each independent task to a separate core, as its only (or main) task. I think this is at the heart of my issue. I have read about the OTL library, which according to Geert_Gruwez above should allow this, but has a "steep learning curve". I will now continue studying it.
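
For reference, tying a thread to one core can be done from plain Delphi with the Windows API call SetThreadAffinityMask. The sketch below is a hedged illustration, not a recommendation from this thread: TPinnedThread and its fields are hypothetical names, the summing loop is a stand-in for real work, and it assumes a machine with no more logical cores than bits in the mask. Keep in mind that the Windows scheduler normally spreads runnable threads over the available cores on its own, so pinning is optional.

```delphi
program AffinityDemo;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils, System.Classes;

type
  // Hypothetical worker that asks Windows to run it on one given core
  TPinnedThread = class(TThread)
  private
    FCore: Integer;
    FPinned: Boolean;
    FSum: Int64;
  protected
    procedure Execute; override;
  public
    constructor Create(ACore: Integer);
    property Pinned: Boolean read FPinned;
  end;

constructor TPinnedThread.Create(ACore: Integer);
begin
  FCore := ACore;
  inherited Create(False);       // False = start right away
end;

procedure TPinnedThread.Execute;
var
  Mask: DWORD_PTR;
  I: Integer;
begin
  // Bit N of the mask stands for logical core N
  Mask := DWORD_PTR(1) shl FCore;
  // The call returns the previous mask, or 0 on failure
  FPinned := SetThreadAffinityMask(GetCurrentThread, Mask) <> 0;

  // Stand-in for the real, independent computation
  for I := 1 to 50000000 do
    FSum := FSum + I;
end;

var
  Workers: array of TPinnedThread;
  I: Integer;
begin
  SetLength(Workers, CPUCount);  // one worker per logical core
  for I := 0 to High(Workers) do
    Workers[I] := TPinnedThread.Create(I);
  for I := 0 to High(Workers) do
  begin
    Workers[I].WaitFor;
    Writeln('Thread ', I, ' pinned to core ', I, ': ', Workers[I].Pinned);
    Workers[I].Free;
  end;
end.
```

Watching the per-core graphs in Task Manager while this runs is a quick way to see whether all cores are actually busy.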

All the replies have been useful, but I am still looking for a realistic solution and I will write back when I have found it or postponed trying until better tools become available.
I will need more time to find out what I can realistically do, as my programming experience is almost exclusively in Fortran and Delphi. Many thanks for most valuable help.
Check the pipeline technique in OTL.

I got some multi-threading programs working with OTL, so I thought I'd come back on this.

You indicated you need to process samples.
A pipeline lends itself to working in stages: the first stage gets a sample, the next processes it (in different ways), then the processed data is captured and stored.

pipeline := Parallel.PipeLine.Throttle(10000)
  .Stage(GetSample)
  .Stage(ProcessSample).NumTasks(4)
  .Stage(GatherProcessedData)
  .Stage(StoreResults)
  .Run;


Each stage procedure looks something like this:
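procedure TDataProcessor.ProcessSample(const input, output: IOmniBlockingCollection;
  const task: IOmniTask);
var
  Value, ProcessedData: TOmniValue;
begin
  while not task.Terminated do
  begin
    // Wait up to 1000 ms for the next item from the previous stage
    if input.TryTake(Value, 1000) then
    begin
      // Process the Value (can be any object) and put the outcome in ProcessedData
      ProcessedData := Value; // placeholder: replace with the real computation

      // When finished, pass the calculated data to the next stage;
      // stop if the output collection no longer accepts data
      if not output.TryAdd(ProcessedData) then
        Break;
    end;
  end;
end;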