JD_Steele2 asked:

Multi-Threading in Delphi - Best practice for my problem? Or add to my problem?

I'm going to try and put this into a story at the end, but here's the technical rundown of what I'm needing to do. Currently we have 3 different scanner platforms and 3 different 'Export' apps written in Delphi to post data into a DB for other workflow applications to process. My current task is to basically take the 3 'exporters' and combine them all into a single all-encompassing application.

The first thing each of these exporters currently does is check for active jobs in the ExportJob table (each exporter currently has its own database, which will be changed to ONE DB handling all 3). It then goes out to the ExportDirectory for each job, which contains the files output by the individual scanners for that job. It opens those files, grabs data from them and stores that data in the DB, which allows the 'Import' app for that particular job to pick the data up and import it into the workflow.

Here's my problem. I can get this to work with ONE DB, but one of the scanner platforms scans the majority of the work and will cause the other 2 to wait before processing theirs. Currently all 3 process at the same time (3 different applications running at once). I raised this problem with management and suggested we run 3 instances of the new rewrite at the same time (1 instance for each scanner type). I was told to make this work with ONE instance of the application and to use threading to handle the work.

I'm not sure what 'best practice' would be on this, so I'm tapping into the best minds in the business for suggestions/comments, etc.

Here's my 'real-life' story to hopefully make the processing steps make more sense:

There is a street which has many Wholesale Supply warehouses (Scanner Platforms). Each warehouse has very loyal customers (ExportJobs) that show up many times per day to pick up supplies (data files ready for exporting). These loyal customers NEVER do business with the other warehouses. Throughout the day each warehouse stages the supplies for each loyal customer (scanner output files). The customer comes by and knows right where to go to get what's been staged for him (ExportDirectory). He takes the supplies home and waits for all the other loyal customers in the world to finish picking up their supplies (each customer is very nice and fair and only starts working when everyone else is ready). At that time each customer starts creating the end product for his customers (workflow apps).

(Step 1 of process): All customers get their supplies before any customer begins making end products.
I need to make sure that Loyal Customer 'A' can walk into warehouse 'A' and handle business at the same time that Loyal Customer 'B' works with warehouse 'B' and handles business. This needs to work for any number of warehouses/customers doing business at the same time (C, D, E, etc.).

(Step 2 of process): All customers have their supplies and now it's time to start manufacturing products.
All the different loyal customers can begin assembling their supplies and shipping out the result to THEIR customers (storing data in main DB for workflow apps to pick up) at the same time and not have to wait until the 'Warehouse A' customers (of which there are TONS of) are done before the 'Warehouse B' customers (just a handful) can begin their work.

I hope I didn't confuse the heck out of everybody here. What I'm looking for is the best practice for accomplishing something like this, whether that would be multi-threading or some other method. If this is done through multi-threading, I also want to know how to handle the termination of the threads and make sure they ARE terminated. Or can I create a dynamic # of threads at the beginning and reuse them over and over (just killing them off when the app is stopped and closed)?
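
On the termination part of the question, a common cooperative pattern with TThread is to keep FreeOnTerminate at False, call Terminate to request a stop, WaitFor to confirm the thread has really ended, and only then Free it. The following is a minimal console sketch of that pattern, not a definitive implementation. TPollingThread, its Passes property and the Sleep calls are hypothetical placeholders, and the dotted unit names assume Delphi XE2 or later (older versions would use SysUtils and Classes).

```delphi
program StopWorker;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

type
  // Hypothetical long-lived worker that is reused for many passes.
  TPollingThread = class(TThread)
  private
    FPasses: Integer;
  protected
    procedure Execute; override;
  public
    property Passes: Integer read FPasses;
  end;

procedure TPollingThread.Execute;
begin
  // Check Terminated between units of work so a stop request is honored.
  while not Terminated do
  begin
    Inc(FPasses); // placeholder for one "get work and export" pass
    Sleep(20);
  end;
end;

var
  Worker: TPollingThread;
begin
  // Create(False) starts the thread right away; FreeOnTerminate stays False,
  // so this code owns the object and may safely call WaitFor on it.
  Worker := TPollingThread.Create(False);
  try
    Sleep(200);        // stand-in for the application's running time
    Worker.Terminate;  // only sets the Terminated flag
    Worker.WaitFor;    // blocks until Execute has returned
    Writeln('Passes completed: ', Worker.Passes);
  finally
    Worker.Free;
  end;
end.
```

Because the same thread loops until it is asked to stop, this also illustrates the "create once and reuse" option: nothing is created or destroyed per pass, and shutdown is confirmed by WaitFor rather than assumed.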

Thank you to ANYONE who looks at this and takes the time to assist and comment. I appreciate it.

Geert G (Belgium) replied. This solution is only available to members of Experts Exchange.

JD_Steele2 replied:

Ok. I'm going to go back to calling things what they are to see if I'm understanding what you are saying and to give a more clear indication on what I'm trying to do.

High Level summary of what my company does:
I have 3 scanner platforms (IBML, OPEX and Kodak) that we use to scan documents for our clients.
There are many ExportJobs (clients) for each of these scanners (IBML having the majority).
Each scanner platform creates either a record in another table (IBML) OR in the case of the OPEX and Kodak scanners, they create .DAT files which are located in specific directories.
We import data into another database (DBScannerExporter) based on the data contained in the tables or found in the .DAT files.
Client workflow applications go to the DBScannerExporter database and locate records that match their JobNumbers that are in a stage of 'Exported' and begin importing that data into their specific workflows.

This is how I understand what you were saying:

1. I create 3 threads to get the work that's ready for each scanner platform (IBML_GetWorkThread, OPEX_GetWorkThread, Kodak_GetWorkThread)

2. I have 'flags' set for each 'GetWorkThread' that indicate when it is done getting work. Once a flag is set to true, I spawn an 'Export' thread which then starts exporting the data. Should I wait to spawn the export threads until ALL 'GetWorkThreads' are completed, or would it be OK to create the ExportThreads and have them start even if the other two may still be in the process of getting work? Also, should I spawn an ExportThread for each 'ExportJob' that goes with the specific scanner platform I'm working with, or ONE ExportThread per scanner platform which exports the entire batch of ExportJobs?

For instance, the Kodak platform has a total of 4 ExportJobs (clients we scan documents for). I create a Kodak_GetWorkThread that goes to each of the 4 ExportJobs' 'ExportDirectory' to locate work, which gets stored in an Export queue table that contains the ExportJobNumber, FilePath to the .DAT file, ScannerType (KODAK in this case), DateTime and a Processed flag. Once this thread is completed it will set a flag on the main thread, 'bKodakExportReady', to true and the Kodak_GetWorkThread will 'FreeOnTerminate'. At that time do I create '4' KodakExportThreads (one for each ExportJob) and pass each one its JobNumber so all 4 get worked on at once, or would it be cleaner to use '1' KodakExportThread which does all 4 in one thread?

3. As each ExportThread completes its duties, it will terminate gracefully and set a flag (bExportDone) to true. Once all 3 flags equal 'true', I'll start over at step one and do the whole process over again.

Am I following you correctly?

Geert G replied:

The question you put in on 2 you will have to answer yourself: wait for all worker threads to finish, or wait for just that 1 worker thread.
To answer it, ask: does the export need data from more than one worker thread? If yes, wait for all of them to finish; otherwise wait only for that 1.
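
The "wait for all" choice can be sketched with plain TThread objects and WaitFor. This is a minimal console sketch under stated assumptions, not the thread's actual code: TGetWorkThread, ItemsFound and the Sleep call are hypothetical stand-ins for the real "get work" step, and the dotted unit names assume Delphi XE2 or later.

```delphi
program WaitForWorkers;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

type
  // Hypothetical worker: collects pending work for one scanner platform.
  TGetWorkThread = class(TThread)
  private
    FScannerType: string;
    FItemsFound: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create(const AScannerType: string);
    property ScannerType: string read FScannerType;
    property ItemsFound: Integer read FItemsFound;
  end;

constructor TGetWorkThread.Create(const AScannerType: string);
begin
  inherited Create(True);   // created suspended; the caller starts it
  FreeOnTerminate := False; // the caller waits on it and frees it
  FScannerType := AScannerType;
end;

procedure TGetWorkThread.Execute;
begin
  Sleep(100);                          // placeholder for directory/table scan
  FItemsFound := Length(FScannerType); // dummy result for the sketch
end;

const
  Scanners: array[0..2] of string = ('IBML', 'OPEX', 'KODAK');
var
  Workers: array[0..2] of TGetWorkThread;
  I: Integer;
begin
  for I := Low(Workers) to High(Workers) do
  begin
    Workers[I] := TGetWorkThread.Create(Scanners[I]);
    Workers[I].Start;
  end;

  // "Wait for all": nothing below this loop runs until every worker is done.
  for I := Low(Workers) to High(Workers) do
    Workers[I].WaitFor;

  // Exporting would begin here; results are safe to read after WaitFor.
  for I := Low(Workers) to High(Workers) do
  begin
    Writeln(Workers[I].ScannerType, ': ', Workers[I].ItemsFound, ' item(s)');
    Workers[I].Free;
  end;
end.
```

For the "wait for 1" choice, the export for a scanner type would instead be kicked off as soon as that one worker finishes, for example at the end of its own Execute, so a slow IBML scan would not hold back OPEX or Kodak.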

The other question, 1 ExportThread for each ExportJob or 1 ExportThread for the batch:
That depends on the load. The first option may create many threads, which can be handled by limiting the maximum number of threads that do exports simultaneously. The more threads you run at the same time, the slower each single job gets.
I tend to limit the number of executing threads to a maximum # depending on the task they need to do and the time they work at it.
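
One way to cap the number of simultaneous export threads is a small fixed pool that pulls job numbers from a shared, lock-protected list. The sketch below is illustrative only: TJobList, TExportThread, MaxExportThreads, the job numbers and the Sleep call are all hypothetical, and the dotted unit names assume Delphi XE2 or later.

```delphi
program BoundedExportPool;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.SyncObjs;

const
  MaxExportThreads = 2; // assumed cap; tune it by measuring total batch time

type
  // Hypothetical shared work list that hands out one job number at a time.
  TJobList = class
  private
    FLock: TCriticalSection;
    FJobs: TArray<Integer>;
    FNext: Integer;
  public
    constructor Create(const AJobs: array of Integer);
    destructor Destroy; override;
    function TryTake(out AJobNumber: Integer): Boolean;
  end;

  TExportThread = class(TThread)
  private
    FJobs: TJobList;
    FDone: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create(AJobs: TJobList);
    property Done: Integer read FDone;
  end;

constructor TJobList.Create(const AJobs: array of Integer);
var
  I: Integer;
begin
  inherited Create;
  FLock := TCriticalSection.Create;
  SetLength(FJobs, Length(AJobs));
  for I := 0 to High(AJobs) do
    FJobs[I] := AJobs[I];
end;

destructor TJobList.Destroy;
begin
  FLock.Free;
  inherited;
end;

function TJobList.TryTake(out AJobNumber: Integer): Boolean;
begin
  FLock.Acquire; // only one thread at a time may advance FNext
  try
    Result := FNext < Length(FJobs);
    if Result then
    begin
      AJobNumber := FJobs[FNext];
      Inc(FNext);
    end
    else
      AJobNumber := 0;
  finally
    FLock.Release;
  end;
end;

constructor TExportThread.Create(AJobs: TJobList);
begin
  inherited Create(True);
  FreeOnTerminate := False;
  FJobs := AJobs;
end;

procedure TExportThread.Execute;
var
  JobNumber: Integer;
begin
  // Each pooled thread is reused for job after job until the list is empty.
  while (not Terminated) and FJobs.TryTake(JobNumber) do
  begin
    Sleep(50); // placeholder for exporting the job identified by JobNumber
    Inc(FDone);
  end;
end;

var
  Jobs: TJobList;
  Pool: array[0..MaxExportThreads - 1] of TExportThread;
  I, Total: Integer;
begin
  Jobs := TJobList.Create([101, 102, 103, 104, 105]);
  try
    for I := Low(Pool) to High(Pool) do
    begin
      Pool[I] := TExportThread.Create(Jobs);
      Pool[I].Start;
    end;
    Total := 0;
    for I := Low(Pool) to High(Pool) do
    begin
      Pool[I].WaitFor;
      Inc(Total, Pool[I].Done);
      Pool[I].Free;
    end;
    Writeln('Jobs exported: ', Total);
  finally
    Jobs.Free;
  end;
end.
```

With this shape, the "one thread per job" versus "one thread per batch" decision becomes a single number: setting MaxExportThreads to 1 gives the batch behavior, and raising it adds parallelism up to the point where individual jobs start slowing each other down.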

Total time to finish the batch is the key factor here.

JD_Steele2 replied:

Ok. I believe I could start exporting the work brought back by a particular scanner worker thread before the other scanner worker threads have completed. Each one is unrelated to the others except for the fact that their data now goes into the same DB structure instead of 3 separate ones.

Currently the 3 apps that handle these are single-threaded, so I could play around with either creating an ExportThread for each Job or having a single thread process the work available for all Jobs based on the ScannerType value. (1 thread = get all available work for the scanner type) (Multiple threads = one thread per JobNumber for the current scanner type).

I think I've got my head around the design now. I appreciate the time you spent consulting with me on this and giving me a better understanding on the methods I should use. I'll award you the points.

Thanks again for the help.