Solved

Could someone create an array of threads (FIFO)?

Posted on 2010-11-16
Last Modified: 2013-11-23
Could someone create an array of threads like this one?
Ciuly's multithreading (FIFO) example:
http://www.ciuly.com/index.php/download_file/58/

I have asked so many questions, and I am satisfied with the answers, but I would like to start on another approach: the FIFO one that Ciuly used in multithreaded_file_search.

As epasquier said:
Just remember that multithreading can be WORSE than 1 thread only.

But I would like to try a new approach like FIFO. Even with only 1 CPU core, I want to try every approach that might gain a little.

epasquier?
I know you're listening, but please let me ask this question once again with a different approach (FIFO).
It would be nicer if you tried it, rather than telling me more words that make me laugh.

I have already tried this method, because as far as I can see it is the same as Ciuly's method:
http://wiki.lazarus.freepascal.org/Example_of_multi-threaded_application:_array_of_threads

But I can only do the basic version, not a FIFO one like Ciuly's:
http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_26617507.html

One last try.
Thanks
///
This is in my Main_Form unit, which calls the [Identifying] function from unit [uUtil]
///
uses uUtil, ...

procedure TForm1.BUTTON_iDENtifyClick(Sender: TObject);
var
  ret, score: Integer;
  ticks: Cardinal;
begin
  score := 0;
  ticks := GetTickCount();

  ret := Identifying(score);

  // Stop the clock before any dialog is shown, so the time the user
  // takes to dismiss a message box is not counted.
  ticks := GetTickCount() - ticks;

  if ret > 0 then
    ShowMessage(' BlobField Identified_with_ID is  ' + IntToStr(ret) + '  and the Score of the context is  ' + IntToStr(score))
  else if ret = 0 then
    ShowMessage('not_iDentified')
  else
    ShowMessage('Error');

  // GetTickCount counts milliseconds, not seconds.
  ShowMessage('Elapsed milliseconds: ' + IntToStr(ticks));
end;


///
This is the code of unit uDBClass, which is used by the [Main_Form] unit and unit [uUtil]
///
unit uDBClass;

interface

uses ....

type
  TDBClass = class
  private
    // the connection object
    mycon: TSQLite3Connection;
    // SQL Transaction
    transact: TSQLTransaction;
  public
    // a data set to maintain all records of database
    ds0: TSQLQuery;

    function openDB(): boolean;
    procedure closeDB();
  end;

implementation
...
...
// Open connection
function TDBClass.openDB(): boolean;
begin
  try
    ds0 := TSQLQuery.Create(nil);

    mycon := TSQLite3Connection.Create(nil);
    transact := TSQLTransaction.Create(nil);

    mycon.Directory := '';
    mycon.DatabaseName := 'dbsqlite.db';

    transact.DataBase := mycon;
    transact.Action := caCommit;
    transact.Active := True;

    mycon.Transaction := transact;
    mycon.Open;

    ds0.DataBase := mycon;
    ds0.DisableControls;

    ds0.SQL.Text := 'select id, template from table1';
    ds0.CreateDataSet;
    ds0.Prepare;
    ds0.Open;
    ds0.ActiveBuffer;

    openDB := true;
  except
    // Any failure while connecting or opening the query is reported as false
    openDB := false;
  end;
end;

// Close connection
procedure TDBClass.closeDB();
begin
  // Free the query created in openDB before closing the connection
  ds0.Free;
  ds0 := nil;
  mycon.Close;
  mycon.Free;
  mycon := nil;
  transact.Free;
  transact := nil;
end;

end.


///
This code is unit [uUtil], which calls the database unit [uDBClass]
///
unit uUtil;

interface

uses udBclass,...

var
  // Database class.
  dB: TDBClass;

implementation

function first_called_from_main_form_during_load_for_opening_db
begin
...
// Opening database
dB := TDBClass.Create();
if not dB.openDB() then
begin
...
  Exit;
end;
...
end;

// NEED to apply FIFO (first in, first out) like Ciuly did in his file search multithreading
function identified(var dsRec: TSQLQuery; out score: Integer): Integer;
var
  ret: Integer;
  Field: TField;
begin
  // Default result: 0 means no record matched (this also covers an empty dataset)
  identified := 0;
  Field := dsRec.FieldByName('template');
  dsRec.DisableControls;
  dsRec.First;
  while not dsRec.EOF do
  begin
    ret := called_from_dll_IDENTIFY(PChar(Field.AsString), score, 0);
    if ret = 1 then
    begin
      // Match: return the id of the matching record
      identified := dsRec.FieldByName('id').AsInteger;
      Exit;
    end
    else if ret < 0 then
    begin
      // Negative values are passed through; the caller shows them as 'Error'
      identified := ret;
      Exit;
    end;
    dsRec.Next;
  end;
end;

// This is called in unit Main_Form [uMain]
function Identifying(var score: Integer): Integer;
begin
  Identifying := identified(db.ds0, score);
end;

end.

Question by:systan
17 Comments
 
Expert Comment by: epasquier (LVL 25)
You can use TQueue, which is like a TList, but FIFO:
// TQueue is declared in the Contnrs unit
var
  ThreadQueue: TQueue;

procedure InitQueue;
begin
  ThreadQueue := TQueue.Create;
end;

procedure ClearQueue;
begin
  // Free the threads that were queued but never started
  while ThreadQueue.Count > 0 do
    TMyThread(ThreadQueue.Pop).Free;
  FreeAndNil(ThreadQueue);
end;

procedure NewThread;
begin
  ThreadQueue.Push(TMyThread.Create(True)); // Create suspended
end;

procedure RunNextThread;
begin
  if ThreadQueue.Count = 0 then Exit;
  with TMyThread(ThreadQueue.Pop) do
  begin
    FreeOnTerminate := True; // Important: you no longer hold a reference to this thread object in the list
    Resume;
  end;
end;
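
To see the FIFO ordering of TQueue in isolation, here is a minimal sketch, assuming Free Pascal in objfpc mode (the question uses Lazarus components). The integers stand in for the thread objects; the program name and variables are made up for the illustration.

```pascal
program QueueFifoDemo;
{$mode objfpc}{$H+}
uses
  Contnrs;  // declares TQueue

var
  Q: TQueue;
  A, B, C: Integer;
begin
  A := 1;
  B := 2;
  C := 3;
  Q := TQueue.Create;
  try
    // Push stores untyped pointers; the queue does not own what they point to.
    Q.Push(@A);
    Q.Push(@B);
    Q.Push(@C);
    // Pop hands the items back oldest first.
    while Q.Count > 0 do
      WriteLn(PInteger(Q.Pop)^);
  finally
    Q.Free;
  end;
end.
```

This prints 1, 2 and 3 in that order. Note that TQueue itself does no locking, so a queue shared between threads still needs a critical section around Push and Pop.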


Author Comment by: systan (LVL 14)
Epasquier?

Ciuly's code does not use TQueue, but it behaves like one.

procedure TScanningThread.execute;
var f:TFileStream;
    s:string;
    count:integer;
begin
  while not terminated do
  try
    fn:='';
    FCS.Acquire;// extract first file if any (FIFO)
    try
      if FList.Count>0 then
      begin
        fn:=FList[0];
        FList.Delete(0);
      end;
    finally
      FCS.Release;
    end;
    if fn='' then// no file? go to sleep and get next file when woken up
    begin
      suspend;
      continue;
    end;
    f:=TFileStream.Create(fn, fmopenread);
    try
      Synchronize(DoStartScan);
      while (not terminated) and (f.Position<f.Size) do
      begin
        setlength(s, 64000);
        count:=f.Read(pchar(s)^, length(s));
        setlength(s, count);
        if pos('virus', s)>0 then
          Synchronize(DoVirusFound);
      end;
      if Terminated then
        Synchronize(DoAbortScan)
      else
        Synchronize(DoFinishScan);
    finally
      freeandnil(f);
    end;
  except
    on e:exception do
      logtofile('Error while scannig file "'+fn+'": '+e.message);
  end;
end;


Since you're the only one here who knows how to use TQueue, could you mix that code into my code above? Or, better still, do it the way Ciuly's code does. I hope you have downloaded and looked at Ciuly's code.


Thanks

Expert Comment by: ewangoya (LVL 32)
Using a queue works fine if you have a list of items that you need to store and then have your threads pick them up for processing. In a file search like the example above, it's not worth creating objects or pointers for the queue, since you only need to store strings.

And again, for searching purposes, if you already have the items in a list or some container like your dataset, a queue does not benefit you much, because you would have to transfer the data into the queue and then remove it: extra steps that will just slow down your search.

Simple threading will increase your search speed, since you have already separated your datasets into groups. You will not see any improvement on a single core, but on multiple cores the speed increase should definitely be there.

Just to show how you can use a queue in the above code, assume fn is a PChar:

procedure TScanningThread.execute;
var f:TFileStream;
    s:string;
    count:integer;
begin
  while not terminated do
  try
    fn:='';
    FCS.Acquire;// extract first file if any (FIFO)
    try
      if FQueue.Count>0 then
        fn:=FQueue.Pop;
    finally
      FCS.Release;
    end;
    if fn='' then// no file? go to sleep and get next file when woken up
    begin
      suspend;
      continue;
    end;
    f:=TFileStream.Create(fn, fmopenread);
    try
      Synchronize(DoStartScan);
      while (not terminated) and (f.Position<f.Size) do
      begin
        setlength(s, 64000);
        count:=f.Read(pchar(s)^, length(s));
        setlength(s, count);
        if pos('virus', s)>0 then
          Synchronize(DoVirusFound);
      end;
      if Terminated then
        Synchronize(DoAbortScan)
      else
        Synchronize(DoFinishScan);
    finally
      freeandnil(f);
    end;
  except
    on e:exception do
      logtofile('Error while scannig file "'+fn+'": '+e.message);
  end;
end;

BTW, it's not really a good idea to raise exceptions in a thread.
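
For a FIFO that only holds strings, the lock-plus-list pattern from Ciuly's Execute (FCS around FList[0] and Delete(0)) can be wrapped in a small class. The following is only a sketch, assuming Free Pascal; TStringFifo and its methods are hypothetical names, not part of Ciuly's unit or of any library.

```pascal
program StringFifoDemo;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SyncObjs;

type
  // Hypothetical helper: a string FIFO guarded by a critical section.
  TStringFifo = class
  private
    FItems: TStringList;
    FLock: TCriticalSection;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Push(const S: string);
    function TryPop(out S: string): Boolean;
  end;

constructor TStringFifo.Create;
begin
  inherited Create;
  FItems := TStringList.Create;
  FLock := TCriticalSection.Create;
end;

destructor TStringFifo.Destroy;
begin
  FLock.Free;
  FItems.Free;
  inherited Destroy;
end;

procedure TStringFifo.Push(const S: string);
begin
  FLock.Acquire;
  try
    FItems.Add(S);          // newest item goes to the end
  finally
    FLock.Release;
  end;
end;

function TStringFifo.TryPop(out S: string): Boolean;
begin
  S := '';
  FLock.Acquire;
  try
    Result := FItems.Count > 0;
    if Result then
    begin
      S := FItems[0];       // oldest item sits at index 0
      FItems.Delete(0);
    end;
  finally
    FLock.Release;
  end;
end;

var
  Fifo: TStringFifo;
  FileName: string;
begin
  Fifo := TStringFifo.Create;
  try
    Fifo.Push('a.txt');
    Fifo.Push('b.txt');
    while Fifo.TryPop(FileName) do
      WriteLn(FileName);
  finally
    Fifo.Free;
  end;
end.
```

The demo runs on one thread and prints a.txt and then b.txt. Returning a Boolean from TryPop avoids using the empty string as a "nothing to do" marker, which matters if an empty string can be a legitimate item.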

Author Comment by: systan (LVL 14)
Epasquier, I'm not following your code; I am basing mine on Ciuly's code, which flows like this:

start read rec1
start read rec2
start read rec3
start read rec4

end read rec1
end read rec2
end read rec3
end read rec4

start read rec5
start read rec6
start read rec7
start read rec8

end read rec5
end read rec6
start read rec9 //started because there was a delay reading rec7
end read rec8
end read rec7
end read rec9

..
...
That's the flow of Ciuly's code. How would you do that in your code?

Accepted Solution by: epasquier (LVL 25), earned 500 total points
You mean you need to have N threads running constantly?

Well, you are getting into something that is really beyond you at the moment.
To simplify things, you would have to switch to a well-known and well-tested threading library, like OmniThread or another good one found on the web. That choice alone is not trivial; I myself have not looked into it enough, and I still write my small thread optimizations by hand with the TThread base class and Synchronize. That is enough for one thread doing one task in the background, but not for running the same task on several threads, which needs a much more complex synchronization mechanism.

Then, I would recommend changing the architecture of your threads. The framework I already gave you (one synchronization of N threads, each working on a different dataset until one finds something or all fail) could be adapted to use a queue of items to check. But I am starting to know you: once on that road, you won't want to stop until everything is optimized, so I think now is a good time to think about architecture a bit before thinking about code.

Here is what I would propose.

Threads in the new architecture:
(S) Synchronizer thread: it initializes the search, creates the threads, and waits for a Found event from the workers or an EOF event from the Reader. It returns the found element or an error status in its own termination event, which will usually be handled by the main thread.

(R) Reader: it reads the DB templates one after another, building a list of preloaded template strings, but without loading more than a few in advance (say, 2 times the number of worker threads), to avoid loading the whole DB before the worker threads have freed a consistent part of it.
It must give other threads an interface to:
- get an EOF status
- pick the next template to process
- be stopped from an outside thread (if some template matches)

(W) Workers: a fixed number N of these threads are created (W0, W1, etc.). They constantly pick the next template to process from the Reader queue, try to identify it, and either signal a Found event to the Synchronizer thread or terminate with an error status.

That should give you about the best search performance possible on multi-core computers. (The number of threads N could be set dynamically to the number of cores visible through the Windows API. There is a function for that, but finding exactly which one is a very remote problem.)

Of course, and maybe you don't realize it, doing that will need a few inter-thread synchronization/messaging mechanisms, and those are best not written yourself; they can be found in the units I talked about.
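
A much simplified sketch of the worker side of this architecture follows, assuming Free Pascal. It is not the full design: the templates are already loaded into a TStringList (so there is no separate Reader thread), the main program plays the Synchronizer by waiting on the workers, and MatchesTemplate is a hypothetical stand-in for the DLL call. WorkerCount, PickNext and ReportFound are names invented for the illustration.

```pascal
program WorkerPoolSketch;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils, SyncObjs;

const
  WorkerCount = 4;            // assumption; could follow the core count

type
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  // Shared state, all guarded by QueueLock. It plays the Reader's role,
  // with the templates already loaded in memory.
  QueueLock: TCriticalSection;
  Templates: TStringList;
  NextIndex: Integer = 0;
  FoundIndex: Integer = -1;

// Hypothetical stand-in for the DLL call that checks one template.
function MatchesTemplate(const Tpl: string): Boolean;
begin
  Result := Tpl = 'needle';
end;

// Hands out the next template in FIFO order. Returns False at the end
// of the list (EOF) or once some worker has reported a match.
function PickNext(out Index: Integer; out Tpl: string): Boolean;
begin
  Index := -1;
  Tpl := '';
  QueueLock.Acquire;
  try
    Result := (FoundIndex < 0) and (NextIndex < Templates.Count);
    if Result then
    begin
      Index := NextIndex;
      Tpl := Templates[NextIndex];
      Inc(NextIndex);
    end;
  finally
    QueueLock.Release;
  end;
end;

procedure ReportFound(Index: Integer);
begin
  QueueLock.Acquire;
  try
    // Keep the lowest index if two workers match at the same time.
    if (FoundIndex < 0) or (Index < FoundIndex) then
      FoundIndex := Index;
  finally
    QueueLock.Release;
  end;
end;

procedure TWorker.Execute;
var
  Index: Integer;
  Tpl: string;
begin
  while (not Terminated) and PickNext(Index, Tpl) do
    if MatchesTemplate(Tpl) then
      ReportFound(Index);
end;

var
  Workers: array[0..WorkerCount - 1] of TWorker;
  I: Integer;
begin
  QueueLock := TCriticalSection.Create;
  Templates := TStringList.Create;
  try
    for I := 0 to 999 do
      Templates.Add('template ' + IntToStr(I));
    Templates[700] := 'needle';

    for I := 0 to High(Workers) do
      Workers[I] := TWorker.Create(False);   // start immediately
    // The main program plays the Synchronizer: wait until every worker stops.
    for I := 0 to High(Workers) do
    begin
      Workers[I].WaitFor;
      Workers[I].Free;
    end;
    WriteLn('Found at index ', FoundIndex);
  finally
    Templates.Free;
    QueueLock.Free;
  end;
end.
```

With the test data above the program prints "Found at index 700". Each worker takes the next unprocessed template as soon as it is free, which gives the "start read rec9 while rec7 is still running" flow described earlier in the thread. The lock is held only while picking an index, never during the match itself, so the expensive part runs in parallel.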

Author Comment by: systan (LVL 14)
Epasquier?
I'm trying to analyze your thoughts.

Meanwhile, here is a post by rllibby:
"
That's the beauty of the mem mapped file... You get the speed of the memory stream, direct byte access, and don't have to worry about the memory aspect as the system handles paging the file in/out.
"
http://www.experts-exchange.com/Programming/Languages/Pascal/Delphi/Q_21614852.html#15230843

Do you think that post can help me get the time down to a few seconds?


Thanks

Expert Comment by: epasquier (LVL 25)
No, not helpful at all in your case. You don't get data from a file, but from a dataset, remember?

Besides, I'm still not sure what proportion of the time is used to read the data and what proportion to analyse it (with a DLL, if I remember correctly). That must be known if you want to work on the speed of the whole process.

Maybe you should test another DB type with some direct access instead of MS Access via ADO.
Have a look at this:
http://www.componentace.com/bde_replacement_database_delphi_absolute_database.htm
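
One way to find out how the time splits between reading and analysing is to time the two phases separately. The sketch below assumes Free Pascal; a TStringList stands in for the dataset and AnalyseTemplate is a hypothetical stand-in for the DLL call, so the numbers it prints say nothing about the real application. Only the measuring pattern is the point.

```pascal
program TimingSketch;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

// Hypothetical stand-in for the DLL analysis of one template.
function AnalyseTemplate(const Tpl: string): Boolean;
begin
  Result := Pos('needle', Tpl) > 0;
end;

var
  Source, Loaded: TStringList;
  I, Hits: Integer;
  T0, ReadMs, AnalyseMs: QWord;
begin
  Source := TStringList.Create;   // stands in for the dataset
  Loaded := TStringList.Create;
  try
    for I := 1 to 200000 do
      Source.Add('template ' + IntToStr(I));

    // Phase 1: reading only.
    T0 := GetTickCount64;
    for I := 0 to Source.Count - 1 do
      Loaded.Add(Source[I]);
    ReadMs := GetTickCount64 - T0;

    // Phase 2: analysis only, on data that is already in memory.
    Hits := 0;
    T0 := GetTickCount64;
    for I := 0 to Loaded.Count - 1 do
      if AnalyseTemplate(Loaded[I]) then
        Inc(Hits);
    AnalyseMs := GetTickCount64 - T0;

    WriteLn('read: ', ReadMs, ' ms, analyse: ', AnalyseMs, ' ms, hits: ', Hits);
  finally
    Loaded.Free;
    Source.Free;
  end;
end.
```

Timing whole phases rather than individual records avoids the coarse resolution of a millisecond tick counter. If the analysis phase dominates, more worker threads can help on a multi-core machine; if reading dominates, the storage layer is the place to look.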

Author Comment by: systan (LVL 14)
Epasquier,
I've managed to use TMemoryStream. Remember that I told you about dancing, so I danced with it. I thought it would take 1 to 2 seconds, but instead it takes 11 to 12 seconds. It is still faster than TFileStream (only a 1-second difference) and TSQLQuery (the dataset, a 5-second difference).

This is the issue:
I've got 620 MB of free memory.
* Using TMemoryStream: when the form loads, I read the dataset, write it to the memory stream, and close the dataset. The remaining free memory is 470 MB, and it does not change during the scan.
* Using the dataset: when the form loads, free memory is about 610 MB, but when it scans through to the last record or finds no match, free memory drops to 380 MB and stays there every time I scan, and it takes 14 to 16 seconds to reach the last record or find nothing.

So TMemoryStream is much faster and consumes less memory.
Probably because of the blob field.

Oh, the tests are based on 2 threads; 1 or 2 threads take the same number of seconds in both cases.

So, can rllibby's post help, by mapping the memory?
I will try to analyze the code to see whether it is applicable to my app.


Thanks epasquier, I hope you understand.

Author Comment by: systan (LVL 14)
I'll also try Absolute Database; please read my last post.

Author Comment by: systan (LVL 14)
And I'm using SQLite; I even tested it with MySQL. They are the same, because I am using the TSQLQuery of dbExpress.

Expert Comment by: epasquier (LVL 25)
As I told you once, if you want absolute speed and can afford to load everything in memory, then your best option is to load all the templates into a TStringList and scan from there. Combined with a neat multi-threaded system such as the one I described recently, that will give the best performance.

Author Comment by: systan (LVL 14)
Ok;
I have already told you that I tested TStringList.
I tested again; here's the result:
With 628 MB of free memory, after loading the dataset into the TStringList and closing the dataset, the remaining free memory is 312 MB, and it stays there through the last record, whether a match is found or not.
It takes 14 to 16+ seconds to reach the last record.
The free-memory picture is just like using an array of string, but the array of string is faster than TStringList.
TStringList is only powerful because of its index advantage, and in this case I don't use indexes, because the templates are scanned sequentially from 0 to the last.
Therefore TMemoryStream is the best solution.

* Using TMemoryStream: when the form loads, I read the dataset, write it to the memory stream, and close the dataset. The remaining free memory is 470 MB, and it does not change during the scan.
It takes 11 to 12 seconds.

So, epasquier, since you have more experience than me in programming, especially Delphi,
can you answer my question:
can rllibby's post help, by mapping the memory, and speed things up a little?

Thanks

Author Comment by: systan (LVL 14)
I hope you have read my last post above. ^
Ok;
I tested your code again: using TStringList with InsertObject takes 14 to 16 seconds.
So I tried adding plain strings instead, tstringlist.add(fields[1].asstring);
that takes 12 to 13 seconds, and free memory still stays at 312 MB out of 628 MB.

Using TMemoryStream is the best option here.

Author Closing Comment by: systan (LVL 14)
epasquier? see you on delphi zone

Thanks

Expert Comment by: epasquier (LVL 25)
I tested your code again: using TStringList with InsertObject takes 14 to 16 seconds.
So I tried adding plain strings instead, tstringlist.add(fields[1].asstring);
that takes 12 to 13 seconds, and free memory still stays at 312 MB out of 628 MB.
Using TMemoryStream is the best option here.

I'm still quite sure you have something wrong in your algorithms, either the one with TMemoryStream or the one with TStringList. Besides, when I say TStringList is faster, I mean AFTER loading it. You load once, which takes 15 s, then you scan as many times as you want with a multi-threaded search on that TStringList, which should be lightning fast even without an index (especially without an index, I should say), because you only pick the elements in sequential order.
Going by your estimated times for the various scenarios, I would bet on a multi-threaded search with 4 threads on a 4-core machine being well below 1 s, probably <0.2 s on my Core i7, and about 2 s on your one-core Celeron.

Author Comment by: systan (LVL 14)
epasquier, email me, and I will reply with the full source code.
Let's see about your 2 s on a one-core Celeron.
Ok?


Thanks

Expert Comment by: epasquier (LVL 25)
Systan, I don't think I can do that; it's against ExEx rules for us to write full applications directly. Or you can hire me, and that's a completely different story.
What is possible is to do it step by step, using ExEx for each step, so that other experts can help and other users can learn from it.

The first step, I think, is exporting your DB into a string list, and loading/saving that string list in a binary file (.TXT files with LoadFromFile are to be avoided, because your strings could contain #0, #10 and #13, which might cause problems).
Just ask a question about that in the Delphi Programming zone.
Then we will have what we need to go a step further: a multi-threaded search on the string list, with the ability to exchange test data files.
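
As an illustration of that first step, here is a minimal sketch of saving and loading a string list in a length-prefixed binary form, assuming Free Pascal. The layout (a 32-bit count, then a 32-bit length and the raw bytes for each string) and the names SaveStrings and LoadStrings are choices made for this example, not an existing format or API.

```pascal
program BinaryListSketch;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;

// Layout (an assumption of this sketch): item count, then for each
// string a 32-bit length followed by its raw bytes.
procedure SaveStrings(List: TStrings; Stream: TStream);
var
  I, Len: LongInt;
  S: AnsiString;
begin
  Len := List.Count;
  Stream.WriteBuffer(Len, SizeOf(Len));
  for I := 0 to List.Count - 1 do
  begin
    S := List[I];
    Len := Length(S);
    Stream.WriteBuffer(Len, SizeOf(Len));
    if Len > 0 then
      Stream.WriteBuffer(S[1], Len);
  end;
end;

procedure LoadStrings(List: TStrings; Stream: TStream);
var
  I, Count, Len: LongInt;
  S: AnsiString;
begin
  List.Clear;
  Stream.ReadBuffer(Count, SizeOf(Count));
  for I := 1 to Count do
  begin
    Stream.ReadBuffer(Len, SizeOf(Len));
    SetLength(S, Len);
    if Len > 0 then
      Stream.ReadBuffer(S[1], Len);
    List.Add(S);
  end;
end;

var
  Original, Restored: TStringList;
  M: TMemoryStream;
  I: Integer;
begin
  Original := TStringList.Create;
  Restored := TStringList.Create;
  M := TMemoryStream.Create;
  try
    Original.Add('abc'#0'def');       // embedded #0
    Original.Add('one'#13#10'two');   // embedded line break
    Original.Add('');
    SaveStrings(Original, M);
    M.Position := 0;
    LoadStrings(Restored, M);
    // Verify that every string survived the round trip unchanged.
    if Restored.Count <> Original.Count then Halt(1);
    for I := 0 to Original.Count - 1 do
      if Original[I] <> Restored[I] then Halt(1);
    WriteLn('round trip ok: ', Restored.Count, ' strings');
  finally
    M.Free;
    Restored.Free;
    Original.Free;
  end;
end.
```

The demo prints "round trip ok: 3 strings". Because both procedures take a TStream, the same code should work with a TFileStream in place of the TMemoryStream when the data has to go to disk. Each length prefix tells the loader exactly how many bytes to read, so #0, #10 and #13 inside a template are treated as ordinary data.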