mhervais

Multithreaded Memory Management

I have a multithreaded app that runs out of memory when my list of waiting requests is big and I fire too many threads simultaneously.

Though I am working on reducing the footprint of my threads, I would prefer a solution that stops me from creating a new thread when there is not enough room for it to run.

This means that I must know:

1) what resources my threads use in the worst case,

2) what free resources I really have to work with.

Does anyone have a clue about it?

Thanks
rwilson032697

You could use a thread pool to restrict yourself to a number of threads which you know is safe...

Cheers,

Raymond.
Mmmh, as is often said, it is wise to restrict the number of threads in a process to a specific limit. I have often heard 16 would be a good limit, but I think it could well be 50.

Your main problem, I think, is that you want too many requests to be executed simultaneously, which simply makes no sense, as too many threads quickly end up waiting for CPU time. To extend Ray's suggestion, I'd say you should use a kind of thread manager which launches a new thread for every request as long as the maximum number is not exceeded. After that point it just adds the request to the thread with the shortest task list. A task list here is a kind of FIFO queue which can hold descriptions/environments for requests.

Ciao, Mike

ASKER

Thanks Raymond and Mike, but this is not an option.

I already use a thread manager that limits my thread count to what I believe is reasonable.

The problem is that I am using threads to get HTML pages, and I have a dreadful connection, so they don't take much CPU time because the threads spend their time waiting.

On the contrary, I try to issue as many threads as possible in order to avoid CPU wait time.

regards, Marc
Sounds like you're not looking for a thread manager, but a "shared" resource manager...

Can you be more specific ?

Regards
KE, thanks for joining. What else would you like to know?

I made a component to issue multithreaded HTTP GETs.

When I submit a GET to this component, I do nothing more than add a request to an entry TList.

But this component has a permanent thread that wakes up every third of a second and looks at how many threads are active.

If the count is less than some limit, it moves some requests from a waiting list to a current list, and creates the corresponding number of threads with the FreeOnTerminate option.

All this works almost well except for two points:

1) I do not know how many threads I can create, and I would like to be able to manage my resources,

2) I am not convinced that the threads use their memory without interfering with the memory of their neighbours.

So my question is: what functions can I use in order to achieve good control over my memory (and pointers) resources, and is there a limit on the memory that Delphi is able or willing to allocate?

regards, Marc
Hi Marc,

1. As Lischke stated, 16 is the number recommended by Microsoft.
I would say that you can use more if most of them are in a wait state, but it's not good practice to have more than (around) 16 "active" threads. It's a question of what executes faster, not of system limitations.
If you have 16 active threads you are also using all the CPU power, so it's a waste of resources to switch contexts more than necessary.
I don't think you are interested in a main thread that gets less than 6% of the available CPU power...
Well, threads can have priorities, but there is substantial overhead in switching between "active" threads.

2. Well, have you looked at the different synchronization objects in the SyncObjs unit?
To wait for other threads' execution points you use events. To protect memory you use the locking objects like TCriticalSection etc.
In general, you need to protect "referencing" memory locations (like lists, strings, etc.). Simple types like integer and boolean don't need to be protected (unless they also reference something, or are used in such a context). There are, e.g., no concerns about reading or writing a global boolean/integer, as an aligned read or write is done by a single CPU instruction (which of course is never interrupted); an increment is a read-modify-write and is not covered by this.
TThreadList is also a good example of how to protect memory.
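
For a counter that several threads update, the Win32 interlocked functions are one option. A minimal sketch, assuming a hypothetical global ActiveThreads counter that each worker bumps on entry and exit (the names are illustrative, not taken from Marc's component):

```pascal
program InterlockedSketch;
{$APPTYPE CONSOLE}
uses
  Windows;

var
  // Hypothetical counter shared by all worker threads.
  ActiveThreads: Integer = 0;

procedure WorkerStarted;
begin
  // Atomic increment; a plain Inc(ActiveThreads) could lose updates
  // when two threads execute it at the same moment.
  InterlockedIncrement(ActiveThreads);
end;

procedure WorkerFinished;
begin
  InterlockedDecrement(ActiveThreads);
end;

begin
  // Single-threaded demonstration of the calls only.
  WorkerStarted;
  WorkerStarted;
  WorkerFinished;
  Writeln('Active threads: ', ActiveThreads);
end.
```

A thread would call WorkerStarted at the top of Execute and WorkerFinished in a finally block, so the manager thread can read the counter without taking a lock.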

To be specific about your solution, I think you can benefit from using events.
It actually takes some time to create and destroy a thread. You should think of reusing the threads in some way.
Say that instead of dynamically creating and destroying threads, you make a fixed number of threads. These threads activate whenever they get a chance to grab a URL from a "work list". The thread pops the URL from the work list and, when finished, pushes the result onto another list (as an idea).
So the thread will start by entering a loop: while not Terminated do WaitForSingleObject( worklisthasdata, 200 );
The work list manages the worklisthasdata event and resets it whenever the list count is zero. When you push a URL onto the work list, e.g. from the main thread, it sets the event to signal that there's data available.
List access should be serialized, and the thread has to check that it can actually pop a URL (this must be protected from simultaneous access). If several threads are waiting, they will all try to grab a URL from the list (and the list may contain only a single entry). So you will need to "serialize" access here, e.g. by wrapping the list count check in a critical section (only one thread can read/write it at any time).

I'll throw in an example here that demonstrates "locking of memory" (it doesn't protect the count); there may be some things in it that you can use...

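//------------------------------------
// Needs Windows, Contnrs and SyncObjs in the uses clause
// (plus Variants on Delphi 6 and later).
type
  pVarArrayRec = ^TVarArrayRec;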
  TVarArrayRec = record
    V: Variant;
  end;

  TVarArrayQueue = class( TQueue )
  private
    FLock: TRTLCriticalSection;
    _QueuePush : TEvent;
    _QueuePop : TEvent;
  public
    Constructor Create;
    Destructor  Destroy; override;
    procedure   Push( V: Variant );
    function    Pop: Variant;
    procedure   Clear;
    Property    QueuePush: TEvent read _QueuePush;
    Property    QueuePop: TEvent read _QueuePop;
  end;

{ TVarArrayQueue }

procedure TVarArrayQueue.Clear;
begin
  // The critical section is re-entrant, so the nested lock inside Pop is fine.
  EnterCriticalSection(FLock);
  try
    while Count>0 do Pop;
  finally
    LeaveCriticalSection(FLock);
  end;
end;

constructor TVarArrayQueue.Create;
begin
  Inherited;
  InitializeCriticalSection(FLock);
  _QueuePush := TEvent.Create( nil, false, false, 'QuePush' );
  _QueuePop := TEvent.Create( nil, false, false, 'QuePop' );
end;

destructor TVarArrayQueue.Destroy;
begin
  Clear;
  _QueuePush.Free;
  _QueuePop.Free;
  DeleteCriticalSection(FLock);
  Inherited;
end;

function TVarArrayQueue.Pop: Variant;
var
  x : pVarArrayRec;
begin
  EnterCriticalSection(FLock);
  try
    x := pVarArrayRec( Inherited Pop );
    QueuePop.SetEvent;
  finally
    LeaveCriticalSection(FLock);
  end;
  Result := x.V;
  Dispose( x );
end;

procedure TVarArrayQueue.Push(V: Variant);
var
  x : pVarArrayRec;
begin
  New( x );
  x.V := V;
  EnterCriticalSection(FLock);
  try
    Inherited Push( x );
    QueuePush.SetEvent;
  finally
    LeaveCriticalSection(FLock);
  end;
end;

As you can see, it's only necessary to protect the list itself from being corrupted.
This is also simple since the contained data is not randomly accessed. In the case of random access it's a (very) good idea to protect access to the data (like a TThreadList does).
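
The fixed-pool idea described earlier (workers waiting on an event, then popping a URL under a lock) could look roughly like the sketch below. It is only an illustration under stated assumptions: the work list is a plain TStringList, the names PushURL, TryPopURL and Processed are hypothetical, the URL is a placeholder, and a counter stands in for the real HTTP GET.

```pascal
program WorkerPoolSketch;
{$APPTYPE CONSOLE}
uses
  Windows, Classes, SyncObjs;

type
  TFetchThread = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  // Hypothetical shared state: the work list, its lock, and a manual-reset
  // event that stays signalled while the list is non-empty.
  WorkList: TStringList;
  WorkLock: TCriticalSection;
  WorkListHasData: TEvent;
  Processed: Integer = 0;   // stands in for real download results

procedure PushURL(const URL: string);
begin
  WorkLock.Enter;
  try
    WorkList.Add(URL);
    WorkListHasData.SetEvent;   // wake the waiting workers
  finally
    WorkLock.Leave;
  end;
end;

function TryPopURL(var URL: string): Boolean;
begin
  WorkLock.Enter;
  try
    // Check the count under the lock: another worker may have taken
    // the last entry between the event wait and this point.
    Result := WorkList.Count > 0;
    if Result then
    begin
      URL := WorkList[0];
      WorkList.Delete(0);
    end;
    if WorkList.Count = 0 then
      WorkListHasData.ResetEvent;
  finally
    WorkLock.Leave;
  end;
end;

procedure TFetchThread.Execute;
var
  URL: string;
begin
  while not Terminated do
    if (WorkListHasData.WaitFor(200) = wrSignaled) and TryPopURL(URL) then
      InterlockedIncrement(Processed);  // placeholder for the real HTTP GET
end;

var
  Threads: array[0..1] of TFetchThread;
  I: Integer;
begin
  WorkList := TStringList.Create;
  WorkLock := TCriticalSection.Create;
  WorkListHasData := TEvent.Create(nil, True, False, '');  // manual reset, unnamed
  for I := 0 to High(Threads) do
    Threads[I] := TFetchThread.Create(False);
  for I := 1 to 5 do
    PushURL('http://example.com/page');  // hypothetical URL
  while Processed < 5 do
    Sleep(50);
  for I := 0 to High(Threads) do
  begin
    Threads[I].Terminate;
    Threads[I].WaitFor;
    Threads[I].Free;
  end;
  WorkListHasData.Free;
  WorkLock.Free;
  WorkList.Free;
  Writeln('Processed: ', Processed);
end.
```

The 200 ms timeout on the wait lets each worker notice Terminated even when no work arrives, and the event is unnamed so that separate lists do not end up sharing one kernel object.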

Well, back to your case...
I don't know how you get the data by HTTP; can you specify this?

Regards
Thanks for your text, KE.

In fact I do not reuse the same threads; I create new ones instead.

The reason is that, given the performance of my internet provider, avoiding lost time in my main thread is not an issue. On the other hand, I am using third-party code that I do not completely trust, and I found that creating fresh threads gives a cleaner context, or at least allows me to trap the leaks if I need to.

Of course I protect my resources with critical sections, and I even use the Synchronize method to deal with the main thread, though I would not really need it.

But my question is really about keeping track of the resources that are allocated, just as BoundsChecker would do, for instance.

regards, Marc
OK,

1. Which resources are you interested in? Memory?

2. Is it on a per-thread basis (I hope not) or overall?

If it's memory, you will have to look at the Win API to get a clue about how much your process uses. The GetHeapStatus function only gives you the amount of memory allocated by the Delphi heap manager (which is mostly used for small allocations, for speed).
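
As an illustration of what GetHeapStatus does report, here is a minimal sketch. The helper DelphiHeapAllocated is a hypothetical name, and the exact numbers depend on the Delphi version and memory manager in use, so no particular output is implied:

```pascal
program HeapStatusSketch;
{$APPTYPE CONSOLE}

// Bytes currently handed out by the Delphi heap manager only;
// memory obtained directly from Windows is not counted here.
function DelphiHeapAllocated: Cardinal;
var
  HS: THeapStatus;
begin
  HS := GetHeapStatus;
  Result := HS.TotalAllocated;
end;

var
  Before, After: Cardinal;
  P: Pointer;
begin
  Before := DelphiHeapAllocated;
  GetMem(P, 64 * 1024);          // allocate through the Delphi heap manager
  After := DelphiHeapAllocated;
  FreeMem(P);
  Writeln('Delphi heap before: ', Before, ' bytes, after: ', After, ' bytes');
end.
```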

In general, try to keep the threads as clean as possible, which means no Synchronize calls to the main thread and no resource allocations inside the thread. If you need to read out progress etc. from the threads, query them (from a timer) instead of letting them synchronize with the main thread, especially if you have many threads.

Regards,
Kenneth
Thanks KE, with this I am beginning to get somewhere.

For question 1, I would be glad to have the pointers too, but I mostly use memory.

I don't understand your question 2.

For the threads, I think that if I needed it, the SizeOf function could help me.

You tell me that the heap manager is mostly used for small allocations. What is used for large ones?

thanks, Marc

To start with the last part, which is also the most confusing one...

Delphi manages its own "internal" heap, where it allocates small chunks of memory and under certain circumstances also larger ones. Delphi allocates "medium" chunks of memory from the Windows heap. The reason is that it would be very slow if you, e.g., allocated 4 bytes with GetMem (or changed a string variable, etc.) on the Windows heap. The Windows heap is just not suitable for this.
On the other hand, when you, e.g., create a TMemoryStream the Delphi heap is not used, since it's faster to bypass the "intermediate" Delphi heap. As an example: when you add a single byte to a memory stream that is, say, 1 MB in size, the whole block is reallocated on the Windows heap (and that is why we ALWAYS figure out the size in advance, right :-).
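
Sizing the stream in advance might look like the following sketch. FillPresized is a hypothetical helper, and the 1 MB figure simply mirrors the example above:

```pascal
program StreamPresizeSketch;
{$APPTYPE CONSOLE}
uses
  Classes;

// Writes Count zero bytes into a stream whose size was set up front,
// and returns the final stream size.
function FillPresized(Count: Integer): Integer;
var
  MS: TMemoryStream;
  B: Byte;
  I: Integer;
begin
  MS := TMemoryStream.Create;
  try
    MS.Size := Count;    // reserve the whole buffer in one step
    MS.Position := 0;    // start overwriting from the beginning
    B := 0;
    for I := 1 to Count do
      MS.Write(B, 1);    // stays inside the reserved block
    Result := MS.Size;
  finally
    MS.Free;
  end;
end;

begin
  Writeln('Final size: ', FillPresized(1024 * 1024));
end.
```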

So we are dealing with two heaps, which I believe is why you have problems determining the amount of memory that your application uses, and so how to manage the threads safely.

About Q2, I mean whether you would like to see allocated resources per thread or per process.

Well, I don't understand what you need pointers for. What are you going to do with them?

Regards
You are right KE, pointers are not what I need at the moment; that was only curiosity.

I do not allocate memory explicitly myself, so if I understand correctly, I cannot rely on the free space information that the memory manager gives me, because the memory manager is often able to get more space from Windows. So how can I know when to stop?
The reason why you can't rely on the Delphi heap manager (GetHeapStatus) figures is that much of your allocated memory does not pass through the Delphi heap; it's allocated by Windows API calls like GlobalAlloc...

I earlier asked a question on how to obtain the size of Windows-allocated memory. DrDelphi was close with this API call:
GetProcessWorkingSetSize

It reports the minimum and maximum working set sizes for your process (look at the max. readout). Try to see what numbers you get from it; you may be VERY surprised...

Regards
Very interesting function. Why should I be surprised?

Though I don't need it, because I have rather a lot of memory in my computer for what I do, it seems that I can even prevent page faults with this type of function.

I think that if you can give me the function for finding the global amount of free memory, you will deserve the points.

cheers, Marc
Surprise:
Well, all the large chunks of memory are allocated by the Win API. So if your application only uses GetHeapStatus you may well be surprised...

This is what you need, look it up:

The GlobalMemoryStatus function retrieves information about current available memory. The function returns information about both physical and virtual memory. This function supersedes the GetFreeSpace function.

VOID GlobalMemoryStatus(

    LPMEMORYSTATUS lpBuffer       // pointer to the memory status structure  
   );      
 

Parameters

lpBuffer

Points to a MEMORYSTATUS structure in which information about current memory availability is returned. Before calling this function, the calling process should set the dwLength member of this structure.

 

Return Values

This function does not return a value.

Remarks

An application can use the GlobalMemoryStatus function to determine how much memory it can allocate without severely impacting other applications.
The information returned is volatile, and there is no guarantee that two sequential calls to this function will return the same information.

See Also

MEMORYSTATUS
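
From Delphi, a call to GlobalMemoryStatus used as a gate before starting a thread could be sketched as follows. EnoughMemoryFor and WorstCasePerThread are hypothetical names, and the 2 MB figure is only a stand-in for a measured worst-case footprint. On machines with more than 4 GB of memory the Win32 documentation points to GlobalMemoryStatusEx instead.

```pascal
program MemoryGateSketch;
{$APPTYPE CONSOLE}
uses
  Windows;

const
  // Hypothetical worst-case footprint of one download thread; measure your own.
  WorstCasePerThread = 2 * 1024 * 1024;

// True when both free physical memory and free address space in this
// process are at least Bytes. The values are a snapshot and can change
// immediately after the call.
function EnoughMemoryFor(Bytes: Cardinal): Boolean;
var
  MS: TMemoryStatus;
begin
  MS.dwLength := SizeOf(MS);   // must be set before the call
  GlobalMemoryStatus(MS);
  Result := (MS.dwAvailPhys >= Bytes) and (MS.dwAvailVirtual >= Bytes);
end;

begin
  if EnoughMemoryFor(WorstCasePerThread) then
    Writeln('OK to start another thread')
  else
    Writeln('Keep the request in the waiting list');
end.
```

The manager thread could run such a check each time it wakes up, before moving a request from the waiting list to the current list.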
Sorry for the delay, KE. Your comments are very valuable, but I have been working days and nights.

Please post an answer so that I can give you your points.

regards,

Marc
ASKER CERTIFIED SOLUTION
KE
