Solved

STL Allocator Question

Posted on 2004-08-11
1,384 Views
Last Modified: 2013-12-14

I am reading some strings from a file into a multiset.  The file contains approximately 41,500 strings.  I can read them into a multiset using the std::copy function very efficiently (it takes about 200 ms).

The problem is that when I try to destroy the multiset, it takes about 6 seconds to deallocate all the little string objects that were created.

I'm wondering if I can fix this delay by creating a custom STL allocator that does essentially nothing until all of the items it controls have been deleted, at which point it blows away the entire heap it is using.  In other words, the idea is to give each collection its own allocator object that creates and owns a private heap for the contents of that collection.  Then, when the collection is destroyed, the heap destroys all the little objects it manages with a single delete/free/::FreeMemory call.

It's also worth noting that the string objects in the collection would need to allocate from the same allocator, so that their contained memory is controlled by the "outer" allocator of the collection and the actual string contents (not just the string objects) are all destroyed at the same time.

Ideally, each collection would have its own FastAllocator object so that its data would be distinct from any other collection of the same type.  I would like to declare the collection like this: multiset< string, less<string>, FastAllocator<string> > dictionary;

Of course, this assumes that the contained objects have no meaningful state to close gracefully (such as file handles).  While that is not a safe assumption in every programming situation, it's fairly common (at least for me) to have large numbers of small data objects that could be dumped en masse this way.

Any experiences or code samples would be appreciated.

Thanks,
Brad


Notes:
1. No, this is not a class project.  
2. Yes, I know I could do this another, faster way - e.g. use qsort with text pointers, rather than string objects.  
However, this has become an issue of curiosity now.
3. The code below was compiled & timed using MSVC 7 with unmanaged code.

The code follows; error checking, etc. has been removed for clarity:
========
typedef istream_iterator< string  > string_input;
typedef ostream_iterator< string  > string_output;
typedef multiset< string > MyColl;

      MyColl* pDictionary = new MyColl;
      MyColl& dictionary = *pDictionary;

      ifstream ifs( path.c_str() );  // path is a parameter to the function..

      cout << "\nreading file\n" << flush;
      copy( string_input( ifs ) , string_input() , inserter( dictionary, dictionary.begin() ) );  // timing this statement: it takes ~ 200 ms for my input file (release mode).

      cout << "\nfile size: " << (unsigned int) dictionary.size() << endl;

      delete pDictionary;            // this delete takes over 6 seconds in release mode!!  (Interestingly it only takes 2 seconds in "debug" mode, while the copy also takes 2 seconds.)
      
============
      
Question by:millsoft
13 Comments
 
LVL 5

Author Comment

by:millsoft
ID: 11779434
P.S. My real concern here is STL collections & allocator behavior, not sorting.  Sorting is just the example I'm using.  Therefore, no points for alternative sorting solutions that don't address the underlying STL collection/allocator issues.  (Thanks!)
 
LVL 55

Expert Comment

by:Jaime Olivares
ID: 11779564
I know you want a specific answer, and I don't like changing standard library functionality.
But anyway, here is my suggestion:
If you are reading a specific or predictable amount of data, a single buffer is the most efficient way to do this (if you will read everything and it will all fit in available RAM).
You can even build a little in-memory index pointing to every element of the "big bag".
Just a suggestion. Hope this helps.
 
LVL 5

Author Comment

by:millsoft
ID: 11779687
Hi Jaime,

>I don't like such thinks like change a standard library functionality.
Huh?  The STL is all about generic programming.  That's why it makes custom allocators an option.

>a single buffer is the most efficient way to do this
Yes, I know.  Please see my original message concerning this (note #2).  

Thanks,
Brad


 
LVL 22

Assisted Solution

by:grg99
grg99 earned 125 total points
ID: 11779797
Sounds like there's something fishy going on.

It could be that the higher level allocation and freeing strategies are fighting with the heap allocation and freeing code.  For example, if the deleting happens in reverse order of allocation, the heap freeing code might be forced to search the whole free list in order to coalesce blocks.  If it happened in the other order, it might happen much faster.

I'd write a few more benchmark tests to narrow down the problem.
A few simple tests allocating blocks and freeing them in various patterns may be informative.
 
LVL 86

Expert Comment

by:jkr
ID: 11779806
The idea itself is good - however, an allocator is not the easiest object to design. Not as far as providing the actual functionality is concerned, but the efficiency of the implementation. Basically, the idea is to allocate a big pool that can be freed easily, but keeping that a contiguous area can be expensive if the limits are reached. So, depending on the parameters you have to fill in, I'd suggest providing an allocator (maybe as a singleton) that preallocates N*sizeof(typical_object) and then successively returns chunks of that memory, and when the preallocated size is exhausted, allocates the same amount again. For a sample, see e.g. http://www.codeproject.com/vcpp/stl/blockallocator.asp ("http://www.codeproject.com/vcpp/stl/blockallocator.asp")
 
LVL 86

Accepted Solution

by:
jkr earned 125 total points
ID: 11779815
Arg, C&P strikes again - that should have read

 http://www.codeproject.com/vcpp/stl/blockallocator.asp ("A Custom Block Allocator for Speeding Up VC++ STL")
 
LVL 5

Author Comment

by:millsoft
ID: 11784934
Hi jkr,

Ok, things get weirder now.  I tried the block allocator you suggested, and it doesn't compile in VS2003.  Therefore, I rebuilt my project using VC6.

In that case, the block allocator does compile, but it doesn't speed things up much, since the entire process takes only about 200 ms INCLUDING DELETING THE CONTAINER.
Recall that on VS2003, it takes over 6 seconds just to delete the container.

Conclusion: One of several possible things is going on:

1. VC6's STL implementation doesn't actually clean up the strings that are contained in the collection (i.e. it has a memory leak).
or
2. VC7 is doing something decidedly different (and inefficient) with memory allocation and handling.  I am, to the best of my knowledge, not doing anything that would have memory allocation routed through the .NET runtime.  (VC6 has a small-block allocator, which is, IMHO, very fast.  Does VC7 lack one?)
or
3. ?


 
LVL 5

Author Comment

by:millsoft
ID: 11784952
Ok, I did a little more checking on the small block allocator in VS7.  It turns out it is OFF BY DEFAULT if the application is running on Win2k or later.  (I'm using XP).

Adding the call:

    _set_sbh_threshold(1000);

to my program increased the speed in VS7 to 130ms total.

 
LVL 5

Author Comment

by:millsoft
ID: 11784960
I don't know why MSFT thought they were doing us a favor by turning off the SBH in VS7, but there you have the explanation.

 
LVL 86

Expert Comment

by:jkr
ID: 11785130
>>1. VC6's STL implementation doesn't actually clean up the strings that are contained in the collection

No, it does (otherwise I'd be in *deep* trouble :o)

>>2. VC7 is doing something decidedly different (and in-efficient) with memory allocation and handling

That *might* be possible, though I cannot verify that now...
 
LVL 5

Author Comment

by:millsoft
ID: 11785802
Here is another pooled allocator that may work on VS2003 - haven't tested it however.

http://www.sjbrown.co.uk/pooled_alloc.html
 
LVL 5

Author Comment

by:millsoft
ID: 11786124
I asked MSFT about this in the VC newsgroup, here is the reply:
========================================================================================
The small-block allocator was not ported to Win64 (ia64 and amd64), and the low-fragmentation heap was instead introduced as an operating-system feature; it should have the same features as the SBH, with better flexibility.
BTW, you should be able to use the __MSVCRT_HEAP_SELECT environment variable to accomplish the same "switch".

That said, there was a thread about 2 weeks ago in this forum about alternate memory managers, where this very same topic was discussed in detail.

--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of any included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm


"brad" <bradl at x_remove_me_x dot costbook dot com> wrote in message
news:OtEu1nIgEHA.3700@TK2MSFTNGP12.phx.gbl...
> Dear Microsoft:
>
> In the RTL, there is a function: _set_sbh_threshold
> _set_sbh_threshold sets the current threshold value for the small-block
> heap. The default threshold size is 1016 bytes for Windows 95, 98, ME and
> Windows NT, and zero for Windows 2000 and later platforms. By default, the
> small-block heap is not used on Windows 2000 and later platforms, though
> _set_sbh_threshold can be called with a nonzero value to enable the
> small-block heap in those instances.
>
> When I took a program that ran fine on VC6, and ported it to VC7, the
> performance was horrible.  It turned out that a collection of small
> objects in an STL collection were being deleted.  On VC6 it took about
> 30 ms, and on VC7, it took over 6000 ms.
>
> By enabling the sbh handler with this function the VC7 version performed
> as well as, or better than, the VC6 code.
>
> My question is WHY is the small-block heap not used on Win2k and later??
> Is there some alternative coding scheme I should be using to get good
> performance?
>
> Thanks,
>
> Brad
 
LVL 86

Expert Comment

by:jkr
ID: 11786534
Now, that's interesting - will keep that in mind when it comes to 64 bits...

Question has a verified solution.
