Link to home
Start Free TrialLog in
Avatar of TwentyFourSeven
TwentyFourSeven

asked on

Preallocation for vector of vectors

Let's say I've got a vector of vectors....

vector<vector<int> > myVector;

If I know the size of the outer and inner vectors, what is the most efficient way to pre-allocate ?

I'm also assuming the pre-allocation should be done outside any loops ?
ASKER CERTIFIED SOLUTION
Avatar of evilrix
evilrix
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
^^^ pre-allocates a matrix of 100,100.
You could do :
std::vector<std::vector<int> > myVector(x, std::vector<int>(y));

Open in new window

oh - too slow :)
Avatar of TwentyFourSeven
TwentyFourSeven

ASKER

You guys are quick..... I'd hardly clicked the submit button on the question ! ;-)

Will try it out, but no doubt both solutions willl work perfectly, as always !
Is there a recommended limit to pre-allocation ?

As a test, a dynamically allocated array of  10 millon by 7 works fine ..... but myVector(10000000,vector<int>(7)); ends up with a segfault ?

A random guess on my part would be that pre-allocation uses/needs contiguous memory whilst dynamic doesn't ?

>> Is there a recommended limit to pre-allocation ?
http://www.cplusplus.com/reference/stl/vector/max_size/
>> A random guess on my part would be that pre-allocation uses/needs contiguous memory whilst dynamic doesn't ?
Vector memory is contiguous, but that's on a per vector basis not for the whole matrix, which is a collection of vectors.
That and the current memory usage of your application (lots of memory already allocated, highly fragmented, etc.)

>> As a test, a dynamically allocated array of  10 millon by 7 works fine ..... but myVector(10000000,vector<int>(7)); ends up with a segfault ?

Note that the main difference here is that you (probably) tried allocating all memory in one block, while the vector doesn't necessarily do that.

Note that 7*10000000*sizeof(int) is already around 280MB on a 32bit system. That's a lot of memory.
>> while the vector doesn't necessarily do that.

Correction - I should have said "the vector of vectors".
So I guess this kind of begs the question, is the performance saving on pre-allocating such a large VoV worth the hassle of trying to imitate what dynamic allocation does by default (i..e not allocating all on one block ?)

>>  is the performance saving on pre-allocating such a large VoV
If you don't reserve memory in a vector if has to allocate as it grows and this means it has to keep re-copying the internal buffer tot he new one. This can be VERY slow. If you can pre-allocate it is normally worth doing so but you do have to be mindful of how much memory you are useful... obviously :)
Also... since a vector isn't sparse... if you need to insert at a high index you have to grow the vector to that size anyway.
Have increased the points since this is obviously a bit more complicated than I first thought.... ;-)

Alright, so assuming I've got access to enough memory and the feng shui in my code is balanced etc. etc., how do I go from .....

vector<vector<int> > myVector(100, vector<int>(100));

to this mysterious multi-block computer friendly way of doing things ?  I wouldn't know where to start !

I'll then leave you in peace (with a few extra points under your belt ! )   ;-)

>> to this mysterious multi-block computer friendly way of doing things ?  I wouldn't know where to start !
Well, 1st Q is why do you need such a massive matrix? If the matrix is going to be mostly sparse maybe a map of maps would be better?
Actually, before I answer that, maybe i'm just doing something stupid in my code.

std::cerr << "Vector Size: " << myVector.size() << " Vector Row Size: " << myVector[0].size() << endl;

With dynamic allocation of whatever value, I get the expected result :
Vector Size: X Vector Row Size: Y

However, once I introduce preallocation, I suddenly get double the values, 2X and 2Y !

Perhaps that's why things are crashing, the computer is allocating a massive amount of RAM !

My code is fairly straightforward though, so I don't know where I'm going wrong !
vector<vector<int> > myVector(10,vector<int>(8));
infile.open("somefile");
	if (!infile.is_open())
		return 0;
	while (!infile.eof()) {
		getline(infile, strLine, '\r');
		if (!strLine.length())
			continue;
		sscanf(looking for stuff.....);
myVector.push_back(vector<int> ());
		myVector[row].push_back(element);
row++
}

Open in new window

Did you mean this?

You already have a 10 x 8 matrix there is no need for any push_backs. The constructor is creating, it's not reserving.

Put another you are doing the semantic equiv of this...

v.resize();

not this...

v.reserve();


vector<vector<int> > myVector(10,vector<int>(8)); // creates the matrix of 10 x 8
infile.open("somefile");
        if (!infile.is_open())
                return 0;
        while (!infile.eof()) {
                getline(infile, strLine, '\r');
                if (!strLine.length())
                        continue;
                sscanf(looking for stuff.....);
                myVector[x][y] = element;
row++
}

Open in new window

Probably !  Let me go try that !

Also, silly question  is it normal for the vector.size function to display a size one larger than the actual size ?  i.e. I would count a vector [0] - [5] as size 6 but .size() displays it as 7 ?
Vectors are start from 0 so if you insert 0,1,2,3,4,5 the size is 6 :)
>>  I would count a vector [0] - [5] as size 6 but .size() displays it as 7 ?
Oh just read it again... um, no that's not right!
>> Oh just read it again... um, no that's not right!

Yeah, my mistake !

However your solution above doesn't seem to be working !

Before pre-allocaton, this worked fine.....

myVector[myRow].push_back(tempData[0]);

However, with pre-allocation, this does not ...

myVector[myRow][0] = tempData[0];

cout << tempData[0] << "/" << myVector[myRow][0] << endl;

always outputs myVector[myRow][0]  as zero.


On my way home. I'll look at it when I get there. Meanwhile can you post all you code please?
Thanks.  I'm looking into it a bit too. Can't believe making such a simple change would break so much. ;-)

See if this example helps.
#include <iostream>
#include <iomanip>
#include <vector>
 
int main()
{
	std::vector<std::vector<int> > voi(10, std::vector<int>(10));
	
	for(int x = 0 ; x < 10 ; ++x)
		for(int y = 0 ; y < 10 ; ++y)
			voi[x][y] = x*y;
 
	for(int x = 0 ; x < 10 ; ++x)
	{
		for(int y = 0 ; y < 10 ; ++y)
			std::cout << std::setw(2) << voi[x][y] << " ";
			
		std::cout << std::endl;
	}
}
 
Outputs...
 
 0  0  0  0  0  0  0  0  0  0 
 0  1  2  3  4  5  6  7  8  9 
 0  2  4  6  8 10 12 14 16 18 
 0  3  6  9 12 15 18 21 24 27 
 0  4  8 12 16 20 24 28 32 36 
 0  5 10 15 20 25 30 35 40 45 
 0  6 12 18 24 30 36 42 48 54 
 0  7 14 21 28 35 42 49 56 63 
 0  8 16 24 32 40 48 56 64 72 
 0  9 18 27 36 45 54 63 72 81 

Open in new window

Hi evilrix:,

Whilst you were on your way home, I've been working on creating a second source file with minimium spaghetti in it for you.....

Still working on it, who knows, perhaps tidying up the spaghetti might fix stuff in the mean time.

No worries :)

Take a look at my example too though... you might have a eureka moment :)