
# dividing matrix into small blocks

Hi, I want to divide a 512 x 512 matrix into small 8 x 8 blocks. How can I do it?
Thanks

Commented:
Are you kidding that you can't do it yourself? OK, OK, here's some code. Since you haven't specified any details, I haven't worried about which index order you need. If it doesn't fit, change the code, but you get the idea.

int i, j, k, l ; // just indexes
int old_matrix [512][512] ;
int new_matrix [64][64][8][8] ; // there will be 64x64 array of 8x8 matrices

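// one possibility: copy element by element
for (i=0; i<64; i++)
    for (j=0; j<64; j++)
        for (k=0; k<8; k++)
            for (l=0; l<8; l++)
                new_matrix [i][j][k][l] = old_matrix [(i*8)+k][(j*8)+l] ; // you can replace the (i*8) by (i<<3) and same with j. should be little faster

// second possibility: copy one 8-element block row at a time with memcpy (needs <string.h>).
// a single memcpy of the whole 512x512 array does NOT work: it keeps the row-major order,
// so each "block" would hold 64 consecutive elements of one row instead of an 8x8 tile.
for (i=0; i<64; i++)
    for (j=0; j<64; j++)
        for (k=0; k<8; k++)
            memcpy (new_matrix [i][j][k], &old_matrix [(i*8)+k][j*8], 8*sizeof(int)) ;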

S.
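
For anyone who wants to try the idea above, here is a minimal sketch of the block split as compilable C. It assumes 4-byte int and uses static arrays, because two 1 MiB arrays may not fit on the stack. The names N, B, NB, fill_matrix, split_into_blocks and blocks_match are hypothetical and only there for illustration.

```c
#include <assert.h>
#include <string.h>

#define N  512        /* matrix dimension, from the question */
#define B  8          /* block dimension, from the question */
#define NB (N / B)    /* blocks per side: 64 */

/* Static storage: with 4-byte int each array is 1 MiB, which may overflow the stack. */
static int old_matrix[N][N];
static int new_matrix[NB][NB][B][B];   /* NB x NB array of B x B blocks */

/* Hypothetical helper: give every element a unique value so copies can be checked. */
static void fill_matrix(void)
{
    for (int r = 0; r < N; r++)
        for (int c = 0; c < N; c++)
            old_matrix[r][c] = r * N + c;
}

/* Copy the matrix into blocks, one B-element row of a block per memcpy. */
static void split_into_blocks(void)
{
    assert(N % B == 0);   /* the blocks must tile the matrix exactly */
    for (int i = 0; i < NB; i++)
        for (int j = 0; j < NB; j++)
            for (int k = 0; k < B; k++)
                memcpy(new_matrix[i][j][k],
                       &old_matrix[i * B + k][j * B],
                       B * sizeof(int));
}

/* Return 1 if every block element equals its source element, 0 otherwise. */
static int blocks_match(void)
{
    for (int i = 0; i < NB; i++)
        for (int j = 0; j < NB; j++)
            for (int k = 0; k < B; k++)
                for (int l = 0; l < B; l++)
                    if (new_matrix[i][j][k][l] != old_matrix[i * B + k][j * B + l])
                        return 0;
    return 1;
}
```

Call fill_matrix(), then split_into_blocks(), from your own main; blocks_match() should then return 1. Element [k][l] of block [i][j] comes from row i*8+k and column j*8+l of the original matrix.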


Commented:
> you can replace the (i*8) by (i<<3) and same with j. should be little faster

Most compilers will generate the same code in both cases. I'd stick with the multiplication, because it is easier to read.
Commented:
Yes, rstaveley, you're right. I use VC++ and I thought that compiling in 'debug' mode without any optimizations would produce different code, but surprisingly it produced exactly the same thing.

Anyway, have you checked the difference between (i*9) and ((i<<3)+i)? The first compiles to 'imul ecx,ecx,9', which I would expect, but the second compiles into:

mov         edx,dword ptr [ebp-4]
mov         eax,dword ptr [ebp-4]
lea         ecx,[eax+edx*8]

which is completely different from what I expected (well, not completely, but almost). I did some experiments with different multipliers and checked the assembler code. It looked really interesting.

And what's the point? When I compiled it in 'release' mode with all optimizations, the resulting assembler code was exactly the same for both forms. So don't use shifts or other 'shortcuts', because the compiler will recognize them and compile the code its own way no matter how you try to fool it.

S.
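
If you want to convince yourself that the multiplication and the shift forms discussed above give the same results, a small check like the following sketch can help. It uses unsigned values so that the shifts are well defined for every input. The function names (times8_mul, times8_shift, times9_mul, times9_shift, check_forms) are hypothetical. It only checks that the values agree; what instructions the compiler emits depends on the compiler and its options.

```c
#include <assert.h>

/* The readable forms. */
static unsigned times8_mul(unsigned i) { return i * 8; }
static unsigned times9_mul(unsigned i) { return i * 9; }

/* The hand-written "shortcut" forms from the discussion. */
static unsigned times8_shift(unsigned i) { return i << 3; }
static unsigned times9_shift(unsigned i) { return (i << 3) + i; }

/* Assert that both forms agree for every i in [0, limit). */
static void check_forms(unsigned limit)
{
    for (unsigned i = 0; i < limit; i++) {
        assert(times8_mul(i) == times8_shift(i));
        assert(times9_mul(i) == times9_shift(i));
    }
}
```

Since the results are identical, the choice between the two forms can be made on readability alone.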
Commented:
If I optimise ((i<<3)+i) on VC7.1 with /Ox, I get...

lea      eax, DWORD PTR [eax+eax*8]

...which makes more sense, because you get the whole shooting match in one 80386 instruction. Isn't it funny, though, to see it using a multiplication instead of the well-intended shift? The main trouble with attempting to help the compiler is that it is tempting to think that we are still writing code for the 8086 instruction set :-)