
I'm looking for a very efficient algorithm to find all the local maxima in a 2D matrix of float values.

It has to be the fastest possible. So I guess it should use the minimum number of comparisons.

Thanks.


I thought about 1D search but you don't gain anything with this.

What would be the difference between comparing the 4 closest neighbours and all eight connected cells?

I think you have to make sure the local maximum is bigger than all eight neighbours.

Finally, this doesn't solve the case where neighbouring cells have the same value. The algorithm should be able to handle this case too!

It's a tough one. Sorry I can't increase the points, it's all I have. I think it's worth more.

I already have an algorithm that is quite efficient, but I would like to know if there is a faster one!

1 1 1
1 2 1     Simple extremum.
1 1 1

2 1 2
3 2 3     The central 2 is a local maximum in one dimension, a minimum in
2 1 2     the other; the 3's are real maxima.

1 2 1
2 2 2     Is the central 2 a maximum? And what about the 2's on the edge?
1 2 1

1 1 1 1 1
1 2 2 2 1
1 2 2 2 1   Are all 2's a local maximum, or only the center one?
1 2 2 2 1
1 1 1 1 1

How do you handle these situations? A simple 1D search seems to fail on the second one; comparing adjacent cells fails on the last one.

.luc.

Thus a good algorithm would test all neighbours, but when one of them has a lower value, it can be skipped in the local max search. Then we go on with one of the neighbours with a higher value; we could pick as the next candidate the cell with the highest value. We keep exploring like that, eliminating all neighbours with lower values from the set of cells left to explore. When we reach a cell where all the neighbours have lower values, we have a local max. We then restart the process from the next candidate cell in the set. When the set becomes empty we have all the local maxima.

Another use of this rule would be to do a first fast scan of the matrix to remove the maximum number of cells. In this scan we only examine cells two cells apart from each other, thus the sequence (1,1) (1,4) (1,7) ... (4,1) (4,4) ...

We then flag all neighbours with a higher value as local max candidates for further study. This way we can eliminate the most cells.

Neither of the two above methods will handle flat (same value) local maxima covering more than one cell.

There are other rules we could use.

If a < b and b < c, then a < c. So we could avoid the need to test a < c. But this is apparently more complex to take into account.

Do these ideas give you a hint for a better one?

This looks like a "standard" problem. There should be good solutions around, I guess. That's why I posted this problem here to the experts.

Thanks for your help.

Yes, this is the flat local max problem I noted.

The first particular problem can be avoided by comparing with all 8 neighbours to determine if it's a local max.

For the other problems you underline: one solution I have seen uses a sliding window to test for local maxima, but it only tests the 4 neighbours, not all eight.

The most important thing is that such a local max does not escape our search. We can still filter out such particular cases in a second step. Only a local max with at least one neighbour of the same value is a potential candidate for such a case. This can be flagged when we find a cell with no neighbour of higher value but at least one neighbour of the same value.

So tuning for optimal speed may depend on the particular situation.

Does your matrix fit in memory? Does one row (or 3) fit in cache?

What's the relative speed of a floating compare on your machine compared to an array access?

My first instinct would still be to run a 3 point max filter in one direction, then a 3 point max filter of that in the other direction, then take anything that's equal to the resulting 9-max as a local maximum.

Or, if those maxima are expected to be sparse, perhaps only run the second filter (in the 'hard' direction for sequential access) on potential candidates screened from the first pass.

Have you measured your current algorithm to see where it's taking the most time?

http://www.netlib.org/

in parallel with the first pass, gaining better locality, and immediately skipping those cells which don't need a second pass.

I also wondered if, on machines where a floating-point compare is slow, you might get a speed up by converting to int on the first pass. But if you need to do much shuffling to preserve the collating sequence, you probably don't do enough comparisons for the gain to be worth it.

If comparisons are your metric, it looks like it can be done in 5n^2 worst case, with <3n^2 expected for random values (well, one of those compares could be a boolean test).

Had you gotten any better than that?

For example using the easiest edge detection algorithm several times to find out the highest point.

You could use a 3x3 window with the corresponding values:

 0  1  0
 1 -4  1
 0  1  0

You multiply the data by this window and you cover all the possible cases... at least I think so :)

By using this technique you don't calculate the first and the last row/column :P

Moreover, you could apply it on every second point and then shift it, so that after two passes you have checked all the points.

I'm not sure about the last one, but you could try...

CpH

For more complex/faster edge detection algorithms, you should check out the web...

Apparently it won't give any hint on the maxima state of the examined point.

Sorry but I can't accept this answer.

Since you have to find ALL the local max points, your problem is solved this way (I don't think there are faster ways):

1. Take your 2D array space and transform it into a graph representation; since each vertex is connected to its 8 neighbours, and I guess you have to deal with spaces bigger than 8x8, you may consider yours a 'sparse' graph, and the graph may be implemented with linked lists (for each vertex, build a list containing its 8 neighbours);

2. Apply a basic 'search' algorithm on the graph to visit all the vertices (breadth-first or depth-first search, as you prefer); while visiting, check for the local maximum condition.

I could write the code for you, but this sounds a bit a home assignment ;)

Feel free to ask more...

cheers, julio

be more asking how to do it at all, rather than optimization tweaking, which suggests to me interest in a specific implementation of a specific application.

Then again, perhaps meessen is bluffing in claiming to already have a fast algorithm.

(which I'd expect to be faster (though of the same order) than explicitly transforming it into a graph anyway)

It still looks to me like a cross product of 3 point max filters is the most promising approach, but I still haven't heard how that compares to the alleged existing algorithm (so maybe there isn't an actual application after all :-)

The basic code seems simple--shorter than this comment.

But to really optimize for minimum comparisons, all the little bookkeeping details to maintain partial information can get tedious, and it's unclear how much that really gains anyway if, as I suspect (though meessen has provided no details), the memory accesses are as much of a bottleneck as the comparisons.

What will you gain by building your graph?

How will you test for local maxima in your graph?

You will have to compare each point with all its 8 neighbours. But that is the same as what I do in the matrix. So converting the data into a graph is a pure loss of time and space. There is apparently no benefit in doing so, unless I missed a point in your proposal...

This is no assignment. As you note, it's an optimisation question. As I may have badly stated in the beginning, there is no ultimate perfect algorithm I know of. There are two different tracks that I explained here before. I have no implementation to compare execution times, allowing me to say whether your algorithm is worse or better than any other.

I thought this question already had a clear answer. Apparently not.

Here is more detail on the data stored in the matrix. The current matrix size is 256x256, but there are thousands of such matrices to examine. This is why speed is relevant. These matrices contain noise (many small maxima) and a few higher-value maxima we are interested in.

About your 3 point max filter proposal: I thought of this idea in the first place. The currently used algorithm is very similar to it, except that it filters noise with a one-dimensional convolution before doing a sort of 3 point max filter in the other dimension. This filtering removes many useless maxima.

This work is part of an R&D project and we don't really know how many significant maxima we will meet. There may be 10, 100 or 1000 spread over the 65536 cells.

In fact this 256x256 matrix is a summation and scaling of a set of matrices representing a total of 165000 cells.

Each of these smaller matrices covers part of the total space represented by this 256x256 matrix. The cell resolution is not always the same, so adding up these matrices requires some significant processing, which also brings its own information loss.

The idea behind the current method was that searching maxima in 65536 cells is faster than searching in 165000 cells. But building this 256x256 matrix already costs 165000 operations.

So I was looking for a faster algorithm allowing us to search in this 165000-cell space and avoiding the need to build this search matrix.

From current estimations, building the set of matrices represents 30% of the total processing time. Building the 256x256 matrix uses another 30%, and searching the maxima takes approximately 30% as well.

In summary, 90% of the processing time is spent in searching maxima. Optimising this part of the processing will yield the biggest gain in processing time.

Apparently your simple and fast approach might be the way to go unless we hear a better proposal. But apparently there is none... so keep in touch ;-)

I'm very sorry :(

I went too fast on the question and on the comments; still, I find the problem very interesting, and would like to keep participating, if you don't mind.

--

Sure, building a linked list representation is useless (stupid maybe ;), since there is already a matrix representation; still, thinking of it as a graph could be of help.

Actually, my approach was very near to the one in meessen's very first comment, and I cannot see where there could be problems with 'flat' maxima, nor with applying the (a<b, b<c => a<c) property, which is built into the search algorithm itself.

Further, a 'graph' approach shouldn't benefit from a pre-scanning of the matrix - unless some heuristics suggest it, which doesn't seem to be the case.

In general, it seems to me that this problem, at a logical level, cannot but be reduced to this kind of approach. Optimization, then, becomes a matter of implementation.

Rgds, julio

You apply the 'window' to each value in the initial matrix.

You add the four neighbours and subtract the actual value four times. At this point you can spot whether it is a maximum or not.

The difference will tell you if the value you are checking is:

- the same (within a certain tolerance) as its neighbours,
- smaller than its neighbours,
- greater than its neighbours,

just by checking the sign of this operation.

Now what you have to do is keep track of the greatest values found after this pass by creating another matrix.

Apply the same idea to this matrix, n times, until you have found the maxima.

Note that you may find more than one maximum, but this depends on your initial matrix.

Is it better like that?

CpH

It sounds like you really want significant maxima, where 'significant' is still undefined. And the actual maxima search of your initial question is only 1/3 of the time. (The fact that the 3 steps take approximately equal time might also suggest that memory access time is more significant than compute time.)

So could you explain how the 165000 cells are combined into the 256x256?

What noise filter do you use to remove useless maxima?

And how do you determine which maxima are interesting?

(For this much work, perhaps the idea of converting to int, if ints are significantly faster than floats, may pay off after all?)

The method you suggest to test for maxima will yield a wrong result in the following case:

  6
1 5 1
  1

This is why I said that your method compares the average of the four neighbours with the cell. As you can see, this test may fail in some cases.

Sorry, but I can't grade your answer.

Ozo, the more I think, the more it looks like you provided the best answer up to now.

First I will answer your questions.

OK, let's go into more detail about the application.

The data is the output of a particle detector at CERN.

I'm just looking to see whether it is possible to optimize the existing algorithm, and I believe it is. But this is what I want to verify.

To summarize: the output is a set of matrices representing different detector regions around the particle collision point. Their content is the measured energy. Energy values may vary a lot across the matrix, thus for now, as far as we know, we have to stick to floats.

The detector regions are organised in multiple layers and regions. Different matrices may correspond to different physical resolutions. This 256x256 matrix is currently called the combined matrix. It corresponds to a specific cell resolution.

Thus when building this combined matrix we add up all the matrix contents. For some matrices it may mean spreading their energy across multiple combined-matrix cells, or otherwise merging their energy.

Cells may be rectangular. Thus for some matrices it means splitting in one dimension and merging in the other.

As you can see, building this combined matrix is not a quick task.

About the noise filter: the actual noise filter is a simple average of the two neighbouring cells in one dimension when looking for maxima in the other. This is a simplified explanation; in fact it works with a window, summing in one dimension and looking for maxima in the other.

To select the interesting maxima, the current solution is to define a threshold. The sum of all energy in the window is compared to an empirically defined threshold.

They use a weird test to check for maxima. It would be complex to explain, but it sounds inefficient to me. Besides, considering only the 4 closest neighbours yields a wrong result if one of the four untested neighbours has a higher value.

Apparently it happens sometimes, in which case the program returns two maxima where in fact there is only one.

Now to go back to your proposal.

I tested the following: search maxima in one dimension and filter them with the threshold, then examine all the other neighbours closely.

Thus, as you suggested, it could be possible to search for maxima with a quick scan in one dimension as a heuristic, and then, when one is found, do a look around; at that point it is most probably already a good hit. In my case this looks like the fastest method. Using noise filtering and threshold selection as currently done would be OK.

So what about proposing this method as an answer ?

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define n 256

int matrix[n+1][n+1];
int max[3][n+1];
char map[n];

int max3(int i,int j,int k){
    int m=i;
    if( j>m ){ m=j; }
    if( k>m ){ m=k; }
    return m;
}

int main(){
    int i,j;
    int m,m0,m1,m2;
    srandom(time(0)+getpid());
    for( i=0;i<n;i+=1 ){
        for( j=0;j<n;j+=1 ){
            matrix[i][j] = random()%100;
            printf("%3d",matrix[i][j]);
        }
        printf("\n");
    }
    for( i=0;i<=n;i+=1 ){
        int i3 = i%3;
        m=m0=m1=m2=matrix[i][0];
        for( j=0;j<n;j+=1 ){
            m0=m1;m1=m2;m2=matrix[i][j];
            m = max3(m0,m1,m2);
            max[i3][j] = m;
            if( map[j] && matrix[i-1][j] == max3(max[0][j],max[1][j],max[2][j]) ){
                matrix[i-1][j] = -matrix[i-1][j];
            }
            map[j] = (m==m1);
        }
    }
    for( i=0;i<n;i+=1 ){
        for( j=0;j<n;j+=1 ){
            printf("%3d",matrix[i][j]);
        }
        printf("\n");
    }
    return 0;
}

Now, there are several optimizations we may consider.

Instead of the full max3, we could keep track of partial order information to minimize new comparisons. And we could try to filter out "uninteresting" maxima so we don't have to look at the other dimension as often.

So how much does it look like those would really help in practice?

Maybe you'll want to find the 5x5 local maxima instead of the 3x3?

Done carefully, I don't think it should slow the above algorithm very much.

x x x x x
x x 6 x x                    0  1  0     0 6 0
x 1 5 1 x   multiply it by   1 -4  1  =  1 5 1
x x 1 x x                    0  1  0     0 1 0
x x x x x

So, 1+6+1+1 - 5 = 4, therefore you are not on a maximum.

Now with

x x x x x
x x 3 x x
x 1 6 2 x   doing the same thing
x x 3 x x
x x x x x

then 1+3+2+3 - 24 = -15, therefore you spot a maximum.

Note that the x's represent anything. Also note that you skip the first/last row/column, moving the window to the next value and so on...

Note that all the values within the initial matrix will be tested.

The only problem that may occur is when the sum of the neighbours is equal to four times the tested value: you'll have 0.

However you can find out whether you have to keep them or not with the previous value computed and/or the next one.

CpH

x 6 x
1 5 1
x 1 x

According to your convolution window:

0  1  0
1 -4  1
0  1  0

this would be 6 + 1 + 1 + 1 - 4*5 = -11.

So we spot a maximum where there isn't one.

This is because you test against the average of all neighbours. You can rewrite the previous expression as

(6+1+1+1)/4 - 5 = -11/4

and (6+1+1+1)/4 is the average of all the neighbours' values. Even if this average is lower than the tested value, one of the neighbours may still have a bigger value than the tested cell.

Agree?

Ozo, I'll check your code. Thanks.


Ozo,

I thought of a more efficient way to test for maxima. We can find the maxima of a row of n entries in n operations. Your max3 method needs two comparisons per cell.

I thought of a finite state automaton searching for maxima in a row. The idea is the following:

Imagine the vector as a mountain profile where the value represents the altitude. Now you scan from left to right, comparing two neighbouring altitudes. If a drop follows a rise, then you have found a mountain top.

So your machine has two states: drop tendency or rise tendency. Each state corresponds to one loop. In the dropping loop, you keep comparing numbers, and as soon as you find a rising neighbour you jump to the rising loop. In the rising loop you keep comparing numbers, ignoring numbers with the same value. As soon as you find a dropping neighbour, you have found a maximum or the right edge of a flat maximum. Do whatever you want with it, then jump to the dropping loop and carry on.

This will do n comparisons and some jumps from one loop to the other. Not obvious to implement in C though. ;-)

About filtering: of course the simplest method would be to ignore maxima below a threshold of, for instance, 3 sigma of the noise distribution. Unless the noise is too strong and we need some smarter filtering like the one presented previously.

This method won't require a max5 because it will find flat maxima independently of their span.

This looks to me like the fastest method because it is very simple.

To avoid the need for alternate buffers, as used in your implementation, it would be more efficient to do a close examination of a 1D maximum as soon as it is found. If it passes all tests, print out its coordinates.

All I have now is simulated data; as soon as we get real data, I'll check the noise filtering threshold problem.

So ozo, you've won the points. You have to propose an answer so I can grant them to you.

Thanks.

'Cos otherwise you will spot the maxima in the previous pass...

Sorry,

CpH

But I didn't think it through to realize it had such a simple implementation for 3.

The max5 wasn't to find flat maxima, which as you say can be found anyway, but to avoid spurious "uninteresting" locals.

But since it seems you want something like "local maxima above the global mean", "uninteresting" would be a global property, and local filtering for it would be inappropriate.

But a "sigma above the noise" threshold is local, and seemingly appropriate.

Which brings us back to the "integer" theme.

"Energy value may vary a lot across the matrix", but there are only a finite number of distinct floating values, same as for ints. If we only care about order relationships, then as long as we preserve them it wouldn't matter whether they scaled as floats or ints.

But to do things like averaging, or noise thresholding, scaling does matter.

On the other hand, since differences below the noise threshold are meaningless, perhaps it is appropriate to round the values and throw away the low-order bits. If this reduces the size of your float, it may speed up memory access time. On the other hand, it could increase the number of flat maxima.

But if you're willing to only flag the edges of plateaus (and ignore valley plains), maybe you can strengthen your filter and reduce your work here too?

Another speed-up in the matrix building phase may be to only do merges and never do splits, doing your max search over a <=256 x <=256 matrix instead of always exactly 256 x 256?

About the alternate max[][] buffer: I had originally thought of that as [256][256] too, so we could do the state machine optimization in the other direction as well. But when you said the maxima were probably sparser than 1000/65536 (instead of the 1/9 that a random distribution might give), I thought it might be faster to just track the nearby ones, and go back to look only when we pass the first filter (3 rows very likely fit in cache, while 256 may not).

With a sigma filter added, the lazy evaluation short circuit applies even more strongly.

I'll propose another "for sake of argument" routine that only flags edge maxima and not flat maxima.

I planned to avoid it since it twists the information a bit. Conversion into integers could make it worse and cost time anyway.

Searching local maxima in one dimension takes about n operations. Conversion into integers would double this time. You also need to determine the scale. So I'm not sure this would yield a significant performance increase.

I'm not sure if this combined matrix is needed in later data analysis process, but I believe it could be avoided.

From what I've heard, once maxima are found, physicists go back to the original data for fine grain examination anyway.

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define n 256

float matrix[n+1][n+1];
float max[3][n+1];
char map[n];

float max3(float i,float j,float k){
    float m=i;
    if( j>m ){ m=j; } if( k>m ){ m=k; }
    return m;
}

signed char sgn(float a,float b){
    /* return (signed char)copysign((double)1.0, ...); */
    if( a<b ){ return -1;} if( a>b ){ return 1;} return 0;
}

int main(){
    int i,j;
    srandom(time(0)+getpid());
    for( i=0;i<n;i+=1 ){
        for( j=0;j<n;j+=1 ){
            matrix[i][j] = (float)(random()%5);
            /* printf("%3.0f",matrix[i][j]); */
        }
        /* printf("\n"); */
    }
    for( i=0;i<=n;i+=1 ){
        float m,m0,m1,m2;
        int i3 = i%3;
        signed char d,d0,d1,d2;
        m=m0=m1=m2=matrix[i][0];
        d0=d1=d2=0;
        for( j=0;j<n;j+=1 ){
            m0=m1;m1=m2;m2=matrix[i][j];
            d0=d1;d1=d2;d2=sgn(m1,m2);
            switch( d1*3+d2 ){
                case -4:{/* < < */; m=m2; d=0;};break;
                case -3:{/* < = */; m=m2; d=1;};break;
                case -2:{/* < > */; m=m1; d=1;};break;
                case -1:{/* = < */; m=m2; d=0;};break;
                case  0:{/* = = */; m=m1; d=0;};break;
                case  1:{/* = > */; m=m1; d=1;};break;
                case  2:{/* > < */; m=(m2>m0)?m2:m0; d=0;};break;
                case  3:{/* > = */; m=m0; d=0;};break;
                case  4:{/* > > */; m=m0; d=0;};break;
                default:{exit(0); };break;
            }
            max[i3][j] = m;
            if( map[j] && matrix[i-1][j] == max3(max[0][j],max[1][j],max[2][j]) ){
                matrix[i-1][j] = -matrix[i-1][j];
            }
            map[j] = d;
        }
    }
    for( i=0;i<n;i+=1 ){
        for( j=0;j<25;j+=1 ){
            printf("%3.0f",matrix[i][j]);
        }
        printf("\n");
    }
    return 0;
}

We don't really need max3(); a compare of matrix[i-1][j] with m, and if that's greater, with the other max[!=i3][j], should suffice (so we could use max[2][n]).

But we may also want to keep track of how often in practice we need to check that 2nd dimension.

Rounding to 1.5 sigma or so may turn more spurious peaks into plateaus, so that may make the extra work to ignore the centers of plateaus pay off. But it would help to know how often that happens.

BTW, is sigma known before the scan? Does it vary locally?

What we do is run the detector without collisions, so what we see is the noise. The matrix contents at different times are used to compute the noise sigma and average.

Of course the best would be to save the sigma for each cell, to be used as a cell-specific threshold. This is nice because if one detector cell is a bit broken, its sigma or threshold can be set to ignore its content.

We have to do a cell-specific calibration anyway, so storing a threshold table is not that bad. It would be around 600 KB: realistic.

Thanks

Having to pay 23 points for each return to this conversation may deter anyone but the original questioner and answerer from participating.

Anyway, it sounds like, while sigma may be known beforehand, it does vary locally. This may make it impractical to use rounding to ignore sub-sigma fluctuations (unless all sigmas are within a reasonable range).

Without rounding, it becomes pretty meaningless to try to distinguish exact equality from > or <. That may simplify the sgn test, and reduce the state table. (Distinguishing only < vs >= would create an asymmetry between the left and right sides of a plateau, but in practice that distinction would be meaningless anyway, and could be compensated by adding a gentle sub-sigma slope to the matrix.)

I'm not sure how much further we can go from here without actual measurements to see which operations take the most time in practice, and where various trade-offs could most profitably be made.

(BTW, I think I would have preferred an A at the original point value to a B at the higher value. Not that I'm suggesting an A would be appropriate for an incomplete answer, but we might both have been better off if you had decided instead that an incomplete answer should have had fewer points rather than a lower grade :-)

And I couldn't have done it. I guess that is to prevent a user from reducing the points as soon as he gets a good response, so that he doesn't pay much for it.

For now I'm pleased by your answer and I think the question can be closed.

Apparently I can't revert the grading and give you an A. Sorry.

It's the first time I use this system.

Thanks a lot for the time you spent for me.

Apparently you are rich now ;-)

I didn't mean to sound like I was complaining; I was just stuck in the mode of thinking about optimizations...

C


The obvious algorithm seems to be to look for the 1D maxima one way, then look again the other way.