?
Solved

Closest Pair: Divide & Conquer

Posted on 2007-07-26
15
Medium Priority
?
5,053 Views
Last Modified: 2011-10-03
Ok, what I'm trying to do is take my function which is basically closest pair, and change it from the brute force method it currently is and turn it into a divide and conquer approach.  I'm having some issues understanding the descriptions of the problem I've found so far, so I was hoping to get some pseudo-code for c/c++ or possibly some code.  In the end this should get the minimum distance between any 2 points in an array of 3D points.  What I currently have works, but it's basically O( (N*(N+1))/2 ) and I've read with
divide and conquer it can be made into O( N*log(N) ).

I'm trying to write this in only C, so try to keep C++ stuff out for now.

typedef struct Vect // Vector struct for positions
{
      float x;
      float y;
      float z;
} Vect_t;

float findMinDistance_old(const int nPlayers, const Vect_t *playerArray)
{      // O( (N *(N + 1)) / 2) )
      Vect_t tmpVect;
      float tmpDist, minDist = 0.0f;
      int i, j;

      if(nPlayers <= 1) return 0; // error
      else if(nPlayers == 2)
      {
            // need this check because my loops won't work for 2 players
            tmpVect.x = playerArray[0].x - playerArray[1].x;
            tmpVect.y = playerArray[0].y - playerArray[1].y;
            tmpVect.z = playerArray[0].z - playerArray[1].z;

            tmpDist = (tmpVect.x * tmpVect.x)
                              + (tmpVect.y * tmpVect.y)
                              + (tmpVect.z * tmpVect.z);

            return (float)sqrt(tmpDist);
      }

      for(i = nPlayers; --i; )
            for(j = i; --j; )
            {
                  tmpVect.x = playerArray[i].x - playerArray[j].x;
                  tmpVect.y = playerArray[i].y - playerArray[j].y;
                  tmpVect.z = playerArray[i].z - playerArray[j].z;

                  tmpDist = (tmpVect.x * tmpVect.x)
                                    + (tmpVect.y * tmpVect.y)
                                    + (tmpVect.z * tmpVect.z);

                  if(tmpDist < minDist || minDist == 0) minDist = tmpDist;
            }

      return (float)sqrt(minDist);
}
0
Comment
Question by:krusho
  • 8
  • 6
15 Comments
 
LVL 85

Expert Comment

by:ozo
ID: 19577098
0
 

Author Comment

by:krusho
ID: 19578002
Yeah, I've seen that link as well as a couple others.  I guess the part that confuses me most about the description on that page is, how do I determine where l should be?  and how do I adapt that to a 3D point?
0
 
LVL 3

Accepted Solution

by:
Kashra earned 2000 total points
ID: 19579483
There are a few things to note in the divide and conquer approach. The first is that this is, by definition, a recursive algorithm. You will be hacking the space into equally sized parts until you have only two points in each division and work your way up from there. Second, this only works well if you are able to divide your space relatively evenly each time.

If you expect your points to be clustered and not have an oddly shaped distribution, you can make some assumptions about where you draw your dividing plane (its a plane, not a line, because we're in 3D). You could, for example, just take the midpoint of the longest dimension and draw your plane through that. This is a crude solution, but also inexpensive.

For more information about how to divide points in 3D, you should read about Axis-Aligned-Bounding-Boxes (AABB for short) or their more advanced cousins, oriented-bounding-boxes (OBB). A lot of introductory 3D game programming material discusses these, as they're often used in collision detection (which is a very similar problem and uses a divide and conquer approach as well).

As for the recursion, the logic should look something like this:
DivideAndConquer:
    IF there are only two points, measure distance between them
    IF there is only one point, return 0
    IF there are more than two points:
        Draw dividing plane (using longest dimension or whatever approach you wish)
        Sort points into two bins, divided by plane
        Call DivideAndConquer twice to retrieve shortest distance between points in each bin
        Determine which of the two distances is shorter
        Search for any points that are within that distance of the dividng plane and compare distances between points that are from separate bins.
        Return either the shortest distance from the recursive calls or the shortest distance from the point comparisons between bins, whichever is smaller.

Of course, when you write the thing you can return as much or as little information as you need, and some amount of point sorting can speed things up when you're looking to draw your dividing line and searching for nearby points, but that's pretty well discussed in the link above.
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 

Author Comment

by:krusho
ID: 19583254
Ok, so I've been working on this I got it somewhat working for 2D, but it seems to crash when I give it any more than 4 players.  It's getting a stack issue.  Do you have any tips on fixing this and making my code match a little more of what you are saying?

float findDistance(const Vect_t *v1, const Vect_t *v2)
{
      Vect_t tmpVect;
      float tmpDist;

      tmpVect.x = v1->x - v2->x;
      tmpVect.y = v1->y - v2->y;
      tmpVect.z = v1->z - v2->z;

      tmpDist = (tmpVect.x * tmpVect.x)
                        + (tmpVect.y * tmpVect.y)
                        + (tmpVect.z * tmpVect.z);

      return tmpDist;
}

float findMinDistance(const int nPlayers, const Vect_t *playerArray)
{
      Vect_t tmpVect;
      int i, j;
      float tmpDist;
      float left, right, middle;
      //      Draw dividing plane (using longest dimension or whatever approach you wish)
      float divide = (playerArray[(nPlayers>>1)+1].x - playerArray[(nPlayers>>1)].x);

      switch(nPlayers)
      {
            case 0:
            case 1:
                  return 0;

            case 2:
                  tmpVect.x = playerArray[0].x - playerArray[1].x;
                  tmpVect.y = playerArray[0].y - playerArray[1].y;
                  tmpVect.z = playerArray[0].z - playerArray[1].z;

                  tmpDist = (tmpVect.x * tmpVect.x)
                                    + (tmpVect.y * tmpVect.y)
                                    + (tmpVect.z * tmpVect.z);

                  return (float)sqrt(tmpDist);

            default:                        
                  //      Call DivideAndConquer twice to retrieve shortest distance between points in each bin
                  left = DivideAndConquer(0, (nPlayers >> 1), playerArray);
                  right = DivideAndConquer((nPlayers >> 1), nPlayers, playerArray);

                  //      Determine which of the two distances is shorter
                  if(left < right)
                        tmpDist = left;
                  else
                        tmpDist = right;

                  //      Search for any points that are within that distance of the dividng plane and compare distances between points that are from separate bins.
                  for(i = 0; i < nPlayers; i++)
                  {
                        if(playerArray[i].x < divide)
                        {
                              for(j = 0; j < (nPlayers>>1); j++)
                              {
                                    middle = findDistance(&playerArray[i], &playerArray[j]);
                                    if(tmpDist > middle && middle != 0)
                                          tmpDist = middle;
                              }
                        }
                        else
                        {
                              for(j = (nPlayers>>1); j < nPlayers; j++)
                              {
                                    middle = findDistance(&playerArray[i], &playerArray[j]);
                                    if(tmpDist > middle && middle != 0)
                                          tmpDist = middle;
                              }
                        }
                  }

                  return (float)sqrt(tmpDist);
                  break;
      }
}

float DivideAndConquer(const int start, const int end, const Vect_t *playerArray)
{
      Vect_t tmpVect;
      float tmpDist, minDist = 0.0f;
      float cut, left, right, dist, dist2;
      int i, j, nPlayers = (end - start);

      switch(nPlayers)
      {
            case 0:
            case 1:
                  //IF there is only one point, return 0
                  return 0;

            case 2:
                  //IF there are only two points, measure distance between them
                  tmpVect.x = playerArray[start].x - playerArray[end].x;
                  tmpVect.y = playerArray[start].y - playerArray[end].y;
                  tmpVect.z = playerArray[start].z - playerArray[end].z;

                  tmpDist = (tmpVect.x * tmpVect.x)
                                    + (tmpVect.y * tmpVect.y)
                                    + (tmpVect.z * tmpVect.z);

                  return tmpDist;

            default:
                  //IF there are more than two points:

                  //      Draw dividing plane (using longest dimension or whatever approach you wish)
                  cut = (playerArray[start + (nPlayers>>1) + 1].x - playerArray[start + (nPlayers>>1)].x);

                  //      Call DivideAndConquer twice to retrieve shortest distance between points in each bin
                  left = DivideAndConquer(start, (nPlayers>>1), playerArray);
                  right = DivideAndConquer((nPlayers>>1), end, playerArray);

                  //      Determine which of the two distances is shorter
                  if(left < right)
                        dist = left;
                  else
                        dist = right;

                  //      Search for any points that are within that distance of the dividng plane and compare distances between points that are from separate bins.
                  for(i = start; i < end; i++)
                  {
                        if(playerArray[i].x < cut)
                        {
                              for(j = start; j < (nPlayers>>1); j++)
                              {
                                    dist2 = findDistance(&playerArray[i], &playerArray[j]);
                                    if(dist > dist2 && dist2 != 0)
                                          dist = dist2;
                              }
                        }
                        else
                        {
                              for(j = (nPlayers>>1); j < end; j++)
                              {
                                    dist2 = findDistance(&playerArray[i], &playerArray[j]);
                                    if(dist > dist2 && dist2 != 0)
                                          dist = dist2;
                              }
                        }
                  }

                  return dist;
                  break;
      }
}
0
 
LVL 3

Assisted Solution

by:Kashra
Kashra earned 2000 total points
ID: 19594188
A few comments about recursion, looking at your code.

1. findMinDistance and DivideAndConquer have much the same logic. You don't need two functions for the recursion, just one.

2. You have to be careful to trap your "return 0" cases, as you move back up your recursion tree. As it stands, if you end up with a node with 3 points in it and split it into bins of 2 and 1, then one call will return the distance between 2 points and the other will return "0". However, since you don't trap for 0, you end up with the function comparing the distance and the 0 to see which is smaller, and the zero will always be. The desired behavior, actually, is for it to say "zero is unreasonable, that node must not have two points, so the minimum distance is either the first bin that reported a real distance, or some distance between the two bins".

3. Your current method of sorting points into bins ONLY works if the points are already sorted by the coordinate. If you just cut the array in half, and you haven't done any sorting already, there's no guarantee that the points are actually "in order", spatially. If they are sorted, then you know that everything on one half is spatially separated from everything on the other half, and things are fine. But if they aren't, I would expect trouble.

4. The point of doing all this work is so that you don't have to call the "findDistance" function on -every- combination of points, because that would boil down to the brute force method. In your innermost loop, for when you're trying to find points that are within distance of the dividing plane, you end up just calling findDistance on all the point pairs. What you want to do is compare x values or something simple to determine if the points are close enough to be relevant, and THEN call findDistance only if they are. The hope is that you'll eliminate as many calls to findDistance as possible, as this is the calculation that scales poorly with many points.

Hope this helps!
0
 

Author Comment

by:krusho
ID: 19596831
Ok, I handled 1 & 2 with this code.  3 is fine because as you can see I sort it in the x myself.  I still am not sure if I am doing your 4th point the best way, if you could help with that.  And also, it seems when I run my function with more than 4 players it causes it to crash because somehow it keeps getting a negative number of players.  it has something to do with the (nPlayers>>1) that I use for end in my function, any ideas?

float findDistance(const Vect_t *v1, const Vect_t *v2)
{
      return (((v1->x - v2->x) * (v1->x - v2->x)) + ((v1->y - v2->y) * (v1->y - v2->y)) + ((v1->z - v2->z) * (v1->z - v2->z)));
}

float findMinDistance(const int nPlayers, const Vect_t *playerArray)
{
      Vect_t* playerArray_sorted = malloc(sizeof(Vect_t) * nPlayers);

      quickSort_x(playerArray, nPlayers, playerArray_sorted);
      
      return (float)sqrt(DivideAndConquer(playerArray_sorted, 0, nPlayers));
}

float DivideAndConquer(const Vect_t *playerArray, const int start, const int end)
{
      Vect_t cut;
      float left, right, dist;
      int i, j;
      int nPlayers = (end - start);

      switch(nPlayers)
      {
            case 0:
            case 1:
                  //IF there is only one point, return 0
                  return 0;

            case 2:
                  //IF there are only two points, measure distance between them
                  return findDistance(&playerArray[start], &playerArray[(end - 1)]);

            default:
                  //IF there are more than two points:

                  //      Draw dividing plane (using longest dimension or whatever approach you wish)
                  cut.x = playerArray[(nPlayers>>1)].x;
                  cut.y = 0;
                  cut.z = 0;

                  //      Call DivideAndConquer twice to retrieve shortest distance between points in each bin
                  left = DivideAndConquer(playerArray, start, (nPlayers>>1));
                  right = DivideAndConquer(playerArray, (nPlayers>>1), end);

                  //      Determine which of the two distances is shorter
                  if(left < right && left != 0)
                        dist = left;
                  else
                        dist = right;

                  //      Search for any points that are within that distance of the dividng plane and compare distances between points that are from separate bins.
                  for(i = start; i < end; i++)
                  {
                        cut.y = playerArray[i].y;
                        cut.z = 0; // temp

                        if( (playerArray[i].x <= cut.x) && (findDistance(&playerArray[i], &cut) < dist) )
                        {
                              for(j = start; j < end; j++)
                              {
                                    if( (playerArray[j].x >= cut.x) && (findDistance(&playerArray[j], &cut) < dist) )
                                    {
                                          left = findDistance(&playerArray[i], &playerArray[j]);
                                          if(left < dist && left != 0)
                                                dist = left;
                                    }
                              }
                        }
                        else// if( (playerArray[i].x >= cut.x) && (findDistance(&playerArray[i], &cut) < dist) )
                        {
                              for(j = start; j < end; j++)
                              {
                                    if( (playerArray[j].x <= cut.x) && (findDistance(&playerArray[j], &cut) < dist) )
                                    {
                                          right = findDistance(&playerArray[i], &playerArray[j]);
                                          if(right < dist && right != 0)
                                                dist = right;
                                    }
                              }
                        }
                  }

                  return dist;
                  break;
      }
}
0
 
LVL 3

Assisted Solution

by:Kashra
Kashra earned 2000 total points
ID: 19597110
To help minimize calls to findDistance:

if( (playerArray[i].x <= cut.x) && (findDistance(&playerArray[i], &cut) < dist) )

Its not necessary to check the distance between a point and the "cut" point. All that's really necessary is to make sure that point i is within dist of the dividing plane. All points that are within dist of the plane should be between cut.x - sqrt(dist) and cut.x + sqrt(dist)

That boils the logic down to:
// Calculate once per function call
float xsqrd = cut.x * cut.x;
float minx = xsqrd - dist;
float maxx = xsqrd + dist;

// Calculate once per point
float isqrd = playerArray[i].x * playerArray[i].x;
float jsqrd = 0, tempDist = 0; // used later
if ( isqrd >= minx  &&  isqrd < xsqrd ){
    // This point is on one half of the dividing plane, and within dist of it. We have to compare it to points only on the OTHER side of the dividing plane.
    for (j = start; j < end; j++){
        jsqrd = playerArray[j].x * playerArray[j].x;
        if ( jsqrd >= xsqrd  &&  jsqrd <= maxx ){
            // This point is on the OTHER half of the diving plane, and within dist of it. It is eligible for comparison, so we have to compute the full distance.
            tempDist = findDistance(&playerArray[i], &playerArray[j]);
            if (tempDist < dist) // don't have to check for 0 as findDistance doesn't return 0 on error
                dist = tempDist;
        }
        // Don't have to check any points that are on the same side as i (those were already checked by the recursion). Also don't have to check any points that are more than dist away from the plane, on the other side. So no need for an "else" here.
    }
    // Don't need an "else" here, either. No point in comparing back across the diving plane, as all the combinations should have been covered in the forward case.
}
// Don't need an "else" here, because if its not within dist of the plane, it has no chance of being the shortest distance.


As for the problem with adding more than 4 players, I can't find anything wrong, at a glance. Have you tried replacing your bitshift operations with simple division? Bitshifting is a level of optimization I wouldn't recommend until you're completely sure the rest of the code works.
0
 
LVL 3

Expert Comment

by:Kashra
ID: 19597121
Ugh, I was trying to be quick and dirty with the math, but as a side note, comparing the squared values doesn't actually work unless the actual values are all positive. Its still possible to do this "correctly", but you might have to take a square root of the dist once per function instead of squaring the x values.
0
 

Author Comment

by:krusho
ID: 19647703
I'm still working on this I just got busy with a couple other things.  I won't forget to give out my points when it's working.  I'll post some new code later this week.
0
 

Author Comment

by:krusho
ID: 19650051
Ok, I fixed the crash and it correctly finds the minimum distance.  The question is this the O( N log(N) ) ?
0
 

Author Comment

by:krusho
ID: 19650054
forgot to post the code

float findDistance(const Vect_t *v1, const Vect_t *v2)
{
      return (((v1->x - v2->x) * (v1->x - v2->x)) + ((v1->y - v2->y) * (v1->y - v2->y)) + ((v1->z - v2->z) * (v1->z - v2->z)));
}

float findMinDistance(const int nPlayers, const Vect_t *playerArray)
{
      Vect_t* playerArray_sorted = malloc(sizeof(Vect_t) * nPlayers);

      quickSort_x(playerArray, nPlayers, playerArray_sorted);
      
      return (float)sqrt(DivideAndConquer(playerArray_sorted, 0, nPlayers));
}

float DivideAndConquer(const Vect_t *playerArray, const int start, const int end)
{
      Vect_t cut;
      float left = 0, right = 0, dist = 0;
      int i, j;
      int nPlayers;

      if(start > end)
            nPlayers = 0;
      else
            nPlayers = (end - start);

      switch(nPlayers)
      {
            case 0:
            case 1:
                  //IF there is only one point, return 0
                  return 0;

            case 2:
                  //IF there are only two points, measure distance between them
                  return findDistance(&playerArray[start], &playerArray[(end - 1)]);

            default:
                  //IF there are more than two points:

                  //      Draw dividing plane (using longest dimension or whatever approach you wish)
                  cut.x = playerArray[(start + (nPlayers / 2))].x;
                  cut.y = 0;
                  cut.z = 0;

                  //      Call DivideAndConquer twice to retrieve shortest distance between points in each bin
                  left = DivideAndConquer(playerArray, start, (start + (nPlayers / 2)));
                  right = DivideAndConquer(playerArray, (start + (nPlayers / 2)), end);

                  //      Determine which of the two distances is shorter
                  if(left < right && left != 0)
                        dist = left;
                  else
                        dist = right;

                  //      Search for any points that are within that distance of the dividng plane and compare distances between points that are from separate bins.
                  for(i = start; i < end; i++)
                  {
                        cut.y = playerArray[i].y;
                        cut.z = 0; // temp

                        if( (playerArray[i].x <= cut.x) && (findDistance(&playerArray[i], &cut) < dist) )
                        {
                              for(j = start; j < end; j++)
                              {
                                    if( (playerArray[j].x >= cut.x) && (findDistance(&playerArray[j], &cut) < dist) )
                                    {
                                          left = findDistance(&playerArray[i], &playerArray[j]);
                                          if(left < dist && left != 0)
                                                dist = left;
                                    }
                              }
                        }
                        else if( (playerArray[i].x >= cut.x) && (findDistance(&playerArray[i], &cut) < dist) )
                        {
                              for(j = start; j < end; j++)
                              {
                                    if( (playerArray[j].x <= cut.x) && (findDistance(&playerArray[j], &cut) < dist) )
                                    {
                                          right = findDistance(&playerArray[i], &playerArray[j]);
                                          if(right < dist && right != 0)
                                                dist = right;
                                    }
                              }
                        }
                  }

                  return dist;
                  break;
      }
}
0
 
LVL 3

Assisted Solution

by:Kashra
Kashra earned 2000 total points
ID: 19658232
You can still make things simpler. The idea is to reduce the calls to findDistance to ONLY the points you want to compare:

if( (playerArray[i].x <= cut.x) && (findDistance(&playerArray[i], &cut) < dist) )

can become

float idist = cut.x - playerArray[i].x;
if( (playerArray[i].x <= cut.x) && ( (idist * idist) < dist) )

You can do the same with the comparison for j. Mathematically, this is the same as what you did in your code, except it doesn't waste time with the other coordinates that you know don't matter.

Also, since you know you've sorted the points by the x axis, you also know that if point "i" is less than cut.x, no point before point "i" could be greater than cut.x. So your inner loop of point "j" can start at i+1 instead of the start of the vector.

Finally, you're doing twice as much work as you need to. The whole portion:

else if( (playerArray[i].x >= cut.x) && (findDistance(&playerArray[i], &cut) < dist) )
{ ... }

Is unnecessary, because you've already compared every possible pair in your first loop. If I have 5 points on one side of the plane and 3 points on the other, the first loop compares each of the five points to each of the 3. The second loop would compare each of the 3 points to each of the 5, which should produce the exact same distances, in reverse order. Twice the calculation for no gain.
0
 

Author Comment

by:krusho
ID: 19658648
Appreciate all the help, I'm guessing now it should be in O( N log( N ) ).  What would be the steps to convert this to handle 3D points, I know you were talking about Axis-Aligned Bounding Boxes (AABB), but not exactly sure the steps to getting this working.

                  for(i = start; i < end; i++)
                  {
                        cut.y = playerArray[i].y;
                        cut.z = 0; // temp

                        idist = (cut.x - playerArray[i].x);
                        if( (playerArray[i].x <= cut.x) && ((idist * idist) < dist) )
                        {
                              for(j = (i + 1); j < end; j++)
                              {
                                    idist = (playerArray[j].x - cut.x);
                                    if( (playerArray[j].x >= cut.x) && ((idist * idist) < dist) )
                                    {
                                          left = findDistance(&playerArray[i], &playerArray[j]);
                                          if(left < dist && left != 0)
                                                dist = left;
                                    }
                              }
                        }
                  }
0
 
LVL 3

Expert Comment

by:Kashra
ID: 19658868
By my understanding, these are 3D points. You have x, y, and z. Right now, it should work for 3D points.

The only thing to do now is to get "smarter" about where you decide to draw your dividing plane. As it stands, you're just arbitrarily choosing the X axis and drawing a plane through the center point. You could be a little smarter by just finding which axis is the longest (X, Y or Z) and using that one, instead.

Or you could be even more advanced and start drawing AABB around your points. That's a level of complication I'd rather not get into, here. Its been explained elsewhere and its not necessarily going to help your speed. Its a question of trading one set of calculations for another. If your points are strangely distributed, drawing AABB might help reduce the number of recursions you have to take, at the expense of doing all the math to figure out how to draw the AABB. If your points aren't strangely shaped, then drawing AABB's probably won't make much of a difference.
0
 

Author Comment

by:krusho
ID: 19658975
Oh you're right, I was just testing it in 2D because I wasn't sure if it would work in 3D yet.  Appreciate the help.
0

Featured Post

Free Tool: IP Lookup

Get more info about an IP address or domain name, such as organization, abuse contacts and geolocation.

One of a set of tools we are providing to everyone as a way of saying thank you for being a part of the community.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Although it can be difficult to imagine, someday your child will have a career of his or her own. He or she will likely start a family, buy a home and start having their own children. So, while being a kid is still extremely important, it’s also …
Computer science students often experience many of the same frustrations when going through their engineering courses. This article presents seven tips I found useful when completing a bachelors and masters degree in computing which I believe may he…
Simple Linear Regression
Loops Section Overview

840 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question