Find k nearest restaurants to customers

Please help me out with following problem -
 Given a set of restaurants (the number being quite large) and its geographical location(x,y) , you are allowed to do an significant amount of pre-processing on it. Now suppose there are x customers located at position (s,t), design an efficient algorithm to find the k nearest restaurants to these customers.

<<This question (but not the answer) may be found at It may also be available elsewhere on the internet, >>    Edited by byundt 12 September 2014 so thread complies with site rules on proper attribution, as well as the Creative Commons rules followed by the web site.

Rohit BajajAsked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Randy PooleCommented:
First what is nearest?  I assume you mean a specified distance?  Also you want this done per customer? which would make the only sense..
Is this an academic assignment?  
The sort of help you can get depends on the answer.

You might want to break you map in regions (states or grid squares for example), and then keep track of adjacent/nearby states or sqares.
Then when you get a request, you would only have to calculate accurate distances for a subset of the restaurants.
You might build a Voronoi diagram.
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

Rohit BajajAuthor Commented:
No its a algorithm question i found one some site.
The locations are coordinates on 2D plane. So by nearest i mean the mathematical distance between two coordinates.
So a naive approach will be to find the distance between each customer and restaurant
But this is a brute force approach , i want a better way to do it.
Randy PooleCommented:
Normally you would have a max distance to do your calculations to narrow your search, otherwise you will be calculating every restaurant.  
Grab your customer x,y (normally lat/long)
Lets say the distance is 10 miles (dtr)
Just calculate what 10 miles is for xMin and xMax as well yMin and yMax
Lets say 1 mile=5 steps (ms)

Now just return any restaurants where their x is between xMin and xMax and y is between yMin and yMax
What is the value for k?    Is it known and constant?  Does the order of the k restaurants matter?

With significant preprocessing, you could make a map (like ozo's Voroni diagram) so that each region of the map is associated with the same k restaurants.  This would reduce a customer request to a look-up table.
in 2-d, you can build a k-nearest neighbor Voronoi diagram in O(k^2 N log N) time, and search it in O(k+log N) time
You can exclude the longitude and latitude based on distance so say you are at x and you only want items within x+50 yards.  You can add that distance to your longitude and latitude in all directions and exclude all others that fall out of the range.

So it would be anywhere where x+50 and x-50, same for y.  Then I'm not sure how your data is laid out, but you would search any items that falls within x first then y or both, depending if it is a sql server.

Something like that?
Rohit BajajAuthor Commented:
HI ozo
in 2-d, you can build a k-nearest neighbor Voronoi diagram in O(k^2 N log N) time, and search it in O(k+log N) time ?

Can you give me a reference where its explained to build a k-nearest neighbor voronoi diagram ?
I am not familiar with voronoi diagrams.
No its a algorithm question i found one some site.
I would say, by that the author has contradicted that the question is of academic type.

if I am right, I would like to post the description of an algorithm based on a suggestion d-glitch has made.

I added a link to the question so it is properly attributed in compliance with site rules.

I did so somewhat selfishly, because I think this is an interesting problem, and am looking forward to a good discussion. This kind of problem comes up very frequently on web pages, because a customer always wants to know where are the closest places to buy a product. Since I am frequently asked to choose the state or territory I live in, I have concluded that most web developers are not using a very good way of solving the problem.

That said, I looked up Voronoi diagrams as mentioned by ozo and find them to be an interesting approach. I don't know how to create one, but if one were available it would definitively answer the question of where the closest restaurant is located. You would also want to have available a list of Voronoi territories that are contiguous with each one of the restaurants.

After doing some more thinking about Voronoi diagrams, I realized that the kth closest restaurant must be in a Voronoi territory contiguous with one or more of the k-1 Voronoi territories previously found. This realization greatly reduces the solution space, as each new solution will lie on a spiral of Voronoi territories surrounding the first one. The proof for my assertion involves drawing a line from the customer to a restaurant in a Voronoi territory that is not contiguous with previous solutions. It can be shown that such a line will always pass through a Voronoi territory that is contiguous with previous solutions--and that the restaurant in the Voronoi territory so traversed will always be closer to the customer than the one in the non-contiguous territory.

When you find the k-1 th closest Voronoi territory, you add its contiguous Voronoi territories to the solutions space. Since there are only a handful of such territories, you can calculate the distance between each newly added restaurant and the customer. These distances will likely sort as more distant from the customer than the previously identified contiguous Voronoi territories, so it will be pretty easy to identify the kth closest one in a previously sorted list.

The diameter of the solution cloud is proportional to the square root of the number of previously identified solutions. The solution space for a new territory is proportional to the circumference of that cloud, so the solution space increases in size as the square root of the number of previous solutions.

All that said, I am now going to be quiet in hopes of learning something from ozo, d-glitch, sarabande and others, who clearly are approaching the problem from a higher level.
unfortunately I don't have experiences with voronoi diagrams. I am also not sure whether those territories could help to find the exact solution as the territories are different in size and shape.

but, using a grid instead of Voronoi as suggested by d-glitch easily could be calculated and most of your thoughts can be applied with a grid as well:

- find out min and max horizontal and vertical coordinates of the restaurants.
- divide the rectangle spanned by (xmin,ymin), (xmin,ymax), (xmax,ymax), (xmax,ymin) into a nxm grid (like it is on a chessboard with cells labeled A1, ..., H8 or on a map). the number of rows and columns might depend on the number of restaurants, but you also could start with 8x8 to make it simple). if the distribution of the restaurants on the map is not evenly distributed you may increase the number of cells such that the maximum number of restaurants per cell is below a given threshold (for example less than 2k).

- determine for each restaurant which cell it belongs to and create a list of restaurants for each cell (for example by using an array of lists).
- determine a circle around the customer. you may start with a radius that is 1.5 times as long as the diagonal of a cell.
- determine which cells would intersect with the circle.
- calculate the total number r of restaurants of these cells which are included in the circle as well.
- check whether r is between k and 2k. if it is less, you would increase the radius of the customer's circle  (for example by 1/4) until r is within the range (k, 2k), if it is greater, you would decrease the radius accordingly.
the restaurants of these cells would be a superset of the restaurants you look for.

- you may now order the restaurants of the found cells by their absolute distance to the customer's location to determine the final solution.

if the map is small or the number of restaurants is small (say less than 100), the time needed for the "brute force" approach would be not measurable on modern hardware.

some more considerations:

the linear distance between customer and restaurant rarely will the best measure for a customer who looks for the nearest places for restaurants. travel times probably would be much better. however it surely is not easy to get reliable travel times between two arbitrary points on the map at runtime. a possible method for a known map is to using a fixed number of nodal points like junctions where you have a complete table of the travel distances between each pair. the problem then could be modified to find first the nearest nodes to the customer (using linear distance), then find other near nodes by using the travel time table, and finally check the restaurants near to any of these nodes (using linear distance to the node which it is nearest to). from a programmer's point of view I would say the best method to solve the problem is to using a tree of (node) points and (leaf) points where the customer is the root node and restaurants are the leafs and the nodes are chosen such that the distance from root to leaf is minimal.

There are many grid methods that can be used for k-nearest problems,
such as quad-trees, R-trees, cover trees, etc...
and for each of them there are various ways to pre-process the tree to optimize searching it.
Any of which could make an interesting discussion, but with such an overwhelming number of possibilities
it seems difficult to recommend one over another.
So I'd be inclined to favor a method that solves the problem directly, with a well defined optimal structure,
which is why I suggested Voronoi diagrams.

I'd suggest looking up
Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases
Mohammad Kolahdouzan and Cyrus Shahabi, 2004
On k-Nearest Neighbor Voronoi Diagrams in the Plane
Der-Tsai Lee, 1982

These can also be used with different distance metrics other than Pythagorean distance.
It could get complicated to try to account for local road travel times under different traffic conditions,
but you did say we are allowed to do an significant amount of pre-processing.

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.