Solved

SQL - Populate Column with Closest Lat/Lng From another Table

Posted on 2008-10-14
12
754 Views
Last Modified: 2010-04-21
I have a SQL Server (2000) table containing 120,000 addresses, each with latitude/longitude co-ordinates. I have several other tables for different places of interest - as an example, I'll use museums. Each record in museums also has latitude/longitude co-ordinates. Every night, I want to populate a column in the addresses table with the name of the closest museum by distance, as the crow flies. I'll be doing this with at least 10 columns so I need a fairly efficient method. I am stumped about the best way to implement this, but supposing it were as a user-defined function, it would look something like this:

UPDATE Addresses SET NearestMuseum = GetNearest('MuseumsTable', 'MuseumNameField', LatColumn, LngColumn)

Where it would be possible to use one user defined function that could accept the name of the table you want to query, the field you want to return and the Lat/Lng you want to find the nearest record to.

I have triggers set up on all of the tables to create columns ProximityX, ProximityY and ProximityZ, the purpose of which is to pre-calculate trigonometric functions so that spatial queries can be faster at runtime. I have a stored procedure that gets the closest n records to a given lat/lng which uses them. The code sample is an example of one of these triggers.

I need a good solution to this problem and quickly, so I'm assigning the maximum points I have at my disposal.
CREATE TRIGGER [ProximityColumnsUpdate_Addresses] ON GazetteerUser.Properties

FOR INSERT, UPDATE

AS
 

UPDATE Addresses

    SET ProximityX = (COS(RADIANS(i.Latitude)) * COS(RADIANS(i.Longitude))),

        ProximityY = (COS(RADIANS(i.Latitude)) * SIN(RADIANS(i.Longitude))),

        ProximityZ = (SIN(RADIANS(i.Latitude)))

    FROM inserted i LEFT JOIN Addresses a ON i.id = p.id

Open in new window

0
Comment
Question by:wwarby
  • 7
  • 5
12 Comments
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
do you plan to include a "limit" to the "closest"?
I mean, when you want to say "max=100km/miles", then you can add, on the update limit on the brute difference of Lat and Long.

now:
>UPDATE Addresses SET NearestMuseum = GetNearest('MuseumsTable', 'MuseumNameField', LatColumn, LngColumn)

means it would be some dynamic sql, which is not possible in functions.
so, you will need to create 1 function per source table...
0
 
LVL 1

Author Comment

by:wwarby
Comment Utility
Angellll,

There are two scenarios for the columns I need to populate and I have only articulated the type for which I don't care about the radius limit. The other type is:

UPDATE Addresses SET NearbyStreetworksCount = CountWithinRadius('StreetworksTable', LatColumn, LngColumn, 1 [mile])

where I want the field in the addresses to be populated with the count of records from the streetworks table that fall within a specified search radius. I figured if somebody could help me solve the first problem I'd be able to solve the second on my own, but judging by what you've said, this second scenario makes a difference to the overall solution.

If necessary I can live with a one function per source table solution although obviously I'd rather avoid that scenario. The other way I thought could work was with a stored procedure that loops through every record in the addresses table and does it's business.
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
the "distance" is something to care about when speaking performance.

that say, you are not going yet the sql 2008 way?
that version has the geometry data types built-in (including indexes on geometry data types), and your query would be a plain "1-liner" ...

so, as this is a new development, you should go that way, actually.
there has been plenty of efforts made for that data type, you should use it (instead of reinventing the wheel) ...
0
 
LVL 1

Author Comment

by:wwarby
Comment Utility
Angellll,

I could provide a distance if required but the datasets are all confined to a London borough of about 40 square miles and in most cases I'm talking very small data sets. We have 2 museums, 13 libraries about 20 parks... you get the idea. The number of addresses is the problem - 120,000 of them.

There's a good reason why we're not going the 2008 route just yet. Corporately, we use MapInfo for GIS, and MapInfo Professional 9.5 only supports SQL Server 2008's new geometry data types in read-only mode. Since we have to have users maintaining this information in a desktop GIS, we can't go there yet. Also, I have to go live with this project in November. There's no way I could upgrade our main SQL Server in time.

I can either find a solution in T-SQL or I can do it with MapInfo MapXtreme 2008 which I am using for other purposes in my data aggregation program that runs on a nightly schedule. I've spent about 16 hours flipping between both approaches so far and come up completely empty handed.
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
ok. the syntax would be this:
UPDATE Addresses 

   SET NearestMuseum = ( SELECT TOP 1 m.MuseumNameField

               FROM MuseumsTable m

               ORDER BY <expression_for_distance> DESC

         )

Open in new window

0
 
LVL 1

Author Comment

by:wwarby
Comment Utility
Angellll,

Thanks, but I've kind of got that far on my own. I tried something very similar to this but couldn't work out exactly what to put in <expression_for_distance>. I'll have another go at working out <expression_for_distance> this evening.
0
IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
so, you just need the formula for the distance?
see here:
http://forums.microsoft.com/TechNet/ShowPost.aspx?PostID=2671587&SiteID=17

create function dbo.Distance( @lat1 float , @long1 float , @lat2 float , @long2 float)

returns float
 

as
 

begin
 

declare @DegToRad as float

declare @Ans as float

declare @Miles as float
 

set @DegToRad = 57.29577951

set @Ans = 0

set @Miles = 0
 

if @lat1 is null or @lat1 = 0 or @long1 is null or @long1 = 0 or @lat2 is

null or @lat2 = 0 or @long2 is null or @long2 = 0
 

begin
 

return ( @Miles )
 

end
 

set @Ans = SIN(@lat1 / @DegToRad) * SIN(@lat2 / @DegToRad) + COS(@lat1 / @DegToRad ) * COS( @lat2 / @DegToRad ) * COS(ABS(@long2 - @long1 )/@DegToRad)
 

set @Miles = 3959 * ATAN(SQRT(1 - SQUARE(@Ans)) / @Ans)
 

set @Miles = CEILING(@Miles)
 

return ( @Miles )
 

end

Open in new window

0
 
LVL 142

Accepted Solution

by:
Guy Hengel [angelIII / a3] earned 500 total points
Comment Utility
and the UPDATE becomes:
UPDATE a

   SET NearestMuseum = ( SELECT TOP 1 m.MuseumNameField

               FROM MuseumsTable m

               ORDER BY dbo.Distance( a.latitude, a.longitude, m.latitude , m.longitude ) DESC

         )

FROM Addresses a

Open in new window

0
 
LVL 1

Author Comment

by:wwarby
Comment Utility
Angellll,

Thanks very much for your efforts so far. I've implemented the code you provided and on the face of it, everything seemed to work. However, something isn't right. I have 120,000 addresses that cover an area of some 40 square miles and dotted around inside that area are 13 libraries. I ran the code against my libraries table to populate a "NearestLibrary" field but when I then do "SELECT DISTINCT NearestLibrary FROM Addresses", there are only three records. There should be 13 because every library is surrounded by residential properties.

I need to work out where the error is which I will do tomorrow because it's getting late in England. I'll use MapInfo Professional to check the locations of my Lat/Lng values in both tables and perform some spot checks on the distance calculations. I'll get back to you with the results of these tests.
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
run this query to see the output of the distances:
SELECT a.*, m.*, dbo.Distance( a.latitude, a.longitude, m.latitude , m.longitude ) distance

FROM MuseumsTable m

CROSS JOIN Addresses a

ORDER BY m.MuseumNameField

Open in new window

0
 
LVL 1

Author Closing Comment

by:wwarby
Comment Utility
I figured it out. the DESC keyword in your ORDER BY clause shouldn't have been there. I was finding the FURTHEST rather than the NEAREST library in each case. Once I removed that keyword it worked flawlessly. Incidentally, when run against 13 libraries on 120,000 properties it takes less than 90 seconds. Another approach I was working on using MapXtreme was running into 6 hours!

Thanks ever so much for your help with this - you've got me out of a real bind!
0
 
LVL 142

Expert Comment

by:Guy Hengel [angelIII / a3]
Comment Utility
Grading Comments:
I figured it out. the DESC keyword in your ORDER BY clause shouldn't have been there. I was finding the FURTHEST rather than the NEAREST library in each case. Once I removed that keyword it worked flawlessly. Incidentally, when run against 13 libraries on 120,000 properties it takes less than 90 seconds. Another approach I was working on using MapXtreme was running into 6 hours!

Thanks ever so much for your help with this - you've got me out of a real bind!



glad I could help. sorry for the DESC :...
0

Featured Post

What Security Threats Are You Missing?

Enhance your security with threat intelligence from the web. Get trending threat insights on hackers, exploits, and suspicious IP addresses delivered to your inbox with our free Cyber Daily.

Join & Write a Comment

Performance is the key factor for any successful data integration project, knowing the type of transformation that you’re using is the first step on optimizing the SSIS flow performance, by utilizing the correct transformation or the design alternat…
When you hear the word proxy, you may become apprehensive. This article will help you to understand Proxy and when it is useful. Let's talk Proxy for SQL Server. (Not in terms of Internet access.) Typically, you'll run into this type of problem w…
This video shows how to set up a shell script to accept a positional parameter when called, pass that to a SQL script, accept the output from the statement back and then manipulate it in the Shell.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function

762 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now