Calculating distance using lat long

We have nearly 30,000 records in our table, each record has associated lat long columns.

We run a query to return the top 300 records which are nearest to a specfied lat long.

The function we use is fn_Haversine which is attached

With more and more records the query is becoming slower and slower, ideally we would like to be able to metaphorically speaking, draw a box around a group of records which are outside of say 100miles.
In other words we don't want to pass every single record (30,000) through the haversine function just so we can select the top 300 closest.

What can we do?

Help much appreciated
ALTER function [dbo].[udf_Haversine](@lat1 float, @long1 float,  
                   @lat2 float, @long2 float) 
RETURNS FLOAT 
BEGIN 
    
	DECLARE @dlon float, @dlat float, @rlat1 float,  
                 @rlat2 float, @rlong1 float, @rlong2 float,  
                 @a float, @c float, @R float, @d float, @DtoR float 
 
IF @lat1 IS NULL OR @long1 IS NULL 
BEGIN
	SET @d = 30000
	
END
ELSE BEGIN

    select @DtoR = 0.017453293 
    select @R = 3959      -- Earth radius 
 
    select  
        @rlat1 = @lat1 * @DtoR, 
        @rlong1 = @long1 * @DtoR, 
        @rlat2 = @lat2 * @DtoR, 
        @rlong2 = @long2 * @DtoR 
 
    select  
        @dlon = @rlong1 - @rlong2, 
        @dlat = @rlat1 - @rlat2 
 
    select @a = power(sin(@dlat/2), 2) + cos(@rlat1) *  
                     cos(@rlat2) * power(sin(@dlon/2), 2) 
    select @c = 2 * atn2(sqrt(@a), sqrt(1-@a)) 
    select @d = @R * @c 
 END
    return @d  


end

Open in new window

LVL 9
Type25Asked:
Who is Participating?

[Product update] Infrastructure Analysis Tool is now available with Business Accounts.Learn More

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Guy Hengel [angelIII / a3]Billing EngineerCommented:
you filter the records first based on the lat +- max distance and on long +- max distance.


declare @max_distance float
declare @source_lat float
declare @source_long float

set @max_distance = 100
set @source_lat = 50
set @source_long = 140


select *
  from yourtable
 where lat >= @source_lat - @max_distance
   and lat <= @source_lat + @max_distance
   and long >= @source_long - @max_distance
   and long <= @source_long + @max_distance
   and [dbo].[udf_Haversine](lat, long, @source_lat, @source_long) <= @max_distance
order by [dbo].[udf_Haversine](lat, long, @source_lat, @source_long)  desc

Open in new window

Type25Author Commented:
Makes sense but once i put those changes in it just hangs...

Our full query is attached

--Set the weights and heights and strip off the extra data sent by the user--
	SET @minDateOfBirth = dateadd(yy, -CAST(@minAge as INT), getdate())
	SET	@maxDateOfBirth = dateadd(yy, -CAST(@maxAge as INT), getdate())
	SET @minimumWeight = SUBSTRING(@minWeight, 0, CHARINDEX(' ', @minWeight, 0)) 
	SET @maximumWeight = SUBSTRING(@maxWeight, 0, CHARINDEX(' ', @maxWeight, 0)) 
	SET @minimumHeight = SUBSTRING(@minHeight, 0, CHARINDEX(' ', @minHeight, 0)) 
	SET @maximumHeight = SUBSTRING(@maxHeight, 0, CHARINDEX(' ', @maxHeight, 0))

	SELECT @ProfileID = ProfileID, @lat = ISNULL(latitude,'0'), @long = ISNULL(longitude,'0') FROM Profiles WHERE Email = @MyUsername OR Username = @MyUsername
	UPDATE Profiles SET [Online] = 1 WHERE Username = @MyUsername
	
	IF @Latitude <> '' BEGIN
		SET @lat = @Latitude
		SET @long = @Longitude
	END

	declare @max_distance int
	set @max_distance = 200
	

SELECT pp.ProfileID, pp.FirstName,pp.LastName,pp.Username,
		CASE WHEN pp.DOB = '1900-01-01 00:00:00.000' THEN NULL ELSE pp.DOB END DOB,
	    pp.Email, pp.HomeTown, pp.AboutMe, pp.Interests, pp.Status,pp.MainProfilePicPath, 
		CASE WHEN pp.Weight IS NOT NULL THEN CAST(pp.Weight as varchar) + ' kg'  ELSE NULL END Weight, 
		CASE WHEN pp.Height IS NOT NULL THEN CAST(pp.Height as varchar) + ' cm'  ELSE NULL END Height,
		pp.DiscloseDOB, pp.DiscloseWeight, pp.DiscloseHeight,
		pp.Ethnicity, dbo.udf_Haversine(pp.latitude, pp.longitude, @lat, @long) Distance , FLOOR( DATEDIFF(DAY, pp.DOB, getdate()) / 365.25) Age, 
		pp.Online, 'friend' Friend,
		pp.Longitude, pp.Latitude, ISNULL(pp.ShowOnMap,1) ShowOnMap,
		dbo.ReturnPicturePaths(pp.ProfileID) Pictures,
		dbo.LastOnlineStr(DATEDIFF(SECOND,pp.LastOnline,GETDATE())) LastOnline, ISNULL(pp.VIP,0) VIP
		
		FROM FriendsList f INNER JOIN Profiles p ON f.ProfileID = p.ProfileID JOIN
		Profiles pp ON f.FriendProfileID = pp.ProfileID
		WHERE p.ProfileID = @ProfileID AND @Username = ''
		
UNION ALL 

SELECT DISTINCT  ProfileID, FirstName,LastName,Username,
		CASE WHEN DOB = '1900-01-01 00:00:00.000' THEN NULL ELSE DOB END DOB,
	  Email, HomeTown, AboutMe, Interests, Status,MainProfilePicPath, 
		CASE WHEN Weight IS NOT NULL THEN CAST(Weight as varchar) + ' kg' ELSE NULL END Weight, 
		CASE WHEN Height IS NOT NULL THEN CAST(Height as varchar) + ' cm' ELSE NULL END Height,
		DiscloseDOB, DiscloseWeight, DiscloseHeight,
		Ethnicity, Distance, Age, Online, 'no friend' as Friend, Longitude, Latitude, ShowOnMap,
		dbo.ReturnPicturePaths(ProfileID) Pictures,
		dbo.LastOnlineStr(DATEDIFF(SECOND,LastOnline,GETDATE())) LastOnline, VIP
FROM
(
	
SELECT TOP 300 p.ProfileID, FirstName,LastName,Username,
		CASE WHEN DOB = '1900-01-01 00:00:00.000' THEN NULL ELSE DOB END DOB,
	    Email, HomeTown, AboutMe, Interests, Status,MainProfilePicPath, 
		CASE WHEN Weight IS NOT NULL THEN CAST(Weight as varchar)  ELSE NULL END Weight, 
		CASE WHEN Height IS NOT NULL THEN CAST(Height as varchar)  ELSE NULL END Height,
		DiscloseDOB, DiscloseWeight, DiscloseHeight,
		Ethnicity, dbo.udf_Haversine(latitude, longitude, @lat, @long) Distance , FLOOR( DATEDIFF(DAY, DOB, getdate()) / 365.25) Age, 
		Online, LastOnline,
		p.Longitude, p.Latitude, ISNULL(ShowOnMap,1) ShowOnMap, ISNULL(VIP,0) VIP
		
	FROM Profiles p
		 LEFT OUTER JOIN BlockUserList b ON b.BlockProfileID = p.ProfileID AND b.ProfileID = @ProfileID 
		 LEFT OUTER JOIN ProfilePictures pp ON p.ProfileID = pp.ProfileID
	WHERE 
		Latitude >= @lat - @max_distance
		and Latitude <= @lat + @max_distance
		and Longitude >= @long - @max_distance
		and Longitude <= @long + @max_distance AND
		dbo.udf_Haversine(latitude, longitude, @lat, @long) <= @max_distance AND
	    (@Keywords = '' OR (AboutMe LIKE '%' + @Keywords + '%' OR
               Interests LIKE '%' + @Keywords + '%')) AND 
	    (@ethnicity = 'Any' OR ethnicity = @ethnicity) AND 
		(@ShowOnline = 0 OR [Online] = 1) AND
		(@Username = '' OR Username = @Username) AND
		(@name = '' OR FirstName + ' ' + Lastname = @name) AND
		(@HomeTown = '' OR HomeTown = @HomeTown) AND
		(@email = '' OR Email = @email) AND
		(DOB IS NULL OR DATEDIFF(yy, ISNULL(DOB,DATEADD(yy,-21,GETDATE())), GETDATE()) BETWEEN  Cast(@minAge as int) and cast(@maxAge as int)) AND
		((@minimumWeight <= 40 AND @maximumWeight >= 200) OR (ISNULL(Weight,41) >= @minimumWeight AND ISNULL(Weight,41) <= @maximumWeight)) AND
		((@minimumHeight <= 99 AND @maximumHeight >= 250) OR (ISNULL(Height,251) >= @minimumHeight AND ISNULL(Height,251) <= @maximumHeight)) 
	    AND  b.BlockProfileID IS NULL AND p.Username != 'admin'
	    AND (@ShowPicsOnly = 0 OR pp.ProfileID IS NOT NULL)
	    AND p.ProfileID NOT IN (SELECT FriendProfileID FROM FriendsList WHERE ProfileID = @ProfileID AND @Username = '')
	    ORDER BY Distance ASC
		
) a
WHERE (a.Distance <= @Radius)
ORDER BY Friend, Distance ASC, [Online] desc, LastOnline DESC

Open in new window

Shaun KlineLead Software EngineerCommented:
I suggest doing a subquery against your Profiles table (line 61) to remove any non-matches before you do the left outer joins. This should eliminate some of the records prior to doing the distance calculation.

Also, since distance would be measured in miles/km/feet, it makes little sense to do the "Latitude >= @lat - @max_distance", as latitude should be in degrees.
10 Tips to Protect Your Business from Ransomware

Did you know that ransomware is the most widespread, destructive malware in the world today? It accounts for 39% of all security breaches, with ransomware gangsters projected to make $11.5B in profits from online extortion by 2019.

Guy Hengel [angelIII / a3]Billing EngineerCommented:
only a explain plan shall show the issues.
also, starting to "debug" with such a large query is usually problematic...

try to start smaller, to see as from which point the query goes "bad"
Type25Author Commented:
angellll - it runs instantly if we remove the haversine function.

So it's clearly the distance part which is taking the time to compute.
cyberkiwiCommented:
Hi there,

I notice that you have SQL Server 2005 and 2008 - are you using a db that needs to run on both?  On 2008 (not 2005), you can switch to the geospatial types which will allow you to index by lat-lng which will pretty much solve your problem once you start using the spatial queries.

http://www.microsoft.com/sqlserver/2008/en/us/spatial-data.aspx
"Build high-performance solutions with spatial data indexing"
Type25Author Commented:
Hi Shaun, it's a good idea but the default search is pretty much everyone ordered by top 300 by distance.

As you can see all the WHERE clauses are optional parameters and by default these are generally never filled out so it wouldn't eliminate many records.

How could we be a bit more clever with limiting the records by using some clever math on lat/long? If we limited it by country it would be good enough
Type25Author Commented:
Not actually heard of that before cyberkiwi, will have to check that out as we're using 2008
cyberkiwiCommented:
>> clever math on lat/long?
You would have to be very clever indeed because the dynamics of lat changes and lon changes as you move nearer the poles or equator.
Near the equator, 1 lon covers a huge distance but at the pole, 1 lon is nothing but 1 lat is still as far as it always is.

Here's a function I wrote using some free pre-referenced lat by lon values.
I hid the function and did not want to show my preset table, but you'd be best to store it in a table permanently.
Even if you continue to use the haversine method, see if this is faster - maybe it can help pre-filter to 3x the candidates required for the actual haversine calculations and top.
CREATE function [dbo].[fod] (
 @lat1 float,
 @lon1 float,
 @lat2 float,
 @lon2 float
) returns float as
begin

declare @ref table (latitude int, latdelta float, londelta float)
insert into @ref values (-90,111694.2,0)
insert into @ref values (-89,111693.599999999,1949.4)
insert into @ref values (-88,111692.399999999,3898.19999999999)
insert into @ref values (-87,111690.6,5845.8)
insert into @ref values (-86,111688.2,7791)
insert into @ref values (-85,111685.2,9734.4)
insert into @ref values (-84,111681.599999999,11674.8)
insert into @ref values (-83,111677.399999999,13611.6)
insert into @ref values (-82,111672,15543.6)
insert into @ref values (-81,111666.599999999,17471.4)
insert into @ref values (-80,111660,19393.2)
insert into @ref values (-79,111652.8,21309.6)
insert into @ref values (-78,111645,23219.4)
insert into @ref values (-77,111636.599999999,25121.4)
insert into @ref values (-76,111628.2,27015.5999999999)
insert into @ref values (-75,111618.599999999,28902)
insert into @ref values (-74,111608.4,30779.4)
insert into @ref values (-73,111597.6,32646.6)
insert into @ref values (-72,111586.2,34504.2)
insert into @ref values (-71,111574.2,36351)
insert into @ref values (-70,111562.2,38186.4)
insert into @ref values (-69,111549,40010.4)
insert into @ref values (-68,111535.8,41821.7999999999)
insert into @ref values (-67,111522,43620)
insert into @ref values (-66,111507.6,45405)
insert into @ref values (-65,111492.6,47175.5999999999)
insert into @ref values (-64,111477.6,48931.7999999999)
insert into @ref values (-63,111461.4,50673)
insert into @ref values (-62,111445.8,52398)
insert into @ref values (-61,111429,54107.3999999999)
insert into @ref values (-60,111412.2,55800)
insert into @ref values (-59,111395.399999999,57475.1999999999)
insert into @ref values (-58,111377.399999999,59133)
insert into @ref values (-57,111360,60772.1999999999)
insert into @ref values (-56,111342,62392.8)
insert into @ref values (-55,111323.4,63994.1999999999)
insert into @ref values (-54,111304.799999999,65575.8)
insert into @ref values (-53,111286.2,67137)
insert into @ref values (-52,111267.6,68677.8)
insert into @ref values (-51,111248.4,70197.6)
insert into @ref values (-50,111229.2,71695.8)
insert into @ref values (-49,111210,73171.8)
insert into @ref values (-48,111190.2,74625.6)
insert into @ref values (-47,111171,76056)
insert into @ref values (-46,111151.2,77463)
insert into @ref values (-45,111132,78846.5999999999)
insert into @ref values (-44,111112.2,80206.1999999999)
insert into @ref values (-43,111093,81541.1999999999)
insert into @ref values (-42,111073.2,82851)
insert into @ref values (-41,111054,84135)
insert into @ref values (-40,111034.799999999,85393.8)
insert into @ref values (-39,111015.6,86626.1999999999)
insert into @ref values (-38,110996.4,87832.1999999999)
insert into @ref values (-37,110977.8,89011.8)
insert into @ref values (-36,110959.2,90163.8)
insert into @ref values (-35,110940.6,91288.1999999999)
insert into @ref values (-34,110922.6,92385)
insert into @ref values (-33,110904.6,93453)
insert into @ref values (-32,110886.599999999,94493.4)
insert into @ref values (-31,110869.2,95504.3999999999)
insert into @ref values (-30,110852.399999999,96486)
insert into @ref values (-29,110835.6,97438.8)
insert into @ref values (-28,110819.399999999,98361.5999999999)
insert into @ref values (-27,110803.8,99255)
insert into @ref values (-26,110788.2,100117.8)
insert into @ref values (-25,110772.6,100950)
insert into @ref values (-24,110758.2,101751.599999999)
insert into @ref values (-23,110743.8,102522.6)
insert into @ref values (-22,110730,103262.399999999)
insert into @ref values (-21,110716.8,103970.399999999)
insert into @ref values (-20,110704.2,104647.2)
insert into @ref values (-19,110692.2,105292.2)
insert into @ref values (-18,110680.2,105904.799999999)
insert into @ref values (-17,110669.399999999,106485.6)
insert into @ref values (-16,110658.599999999,107034.6)
insert into @ref values (-15,110648.4,107550.6)
insert into @ref values (-14,110639.399999999,108034.2)
insert into @ref values (-13,110630.399999999,108484.799999999)
insert into @ref values (-12,110622,108902.399999999)
insert into @ref values (-11,110614.799999999,109287.6)
insert into @ref values (-10,110607.6,109639.2)
insert into @ref values (-9,110601.599999999,109957.8)
insert into @ref values (-8,110595.6,110243.4)
insert into @ref values (-7,110590.8,110495.399999999)
insert into @ref values (-6,110586.599999999,110713.8)
insert into @ref values (-5,110583,110898.599999999)
insert into @ref values (-4,110579.399999999,111050.399999999)
insert into @ref values (-3,110577.6,111168)
insert into @ref values (-2,110575.8,111252)
insert into @ref values (-1,110574.6,111302.399999999)
insert into @ref values (0,110574,111319.2)
insert into @ref
select -latitude, latdelta, londelta
from @ref where latitude <> 0

declare @distance float
-- make it easy, swap so that x1 <= x2 and y1 <= y2
-- note: @distance is used here as swap var only, no significance
if @lon1 > @lon2 select @distance = @lon2, @lon2 = @lon1, @lon1 = @distance
if @lat1 > @lat2 select @distance = @lat2, @lat2 = @lat1, @lat1 = @distance

SELECT @distance = Sqrt(x * x + y * y)
FROM   (SELECT SUM(weight * latdelta) AS x,
	case sum(weight)
	when 0.0 then sum(londelta * (@lon2 - @lon1))
	else SUM(weight * londelta * (@lon2 - @lon1))/SUM(weight)
	end AS y
        FROM   (SELECT CASE
                         WHEN wgt1 <= 1
                              AND wgt2 <= 1 THEN (wgt1 + wgt2 - 1)
                         WHEN wgt1 <= 1 THEN wgt1
                         WHEN wgt2 <= 1 THEN wgt2
                         ELSE 1
                       END AS weight,
                       latdelta,
                       londelta
                FROM   (SELECT lat1.latitude - @lat1 + 0.5 AS wgt1,
                               @lat2 + 0.5 - lat2.latitude AS wgt2,
                               lat1.latdelta,
                               lat1.londelta
                        FROM   @ref lat1
                               INNER JOIN @ref lat2
                                 ON lat1.latitude = lat2.latitude
                        WHERE  lat1.latitude > @lat1 - 0.5
                               AND lat2.latitude < @lat2 + 0.5) RAW) weighted) xydistances
return @distance
end

Open in new window

Shaun KlineLead Software EngineerCommented:
It is true that you may not remove many rows from your data, but you may be running the udf_Haversine twice, once in the select clause and once in the where clause. If you use a subquery and include the calculation as a field, you can then limit the distances in the outer query.
Type25Author Commented:
Hey Shaun,

But we don't run put it in the where clause.

dportasCommented:
Take a look at the STDistance method in Books Online. It will do the calculation for you using the Geography type.

Eg:

DECLARE @g geography;
DECLARE @h geography;
SET @g = geography::STGeomFromText('LINESTRING(-122.360 47.656, -122.343 47.656)', 4326);
SET @h = geography::STGeomFromText('POINT(-122.34900 47.65100)', 4326);
SELECT @g.STDistance(@h);

(untested code)
Type25Author Commented:
That made sense.... :)

I meant to say we don't put the haversine function within the where clause
Type25Author Commented:
Ok, getting there then...

DECLARE @KMperNM float = 1.0/1.852;
DECLARE @nwi geography = geography::STGeomFromText('POINT(-1.2167 52.6164)', 4326);
DECLARE @edi geography = geography::STGeomFromText('POINT(-1.1306 52.6299)', 4326);
SELECT (@nwi.STDistance(@edi)/1000.0) * @KMperNM;

This returns a correct value, however how would that fit into an inline query, as above.

Help
dportasCommented:
You would need to make the column a Geography type and then you can use it in your queries:

SELECT Col1.STDistance(Col2)
FROM ...
Type25Author Commented:
Do the lat long fields have to be a geography type?

If so will cause any issues if we convert them from varchar to geography?
cyberkiwiCommented:
Create a new column in the table of Geography type, populate it from lat/lng, then create a spatial index over it.
Then, finally, modify your query.
cyberkiwiCommented:
Hi,

This should cut down the size considerably, assuming your data is spread out wide

>>which are outside of say 100miles.

You have  @max_distance int = 200, which means 200 lat degrees... quite useless

            Latitude >= @lat - @max_distance
            and Latitude <= @lat + @max_distance
            and Longitude >= @long - @max_distance
            and Longitude <= @long + @max_distance AND

But if you change that to

      FROM Profiles p
                 INNER JOIN ref on ref.Latitude = -round(@Lat,0)
then
-- 1.6 is for miles to kms, or adjust latdelta and londelta in table ref
            Latitude >= @lat - ref.latdelta * @max_distance * 1.6 * 1.05 (+5% allowance)
            and Latitude <= @lat + ref.latdelta * @max_distance * 1.6 * 1.05
            and Longitude >= @long - ref.londelta * @max_distance * 1.6 * 1.05
            and Longitude <= @long + ref.londelta * @max_distance * 1.6 * 1.05 AND

You will considerably cut down on the records required to evaluate haversine/spatial query.
If you are good with maths, you can tune it further using

(lat-@lat)*(lon-@lon)    against    @max_distance*@max_distance

I believe SQL 2008 spatial types uses pre-filters internally, but if it doesn't, adding this still wouldn't hurt.
create table ref (latitude int, latdelta float, londelta float)
insert into ref values (-90,111694.2,0)
insert into ref values (-89,111693.599999999,1949.4)
etc (data in comment far above)

Open in new window

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
Type25Author Commented:
Fantastic stuff cyberkiwi, will give that a go and get back to you.

Thanks again
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.