# Calculating distance using lat long

We have nearly 30,000 records in our table, each record has associated lat long columns.

We run a query to return the top 300 records which are nearest to a specfied lat long.

The function we use is fn_Haversine which is attached

With more and more records the query is becoming slower and slower, ideally we would like to be able to metaphorically speaking, draw a box around a group of records which are outside of say 100miles.
In other words we don't want to pass every single record (30,000) through the haversine function just so we can select the top 300 closest.

What can we do?

Help much appreciated
``````ALTER function [dbo].[udf_Haversine](@lat1 float, @long1 float,
@lat2 float, @long2 float)
RETURNS FLOAT
BEGIN

DECLARE @dlon float, @dlat float, @rlat1 float,
@rlat2 float, @rlong1 float, @rlong2 float,
@a float, @c float, @R float, @d float, @DtoR float

IF @lat1 IS NULL OR @long1 IS NULL
BEGIN
SET @d = 30000

END
ELSE BEGIN

select @DtoR = 0.017453293
select @R = 3959      -- Earth radius

select
@rlat1 = @lat1 * @DtoR,
@rlong1 = @long1 * @DtoR,
@rlat2 = @lat2 * @DtoR,
@rlong2 = @long2 * @DtoR

select
@dlon = @rlong1 - @rlong2,
@dlat = @rlat1 - @rlat2

select @a = power(sin(@dlat/2), 2) + cos(@rlat1) *
cos(@rlat2) * power(sin(@dlon/2), 2)
select @c = 2 * atn2(sqrt(@a), sqrt(1-@a))
select @d = @R * @c
END
return @d

end
``````
LVL 9
###### Who is Participating?

x
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

Billing EngineerCommented:
you filter the records first based on the lat +- max distance and on long +- max distance.

``````declare @max_distance float
declare @source_lat float
declare @source_long float

set @max_distance = 100
set @source_lat = 50
set @source_long = 140

select *
from yourtable
where lat >= @source_lat - @max_distance
and lat <= @source_lat + @max_distance
and long >= @source_long - @max_distance
and long <= @source_long + @max_distance
and [dbo].[udf_Haversine](lat, long, @source_lat, @source_long) <= @max_distance
order by [dbo].[udf_Haversine](lat, long, @source_lat, @source_long)  desc
``````
Author Commented:
Makes sense but once i put those changes in it just hangs...

Our full query is attached

``````--Set the weights and heights and strip off the extra data sent by the user--
SET @minDateOfBirth = dateadd(yy, -CAST(@minAge as INT), getdate())
SET	@maxDateOfBirth = dateadd(yy, -CAST(@maxAge as INT), getdate())
SET @minimumWeight = SUBSTRING(@minWeight, 0, CHARINDEX(' ', @minWeight, 0))
SET @maximumWeight = SUBSTRING(@maxWeight, 0, CHARINDEX(' ', @maxWeight, 0))
SET @minimumHeight = SUBSTRING(@minHeight, 0, CHARINDEX(' ', @minHeight, 0))
SET @maximumHeight = SUBSTRING(@maxHeight, 0, CHARINDEX(' ', @maxHeight, 0))

SELECT @ProfileID = ProfileID, @lat = ISNULL(latitude,'0'), @long = ISNULL(longitude,'0') FROM Profiles WHERE Email = @MyUsername OR Username = @MyUsername

IF @Latitude <> '' BEGIN
SET @lat = @Latitude
SET @long = @Longitude
END

declare @max_distance int
set @max_distance = 200

CASE WHEN pp.DOB = '1900-01-01 00:00:00.000' THEN NULL ELSE pp.DOB END DOB,
CASE WHEN pp.Weight IS NOT NULL THEN CAST(pp.Weight as varchar) + ' kg'  ELSE NULL END Weight,
CASE WHEN pp.Height IS NOT NULL THEN CAST(pp.Height as varchar) + ' cm'  ELSE NULL END Height,
pp.DiscloseDOB, pp.DiscloseWeight, pp.DiscloseHeight,
pp.Ethnicity, dbo.udf_Haversine(pp.latitude, pp.longitude, @lat, @long) Distance , FLOOR( DATEDIFF(DAY, pp.DOB, getdate()) / 365.25) Age,
pp.Online, 'friend' Friend,
pp.Longitude, pp.Latitude, ISNULL(pp.ShowOnMap,1) ShowOnMap,
dbo.ReturnPicturePaths(pp.ProfileID) Pictures,
dbo.LastOnlineStr(DATEDIFF(SECOND,pp.LastOnline,GETDATE())) LastOnline, ISNULL(pp.VIP,0) VIP

FROM FriendsList f INNER JOIN Profiles p ON f.ProfileID = p.ProfileID JOIN
Profiles pp ON f.FriendProfileID = pp.ProfileID
WHERE p.ProfileID = @ProfileID AND @Username = ''

UNION ALL

CASE WHEN DOB = '1900-01-01 00:00:00.000' THEN NULL ELSE DOB END DOB,
CASE WHEN Weight IS NOT NULL THEN CAST(Weight as varchar) + ' kg' ELSE NULL END Weight,
CASE WHEN Height IS NOT NULL THEN CAST(Height as varchar) + ' cm' ELSE NULL END Height,
DiscloseDOB, DiscloseWeight, DiscloseHeight,
Ethnicity, Distance, Age, Online, 'no friend' as Friend, Longitude, Latitude, ShowOnMap,
dbo.ReturnPicturePaths(ProfileID) Pictures,
dbo.LastOnlineStr(DATEDIFF(SECOND,LastOnline,GETDATE())) LastOnline, VIP
FROM
(

CASE WHEN DOB = '1900-01-01 00:00:00.000' THEN NULL ELSE DOB END DOB,
CASE WHEN Weight IS NOT NULL THEN CAST(Weight as varchar)  ELSE NULL END Weight,
CASE WHEN Height IS NOT NULL THEN CAST(Height as varchar)  ELSE NULL END Height,
DiscloseDOB, DiscloseWeight, DiscloseHeight,
Ethnicity, dbo.udf_Haversine(latitude, longitude, @lat, @long) Distance , FLOOR( DATEDIFF(DAY, DOB, getdate()) / 365.25) Age,
Online, LastOnline,
p.Longitude, p.Latitude, ISNULL(ShowOnMap,1) ShowOnMap, ISNULL(VIP,0) VIP

FROM Profiles p
LEFT OUTER JOIN BlockUserList b ON b.BlockProfileID = p.ProfileID AND b.ProfileID = @ProfileID
LEFT OUTER JOIN ProfilePictures pp ON p.ProfileID = pp.ProfileID
WHERE
Latitude >= @lat - @max_distance
and Latitude <= @lat + @max_distance
and Longitude >= @long - @max_distance
and Longitude <= @long + @max_distance AND
dbo.udf_Haversine(latitude, longitude, @lat, @long) <= @max_distance AND
(@Keywords = '' OR (AboutMe LIKE '%' + @Keywords + '%' OR
Interests LIKE '%' + @Keywords + '%')) AND
(@ethnicity = 'Any' OR ethnicity = @ethnicity) AND
(@ShowOnline = 0 OR [Online] = 1) AND
(@name = '' OR FirstName + ' ' + Lastname = @name) AND
(@HomeTown = '' OR HomeTown = @HomeTown) AND
(@email = '' OR Email = @email) AND
(DOB IS NULL OR DATEDIFF(yy, ISNULL(DOB,DATEADD(yy,-21,GETDATE())), GETDATE()) BETWEEN  Cast(@minAge as int) and cast(@maxAge as int)) AND
((@minimumWeight <= 40 AND @maximumWeight >= 200) OR (ISNULL(Weight,41) >= @minimumWeight AND ISNULL(Weight,41) <= @maximumWeight)) AND
((@minimumHeight <= 99 AND @maximumHeight >= 250) OR (ISNULL(Height,251) >= @minimumHeight AND ISNULL(Height,251) <= @maximumHeight))
AND (@ShowPicsOnly = 0 OR pp.ProfileID IS NOT NULL)
AND p.ProfileID NOT IN (SELECT FriendProfileID FROM FriendsList WHERE ProfileID = @ProfileID AND @Username = '')
ORDER BY Distance ASC

) a
ORDER BY Friend, Distance ASC, [Online] desc, LastOnline DESC
``````
I suggest doing a subquery against your Profiles table (line 61) to remove any non-matches before you do the left outer joins. This should eliminate some of the records prior to doing the distance calculation.

Also, since distance would be measured in miles/km/feet, it makes little sense to do the "Latitude >= @lat - @max_distance", as latitude should be in degrees.
Billing EngineerCommented:
only a explain plan shall show the issues.
also, starting to "debug" with such a large query is usually problematic...

try to start smaller, to see as from which point the query goes "bad"
Author Commented:
angellll - it runs instantly if we remove the haversine function.

So it's clearly the distance part which is taking the time to compute.
Commented:
Hi there,

I notice that you have SQL Server 2005 and 2008 - are you using a db that needs to run on both?  On 2008 (not 2005), you can switch to the geospatial types which will allow you to index by lat-lng which will pretty much solve your problem once you start using the spatial queries.

http://www.microsoft.com/sqlserver/2008/en/us/spatial-data.aspx
"Build high-performance solutions with spatial data indexing"
Author Commented:
Hi Shaun, it's a good idea but the default search is pretty much everyone ordered by top 300 by distance.

As you can see all the WHERE clauses are optional parameters and by default these are generally never filled out so it wouldn't eliminate many records.

How could we be a bit more clever with limiting the records by using some clever math on lat/long? If we limited it by country it would be good enough
Author Commented:
Not actually heard of that before cyberkiwi, will have to check that out as we're using 2008
Commented:
>> clever math on lat/long?
You would have to be very clever indeed because the dynamics of lat changes and lon changes as you move nearer the poles or equator.
Near the equator, 1 lon covers a huge distance but at the pole, 1 lon is nothing but 1 lat is still as far as it always is.

Here's a function I wrote using some free pre-referenced lat by lon values.
I hid the function and did not want to show my preset table, but you'd be best to store it in a table permanently.
Even if you continue to use the haversine method, see if this is faster - maybe it can help pre-filter to 3x the candidates required for the actual haversine calculations and top.
``````CREATE function [dbo].[fod] (
@lat1 float,
@lon1 float,
@lat2 float,
@lon2 float
) returns float as
begin

declare @ref table (latitude int, latdelta float, londelta float)
insert into @ref values (-90,111694.2,0)
insert into @ref values (-89,111693.599999999,1949.4)
insert into @ref values (-88,111692.399999999,3898.19999999999)
insert into @ref values (-87,111690.6,5845.8)
insert into @ref values (-86,111688.2,7791)
insert into @ref values (-85,111685.2,9734.4)
insert into @ref values (-84,111681.599999999,11674.8)
insert into @ref values (-83,111677.399999999,13611.6)
insert into @ref values (-82,111672,15543.6)
insert into @ref values (-81,111666.599999999,17471.4)
insert into @ref values (-80,111660,19393.2)
insert into @ref values (-79,111652.8,21309.6)
insert into @ref values (-78,111645,23219.4)
insert into @ref values (-77,111636.599999999,25121.4)
insert into @ref values (-76,111628.2,27015.5999999999)
insert into @ref values (-75,111618.599999999,28902)
insert into @ref values (-74,111608.4,30779.4)
insert into @ref values (-73,111597.6,32646.6)
insert into @ref values (-72,111586.2,34504.2)
insert into @ref values (-71,111574.2,36351)
insert into @ref values (-70,111562.2,38186.4)
insert into @ref values (-69,111549,40010.4)
insert into @ref values (-68,111535.8,41821.7999999999)
insert into @ref values (-67,111522,43620)
insert into @ref values (-66,111507.6,45405)
insert into @ref values (-65,111492.6,47175.5999999999)
insert into @ref values (-64,111477.6,48931.7999999999)
insert into @ref values (-63,111461.4,50673)
insert into @ref values (-62,111445.8,52398)
insert into @ref values (-61,111429,54107.3999999999)
insert into @ref values (-60,111412.2,55800)
insert into @ref values (-59,111395.399999999,57475.1999999999)
insert into @ref values (-58,111377.399999999,59133)
insert into @ref values (-57,111360,60772.1999999999)
insert into @ref values (-56,111342,62392.8)
insert into @ref values (-55,111323.4,63994.1999999999)
insert into @ref values (-54,111304.799999999,65575.8)
insert into @ref values (-53,111286.2,67137)
insert into @ref values (-52,111267.6,68677.8)
insert into @ref values (-51,111248.4,70197.6)
insert into @ref values (-50,111229.2,71695.8)
insert into @ref values (-49,111210,73171.8)
insert into @ref values (-48,111190.2,74625.6)
insert into @ref values (-47,111171,76056)
insert into @ref values (-46,111151.2,77463)
insert into @ref values (-45,111132,78846.5999999999)
insert into @ref values (-44,111112.2,80206.1999999999)
insert into @ref values (-43,111093,81541.1999999999)
insert into @ref values (-42,111073.2,82851)
insert into @ref values (-41,111054,84135)
insert into @ref values (-40,111034.799999999,85393.8)
insert into @ref values (-39,111015.6,86626.1999999999)
insert into @ref values (-38,110996.4,87832.1999999999)
insert into @ref values (-37,110977.8,89011.8)
insert into @ref values (-36,110959.2,90163.8)
insert into @ref values (-35,110940.6,91288.1999999999)
insert into @ref values (-34,110922.6,92385)
insert into @ref values (-33,110904.6,93453)
insert into @ref values (-32,110886.599999999,94493.4)
insert into @ref values (-31,110869.2,95504.3999999999)
insert into @ref values (-30,110852.399999999,96486)
insert into @ref values (-29,110835.6,97438.8)
insert into @ref values (-28,110819.399999999,98361.5999999999)
insert into @ref values (-27,110803.8,99255)
insert into @ref values (-26,110788.2,100117.8)
insert into @ref values (-25,110772.6,100950)
insert into @ref values (-24,110758.2,101751.599999999)
insert into @ref values (-23,110743.8,102522.6)
insert into @ref values (-22,110730,103262.399999999)
insert into @ref values (-21,110716.8,103970.399999999)
insert into @ref values (-20,110704.2,104647.2)
insert into @ref values (-19,110692.2,105292.2)
insert into @ref values (-18,110680.2,105904.799999999)
insert into @ref values (-17,110669.399999999,106485.6)
insert into @ref values (-16,110658.599999999,107034.6)
insert into @ref values (-15,110648.4,107550.6)
insert into @ref values (-14,110639.399999999,108034.2)
insert into @ref values (-13,110630.399999999,108484.799999999)
insert into @ref values (-12,110622,108902.399999999)
insert into @ref values (-11,110614.799999999,109287.6)
insert into @ref values (-10,110607.6,109639.2)
insert into @ref values (-9,110601.599999999,109957.8)
insert into @ref values (-8,110595.6,110243.4)
insert into @ref values (-7,110590.8,110495.399999999)
insert into @ref values (-6,110586.599999999,110713.8)
insert into @ref values (-5,110583,110898.599999999)
insert into @ref values (-4,110579.399999999,111050.399999999)
insert into @ref values (-3,110577.6,111168)
insert into @ref values (-2,110575.8,111252)
insert into @ref values (-1,110574.6,111302.399999999)
insert into @ref values (0,110574,111319.2)
insert into @ref
select -latitude, latdelta, londelta
from @ref where latitude <> 0

declare @distance float
-- make it easy, swap so that x1 <= x2 and y1 <= y2
-- note: @distance is used here as swap var only, no significance
if @lon1 > @lon2 select @distance = @lon2, @lon2 = @lon1, @lon1 = @distance
if @lat1 > @lat2 select @distance = @lat2, @lat2 = @lat1, @lat1 = @distance

SELECT @distance = Sqrt(x * x + y * y)
FROM   (SELECT SUM(weight * latdelta) AS x,
case sum(weight)
when 0.0 then sum(londelta * (@lon2 - @lon1))
else SUM(weight * londelta * (@lon2 - @lon1))/SUM(weight)
end AS y
FROM   (SELECT CASE
WHEN wgt1 <= 1
AND wgt2 <= 1 THEN (wgt1 + wgt2 - 1)
WHEN wgt1 <= 1 THEN wgt1
WHEN wgt2 <= 1 THEN wgt2
ELSE 1
END AS weight,
latdelta,
londelta
FROM   (SELECT lat1.latitude - @lat1 + 0.5 AS wgt1,
@lat2 + 0.5 - lat2.latitude AS wgt2,
lat1.latdelta,
lat1.londelta
FROM   @ref lat1
INNER JOIN @ref lat2
ON lat1.latitude = lat2.latitude
WHERE  lat1.latitude > @lat1 - 0.5
AND lat2.latitude < @lat2 + 0.5) RAW) weighted) xydistances
return @distance
end
``````
It is true that you may not remove many rows from your data, but you may be running the udf_Haversine twice, once in the select clause and once in the where clause. If you use a subquery and include the calculation as a field, you can then limit the distances in the outer query.
Author Commented:
Hey Shaun,

But we don't run put it in the where clause.

Commented:
Take a look at the STDistance method in Books Online. It will do the calculation for you using the Geography type.

Eg:

DECLARE @g geography;
DECLARE @h geography;
SET @g = geography::STGeomFromText('LINESTRING(-122.360 47.656, -122.343 47.656)', 4326);
SET @h = geography::STGeomFromText('POINT(-122.34900 47.65100)', 4326);
SELECT @g.STDistance(@h);

(untested code)
Author Commented:

I meant to say we don't put the haversine function within the where clause
Author Commented:
Ok, getting there then...

DECLARE @KMperNM float = 1.0/1.852;
DECLARE @nwi geography = geography::STGeomFromText('POINT(-1.2167 52.6164)', 4326);
DECLARE @edi geography = geography::STGeomFromText('POINT(-1.1306 52.6299)', 4326);
SELECT (@nwi.STDistance(@edi)/1000.0) * @KMperNM;

This returns a correct value, however how would that fit into an inline query, as above.

Help
Commented:
You would need to make the column a Geography type and then you can use it in your queries:

SELECT Col1.STDistance(Col2)
FROM ...
Author Commented:
Do the lat long fields have to be a geography type?

If so will cause any issues if we convert them from varchar to geography?
Commented:
Create a new column in the table of Geography type, populate it from lat/lng, then create a spatial index over it.
Commented:
Hi,

This should cut down the size considerably, assuming your data is spread out wide

>>which are outside of say 100miles.

You have  @max_distance int = 200, which means 200 lat degrees... quite useless

Latitude >= @lat - @max_distance
and Latitude <= @lat + @max_distance
and Longitude >= @long - @max_distance
and Longitude <= @long + @max_distance AND

But if you change that to

FROM Profiles p
INNER JOIN ref on ref.Latitude = -round(@Lat,0)
then
-- 1.6 is for miles to kms, or adjust latdelta and londelta in table ref
Latitude >= @lat - ref.latdelta * @max_distance * 1.6 * 1.05 (+5% allowance)
and Latitude <= @lat + ref.latdelta * @max_distance * 1.6 * 1.05
and Longitude >= @long - ref.londelta * @max_distance * 1.6 * 1.05
and Longitude <= @long + ref.londelta * @max_distance * 1.6 * 1.05 AND

You will considerably cut down on the records required to evaluate haversine/spatial query.
If you are good with maths, you can tune it further using

(lat-@lat)*(lon-@lon)    against    @max_distance*@max_distance

I believe SQL 2008 spatial types uses pre-filters internally, but if it doesn't, adding this still wouldn't hurt.
``````create table ref (latitude int, latdelta float, londelta float)
insert into ref values (-90,111694.2,0)
insert into ref values (-89,111693.599999999,1949.4)
etc (data in comment far above)
``````

Experts Exchange Solution brought to you by