Link to home
Start Free TrialLog in
Avatar of Brian Battles
Brian BattlesFlag for United States of America

asked on

SQL Server Geography query is too slow

We have 2 tables, one table is Customers, with each one's location as a Geography column, and one table is Branch Offices also with each one's location as a Geography column (we populate the Geography columns from latitude and longitude columns)

The tables are simple: Cust_Locations has CustNum (integer, PK) and CustLoc (geography, with spatial index), Branch_Locations has BranchNum (integer, PK) and BranchLoc (geography with spatial index).

We need to run a query (view) that's intended to show the closest branch office to each customer, based on Geography columns, and it runs fine with a couple of thousand customers. We just received a big job that needs to run with 700,000 customers and it takes hours to run. Can anyone suggest any ways to speed up this SQL?

WITH CLOSEST AS (
    SELECT *, ROW_NUMBER() 
    OVER (
        PARTITION BY CustNum
        ORDER BY Miles
    ) AS RowNo 
    FROM 
(
SELECT 
CustNum,
BranchNum, 
CONVERT(DECIMAL(10, 6), (BranchLoc.STDistance(CustLoc)) / 1609.344) AS Miles
FROM 
Branch_Locations
CROSS JOIN
Cust_Locations
) AS T
)
SELECT TOP 100 PERCENT CustNum, BranchNum, Miles, RowNo FROM CLOSEST WHERE RowNo = 1 ORDER BY CustNum, MILES

Open in new window


I have no ida what to try. One thought was could there be a way to put the distance comparison into the JOIN? Nothing else comes to mind so far...

Thanks for any suggestions!
Avatar of Scott Pletcher
Scott Pletcher
Flag of United States of America image

CROSS JOIN seems particularly inefficient here.

Interesting q.  If you could post some usable sample data, I'd code something up.
Have you tried looking at the execution plan? Have you reviewed your indexes on the tables?
Avatar of Brian Battles

ASKER

Yes and yes, Jordan

Also, someone else I know tried the same concept, but used their own sin/cos calculation against the latitude and longitude columns, instead of the Geography column's STDistance calculation, and it ran a zillion times faster. So is Microsoft's Geography calculation just really inefficient and slow?
Also, someone else I know tried the same concept
Hopefully they did not use "TOP 100 PERCENT"
That was in there so we could sort by the CustNum in the View
]That was in there so we could sort by the CustNum in the View
Unfortunately, that is not the way it works.  If you use TOP 100 PERCENT the ORDER BY clause is ignored as it makes no sense whatsoever.

But don't take my word for it see TOP 100 Percent ORDER BY Considered Harmful. written by one of the principal architects of SQL Server.
This question needs an answer!
Become an EE member today
7 DAY FREE TRIAL
Members can start a 7-Day Free trial then enjoy unlimited access to the platform.
View membership options
or
Learn why we charge membership fees
We get it - no one likes a content blocker. Take one extra minute and find out why we block content.