[Last Call] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 573
  • Last Modified:

SQL SERVER OPTIMIZATION, QUERY OPTIMIZATION

THE CODE BELOW RETURNS 18 RECORDS AND RUNS IN 48 SECONDS.
WHEN I REMOVE THE UDF IN THE AND CLAUSE AT THE BOTTOM, IT RUNS UNDER 1 SECOND
AND RETURNS 501 RECORDS.
--
SO, I TRIED TO INSERT THE 501 RECORDS FROM THE ORIGINAL QUERY (MINUS THE UDF)
IN TO A TEMP TABLE AND THEN FILTER FROM THE TEMP TABLE USING THE SAME CRITERIA (...<= 0.25)

ALTHOUGH THERE ARE 501 RECS IN THE TEMP TABLE, I GET 0 RECS RETURNED WHEN I QUERY AND APPLY
THE SAME CRITERIA (...<= 0.25) - I EXPECTED 18 RECCS.

I AM SURE I AM GETTING A DIFFERENT SET OF RECORDS INTO THE TEMP TABLE, A DIFFERENT SET OF 501 RECS THAN THE ORIGINAL QUERY, THUS THE EMPTY SET RATHER THAN 18 RECS FROM TEMP TABLE QUERY ...

I APPRECIATE YOUR HELP OPTIMIZING THE ORIGINAL QUERY USING MY WAY OR A BETTER WAY.

THANKS,
D

-- ORIGINA Q
select distinct TOP 501 a.mlsnum
from mls_unified_svo_tbl a (nolock)  -- 46 sec, 18 rows
LEFT OUTER JOIN mls_unified_mvo_svo_tbl m (nolock) on a.mlsnum = m.mlsnum
LEFT OUTER JOIN photos_exist b (nolock) on a.mlsnum = b.mlsnum
LEFT OUTER JOIN virtual_tours c (nolock) on a.mlsnum = c.mlsnum
where ((a.status = 20 AND cast(a.statusdate as datetime) >= '03/26/2008')
OR (a.status = 25 AND cast(a.statusdate as datetime) >= '08/26/2008')
OR (a.status = 15 AND cast(a.statusdate as datetime) >= '08/26/2008')
OR (a.status IN (5,30,10))) AND (a.property_type = 0)
AND ( dbo.fn_GetDistance(34.051039, -118.341143, a.latitude, a.longitude) <= 0.25)
--
--CREATE TEMP TABLE
--drop table #T1
--Create table #T1(mlsnum varchar(20), Latitude Decimal(18,6), Longtitude Decimal(18,6), Distance Float)
Create table #T1(mlsnum varchar(20), Latitude Decimal(18,6), Longtitude Decimal(18,6), Distance Float)
go
--
Insert Into #T1(mlsnum, latitude, Longtitude, distance)
select DISTINCT TOP 501 a.mlsnum,  a.latitude, a.longitude, dbo.fn_GetDistance(34.051039, -118.341143, a.latitude, a.longitude)
from mls_unified_svo_tbl a (nolock)  LEFT OUTER JOIN mls_unified_mvo_svo_tbl m (nolock) on a.mlsnum = m.mlsnum
LEFT OUTER JOIN photos_exist b (nolock) on a.mlsnum = b.mlsnum
LEFT OUTER JOIN virtual_tours c (nolock) on a.mlsnum = c.mlsnum
where ((a.status = 20 AND cast(a.statusdate as datetime) >= '03/26/2008')
OR (a.status = 25 AND cast(a.statusdate as datetime) >= '08/26/2008')
OR (a.status = 15 AND cast(a.statusdate as datetime) >= '08/26/2008')
OR (a.status IN (5,30,10))) AND (a.property_type = 0)
--AND ( dbo.fn_GetDistance(34.051039, -118.341143, a.latitude, a.longitude) <= 0.25)
--
-- code below returns 0 records (I expected to see 18 as original query)
select * from #T1
where Distance <= 0.25
order by  Distance Asc
--
0
dteshome
Asked:
dteshome
  • 5
  • 3
  • 2
  • +1
1 Solution
 
Serge FournierAnalyst ProgrammerCommented:
was that query faster on sql 2000?
cause i had this problem when we migrated from sql 2000 to sql 2005
my queries were slower

the issue was called "comparing an aggregation in a where clause"

example:
BAD (and long to execute):
SELECT winner FROM nobel  WHERE count(winner)>1  GROUP BY winner

GOOD :
SELECT winner FROM nobel  GROUP BY winner  HAVING COUNT(winner)>1

i would try use more having clause in the query

i cannot really engineer it for you as i do not have a database example to work with to see the speed differences while using more HAVING in the query

0
 
Anthony PerkinsCommented:
Try it this way (and please consider losing the float in favor of a numeric data type):
CREATE TABLE #T1(
	mlsnum varchar(20), 
	Distance Float)

CREATE UNIQUE CLUSTERED INDEX IX_T1 ON #T1(mlsnum, Distance)

INSERT	#T1(
	mlsnum, distance)
SELECT	a.mlsnum,
	dbo.fn_GetDistance(34.051039, -118.341143, a.latitude, a.longitude)
FROM	mls_unified_svo_tbl a (NOLOCK)
WHERE	((a.status = 20 AND CAST(a.statusdate AS datetime) >= '03/26/2008')
         OR (a.status = 25 AND CAST(a.statusdate AS datetime) >= '08/26/2008')
         OR (a.status = 15 AND CAST(a.statusdate AS datetime) >= '08/26/2008')
         OR (a.status IN (5, 30, 10)))
	AND (a.property_type = 0) 

SELECT	DISTINCT TOP(501) 
	a.mlsnum
FROM	#T1 m
	LEFT OUTER JOIN mls_unified_mvo_svo_tbl m (NOLOCK) ON a.mlsnum = m.mlsnum
	LEFT OUTER JOIN photos_exist b (NOLOCK) ON a.mlsnum = b.mlsnum
	LEFT OUTER JOIN virtual_tours c (NOLOCK) ON a.mlsnum = c.mlsnum
WHERE	Distance <= 0.25

DROP TABLE #T1

Open in new window

0
 
GhunaimaCommented:
Try this
select TOP 501 a.mlsnum
from mls_unified_svo_tbl a (nolock)  -- 46 sec, 18 rows
LEFT OUTER JOIN mls_unified_mvo_svo_tbl m (nolock) on a.mlsnum = m.mlsnum 
LEFT OUTER JOIN photos_exist b (nolock) on a.mlsnum = b.mlsnum 
LEFT OUTER JOIN virtual_tours c (nolock) on a.mlsnum = c.mlsnum 
where ((a.status = 20 AND cast(a.statusdate as datetime) >= '03/26/2008') 
OR (a.status in (25, 15) AND cast(a.statusdate as datetime) >= '08/26/2008') 
OR (a.status IN (5,30,10))) AND (a.property_type = 0) 
AND ( dbo.fn_GetDistance(34.051039, -118.341143, a.latitude, a.longitude) <= 0.25)

Open in new window

0
Efficient way to get backups off site to Azure

This user guide provides instructions on how to deploy and configure both a StoneFly Scale Out NAS Enterprise Cloud Drive virtual machine and Veeam Cloud Connect in the Microsoft Azure Cloud.

 
GhunaimaCommented:
Dropping distinct will improve the query but if it returns multiple records then try this. Also Why are you using outer joins as you are not using any of the fields from them. If you drop the joins your query will run even faster.

select TOP 501 *, dbo.fn_GetDistance(34.051039, -118.341143, a.latitude, a.longitude) 
from (select distinct a.mlsnum,  a.latitude, a.longitude from mls_unified_svo_tbl 
where ((status = 20 AND cast(statusdate as datetime) >= '03/26/2008') 
OR (status in (25, 15) AND cast(statusdate as datetime) >= '08/26/2008') 
OR (status IN (5,30,10))) AND (property_type = 0) 
) a (nolock)  
LEFT OUTER JOIN mls_unified_mvo_svo_tbl  m (nolock) on a.mlsnum = m.mlsnum 
LEFT OUTER JOIN photos_exist b (nolock) on a.mlsnum = b.mlsnum 
LEFT OUTER JOIN virtual_tours c (nolock) on a.mlsnum = c.mlsnum

Open in new window

0
 
dteshomeAuthor Commented:
Hi, AcPerkins

The query you re-designed was much faster than the original 1 min 30 sec, as opposed to 2 min 20 sec for the original.

The original query (below)
-- runs in less than 1 sec without inline function (dist calc)
--runs in 2 min 20 sec, with inline function returns 18 recs
--
My question is is there any way to avoid returning 663667 recs (which is what gets inserted into #T1)
and we end up calculating distance for 600K when we only need TOP 501?

As I mentioned above, the original Q, with ... distinct top 501 runs under a sec when distance is not calculated.

I will appreciate your feedbacck.

Thanks!
D



Original Q
------------
select distinct TOP 501 a.mlsnum
from mls_unified_svo_tbl a (nolock)--LEFT OUTER JOIN mls_unified_mvo_svo_tbl m (nolock) on a.mlsnum = m.mlsnum
LEFT OUTER JOIN photos_exist b (nolock) on a.mlsnum = b.mlsnum
LEFT OUTER JOIN virtual_tours c (nolock) on a.mlsnum = c.mlsnum
where ((a.status = 20 AND cast(a.statusdate as datetime) >= '03/26/2008')
OR (a.status = 25 AND cast(a.statusdate as datetime) >= '08/26/2008')
OR (a.status = 15 AND cast(a.statusdate as datetime) >= '08/26/2008')
OR (a.status IN (5,30,10))) AND (a.property_type = 0)
0
 
Anthony PerkinsCommented:
So it looks like your UDF dbo.fn_GetDistance() is the bottle-neck.  Please post it here so that we can suggest how you can improve it.
0
 
dteshomeAuthor Commented:
So, there is no way to select distinct top 501 in the temp table as opposed to 600K and filtring??? maybe with group by to simulate the distinct???


Here is the UDF:
ALTER  function [dbo].[fn_GetDistance]
(@latitude1 decimal(9,6), @longitude1 decimal(9,6), @latitude2 decimal(9,6),
@longitude2 decimal(9,6))

returns float
as


begin
--if @latitude2 is null or @longitude2 is null
--    return null
declare @EquatorialRadius float
declare @RelPi float
declare @distance float
declare @a float
declare @lat1 decimal(9,6)
declare @lon1 decimal(9,6)
declare @lat2 decimal(9,6)
declare @lon2 decimal(9,6)
declare @dLat decimal(9,6)
declare @dLon decimal(9,6)

set @EquatorialRadius = 3963.1676    
set @RelPi = 3.14159 / 180

set @lat1 = @latitude1 * @RelPi
set @lon1 = @longitude1 * @RelPi
set @lat2 = @latitude2 * @RelPi
set @lon2 = @longitude2 * @RelPi

set @dLat = @lat2 - @lat1;
set @dLon = @lon2 - @lon1;


set @a = Sin(@dLat / 2) * Sin(@dLat / 2) + Cos(@lat1) * Cos(@lat2) *
Sin(@dLon / 2) * Sin(@dLon / 2);
set @distance = 2 * @EquatorialRadius * Atn2(Sqrt(@a), Sqrt(1 - @a))

return @distance
end
0
 
dteshomeAuthor Commented:
While you are looking at the UDF, I will like to pick your collective brains ...

How would you write the query below into a stored proc with the may params (in XX... below)?
because the same param, say Status, or data is used in different OR, AND, IN clauses.



select DISTINCT TOP XX501 a.mlsnum,  a.latitude, a.longitude, dbo.fn_GetDistance(XX34.051039, XX-118.341143, a.latitude, a.longitude)
from mls_unified_svo_tbl a (nolock)  LEFT OUTER JOIN mls_unified_mvo_svo_tbl m (nolock) on a.mlsnum = m.mlsnum
LEFT OUTER JOIN photos_exist b (nolock) on a.mlsnum = b.mlsnum
LEFT OUTER JOIN virtual_tours c (nolock) on a.mlsnum = c.mlsnum
where ((a.status = XX20 AND cast(a.statusdate as datetime) >= xx'03/26/2008')
OR (a.status = xx25 AND cast(a.statusdate as datetime) >= xx'08/26/2008')
OR (a.status = xx15 AND cast(a.statusdate as datetime) >= xx'08/26/2008')
OR (a.status IN xxxx(5,30,10))) AND (a.property_type = xxx0)
0
 
Anthony PerkinsCommented:
Write your function as an inline UDF() as shown below and instead of:
dbo.fn_GetDistance(34.051039, -118.341143, a.latitude, a.longitude)

Use this:
dbo.fn_GetDistance(0.594302744274998, -2.06544258592344, a.latitude * PI() / 180, a.longitude * PI() / 180)


ALTER  FUNCTION [dbo].[fn_GetDistance]
    (@lat1 float,
     @lon1 float,
     @lat2 float,
     @lon2 float)

RETURNS float

AS 

BEGIN

    RETURN	2 * 3963.1676 * 
		ATN2(SQRT(SIN((@lat2 - @lat1) / 2) * SIN((@lat2 - @lat1) / 2) + 
		COS(@lat1) * COS(@lat2) * SIN((@lon2 - @lon1) / 2) * 
		SIN((@lon2 - @lon1) / 2)), SQRT(1 - SIN((@lat2 - @lat1) / 2) * 
		SIN((@lat2 - @lat1) / 2) + COS(@lat1) * COS(@lat2) * 
		SIN((@lon2 - @lon1) / 2) * SIN((@lon2 - @lon1) / 2)))

END

Open in new window

0
 
dteshomeAuthor Commented:
Hi, AcPerkins

I tried the new UDF, amazing!  THANKS!

Here are the results:

--Old UDF Run Time                  
55 secs,      53 secs,      1 min 20 secs,      53 secs,      56 secs,      56 secs,      1 min 5 secs
--New UDF      Run Time            
20 secs,      21 secs,      20 secs,            18 secs,      18 secs,      18 secs,      27 secs
---------------------------------------------------------------------------------------------------------------------
--% Improvement                        
175%      152%      300%      194%      211%      211%      140%      


--AVERAGE IMPROVEMENT 198%
0
 
dteshomeAuthor Commented:
Thank you!
0

Featured Post

Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

  • 5
  • 3
  • 2
  • +1
Tackle projects and never again get stuck behind a technical roadblock.
Join Now