Solved

SQL SERVER QUERY - with LEFT OUTER JOIN and WHERE CLAUSE

Posted on 2010-08-16
Last Modified: 2012-05-10
I would like to thank you for your help ahead of time.

The original query below ran for almost 2 minutes before it returned values, so I modified it by moving the WHERE line to the bottom, below the AND ... OR conditions (please see the original and modified code below), at which point it ran in under a second!

The problem, however, is that it now returns "status" values that are different from the ones specified in the AND ... criteria (in addition to the correct ones). I would appreciate an explanation of why moving the WHERE clause transformed the execution time and, most importantly, why I am getting "status" values that are outside of what I "asked" for.

The original and modified code is listed below (in that order):

original code:

SELECT DISTINCT TOP 251 a.city, a.property_type, a.mlsnum, a.status, mls.dbo.fn_sort_status(a.status)as sortOrder
FROM mls.dbo.mls_unified_svo_tbl a (nolock)
LEFT OUTER JOIN mls.dbo.mls_unified_mvo_svo_tbl m (nolock)
ON a.mlsnum= m.mlsnum
LEFT OUTER JOIN mls.dbo.photos_exist b (nolock)
ON a.mlsnum= b.mlsnum
LEFT OUTER JOIN mls.dbo.open_house_list d (nolock)
ON a.mlsnum = d.mlsnum
LEFT JOIN dbo.SCH_SaleType o
ON o.mlsNum = a.mlsnum
WHERE a.city LIKE'los angeles%'
AND a.status =(10) AND DATEDIFF(day, a.statusdate,getdate()) < 365
OR a.status =(20) AND DATEDIFF(day, a.statusdate,getdate()) < 365

modified code:

SELECT DISTINCT TOP 251 a.city, a.property_type, a.mlsnum, a.status, mls.dbo.fn_sort_status(a.status)as sortOrder
FROM mls.dbo.mls_unified_svo_tbl a (nolock)
LEFT OUTER JOIN mls.dbo.mls_unified_mvo_svo_tbl m (nolock)
ON a.mlsnum= m.mlsnum
LEFT OUTER JOIN mls.dbo.photos_exist b (nolock)
ON a.mlsnum= b.mlsnum
LEFT OUTER JOIN mls.dbo.open_house_list d (nolock)
ON a.mlsnum = d.mlsnum
LEFT JOIN dbo.SCH_SaleType o
ON o.mlsNum = a.mlsnum
AND a.status =(10) AND DATEDIFF(day, a.statusdate,getdate()) < 365
OR a.status =(20) AND DATEDIFF(day, a.statusdate,getdate()) < 365
WHERE a.city LIKE'los angeles%'
Question by:dteshome
10 Comments
 

Accepted Solution

by:
cyberkiwi earned 125 total points
ID: 33450776
>> "status" values that are outside of what I "asked" for
Because it has become a LEFT JOIN condition. If it does not match, all it does is NOT join to SCH_SaleType instead of REMOVING rows (which is the task of the WHERE clause).
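
A minimal sketch of that difference, assuming two made-up table variables: @listing and @saletype are hypothetical stand-ins for mls_unified_svo_tbl and SCH_SaleType, and the rows are invented for illustration.

```sql
-- Hypothetical stand-ins for the real tables; data is invented.
DECLARE @listing  TABLE (mlsnum int, status int);
DECLARE @saletype TABLE (mlsnum int, saletype varchar(20));

INSERT INTO @listing  VALUES (1, 10);
INSERT INTO @listing  VALUES (2, 20);
INSERT INTO @listing  VALUES (3, 30);
INSERT INTO @saletype VALUES (1, 'standard');
INSERT INTO @saletype VALUES (3, 'short sale');

-- Status test in the ON clause: all three listings come back.
-- Listing 3 (status 30) is kept; it simply gets NULL for saletype.
SELECT a.mlsnum, a.status, o.saletype
FROM @listing a
LEFT JOIN @saletype o
  ON o.mlsnum = a.mlsnum
 AND a.status IN (10, 20);

-- Status test in the WHERE clause: listing 3 is removed.
SELECT a.mlsnum, a.status, o.saletype
FROM @listing a
LEFT JOIN @saletype o
  ON o.mlsnum = a.mlsnum
WHERE a.status IN (10, 20);
```

The only change between the two statements is where the status test sits, which is what happened when the WHERE line was moved in the modified query.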

Write it as follows:

SELECT DISTINCT TOP 251 a.city, a.property_type, a.mlsnum, a.status, mls.dbo.fn_sort_status(a.status)as sortOrder
FROM mls.dbo.mls_unified_svo_tbl a (nolock)
LEFT OUTER JOIN mls.dbo.mls_unified_mvo_svo_tbl m (nolock)
ON a.mlsnum= m.mlsnum
LEFT OUTER JOIN mls.dbo.photos_exist b (nolock)
ON a.mlsnum= b.mlsnum
LEFT OUTER JOIN mls.dbo.open_house_list d (nolock)
ON a.mlsnum = d.mlsnum
LEFT JOIN dbo.SCH_SaleType o
ON o.mlsNum = a.mlsnum
WHERE a.city LIKE'los angeles%'
AND a.statusdate >= dateadd(day, -365, getdate())
AND a.status in (10,20)

Expert Comment

by:cyberkiwi
ID: 33450779
This will be able to utilize an index on a.statusdate - make sure you have one.

Assisted Solution

by:ewangoya
ewangoya earned 50 total points
ID: 33450805
...
LEFT JOIN dbo.SCH_SaleType o
ON o.mlsNum = a.mlsnum
AND a.status =(10) AND DATEDIFF(day, a.statusdate,getdate()) < 365
OR a.status =(20) AND DATEDIFF(day, a.statusdate,getdate()) < 365
 
Basically, these could be interpreted as:
LEFT JOIN dbo.SCH_SaleType o ON (o.mlsNum = a.mlsnum
    AND a.status =(10) AND DATEDIFF(day, a.statusdate,getdate()) < 365
    OR a.status =(20) AND DATEDIFF(day, a.statusdate,getdate()) < 365 )

Try modifying it to the following and see if it makes a difference. Note the extra parentheses I included in the WHERE clause. Indexing the status and city fields may also increase the speed, if you have not done so yet.

SELECT DISTINCT TOP 251 a.city, a.property_type, a.mlsnum, a.status, mls.dbo.fn_sort_status(a.status)as sortOrder
FROM mls.dbo.mls_unified_svo_tbl a (nolock)
LEFT OUTER JOIN mls.dbo.mls_unified_mvo_svo_tbl m (nolock)
ON a.mlsnum= m.mlsnum
LEFT OUTER JOIN mls.dbo.photos_exist b (nolock)
ON a.mlsnum= b.mlsnum
LEFT OUTER JOIN mls.dbo.open_house_list d (nolock)
ON a.mlsnum = d.mlsnum
LEFT JOIN dbo.SCH_SaleType o
ON o.mlsNum = a.mlsnum
WHERE a.city LIKE'los angeles%'
-- the outer parentheses keep the city filter applied to both status branches
AND ((a.status =(10) AND DATEDIFF(day, a.statusdate,getdate()) < 365)
OR (a.status =(20) AND DATEDIFF(day, a.statusdate,getdate()) < 365))


Assisted Solution

by:jhp333
jhp333 earned 75 total points
ID: 33450830
It seems the OR is misused.
Because AND binds more tightly than OR, your original WHERE condition will be understood by the server as:

WHERE
(a.city LIKE 'los angeles%' AND a.status =(10) AND DATEDIFF(day, a.statusdate,getdate()) < 365)
OR
(a.status =(20) AND DATEDIFF(day, a.statusdate,getdate()) < 365)

So the lower part does not have any condition on city, and the server needs to search nationwide. That's why it takes so long.

The correct condition would be:
WHERE a.city LIKE 'los angeles%'
AND (
        a.status =(10) AND DATEDIFF(day, a.statusdate,getdate()) < 365
        OR
        a.status =(20) AND DATEDIFF(day, a.statusdate,getdate()) < 365
)
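
To see the precedence rule in isolation, here is a small sketch with a made-up table variable (@listing and its rows are hypothetical; the date test is left out to keep it short):

```sql
-- Hypothetical sample data, not the real listing table.
DECLARE @listing TABLE (mlsnum int, city varchar(50), status int);

INSERT INTO @listing VALUES (1, 'los angeles', 10);
INSERT INTO @listing VALUES (2, 'los angeles', 20);
INSERT INTO @listing VALUES (3, 'san diego',   20);
INSERT INTO @listing VALUES (4, 'los angeles', 30);

-- No parentheses: read as (city AND status = 10) OR (status = 20),
-- so the San Diego row (mlsnum 3) is returned as well.
SELECT mlsnum
FROM @listing
WHERE city LIKE 'los angeles%'
  AND status = 10
   OR status = 20;

-- Parenthesized: the city test applies to both statuses,
-- so only mlsnum 1 and 2 are returned.
SELECT mlsnum
FROM @listing
WHERE city LIKE 'los angeles%'
  AND (status = 10 OR status = 20);
```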

When you moved some of the conditions up, the OR part became part of the join condition of the last LEFT (outer) JOIN. Since it is a LEFT outer join, a condition there that applies only to the left table cannot remove any rows from the result; it only decides whether a row gets matched to SCH_SaleType, and no columns from that table are selected. So the status and date conditions effectively stopped filtering.

Expert Comment

by:jhp333
ID: 33450842
cyberkiwi is right: in this case, you can simply use the IN operator instead of the erroneous OR.

Expert Comment

by:mustaccio
ID: 33450855
Also, you may want to get rid of the DISTINCT, especially if the result set has more than 251 rows in it. DISTINCT causes a sort of (or, more precisely, elimination of duplicate rows from) the entire result set before returning 251 rows.

Expert Comment

by:jhp333
ID: 33450856
BTW, all your outer joins are unnecessary here, because the fields from those tables are not used anywhere in the SQL.
Unless you omitted part of the SELECT clause.
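
If that is the case, the query might be reduced to something like the sketch below. It assumes that no column from m, b, d or o is needed in the output, and it borrows the IN list and DATEADD cutoff from the accepted solution. DISTINCT is kept because the thread does not say whether mlsnum is unique in the listing table.

```sql
-- Sketch only: assumes the joined tables contribute nothing to the result.
SELECT DISTINCT TOP 251
       a.city, a.property_type, a.mlsnum, a.status,
       mls.dbo.fn_sort_status(a.status) AS sortOrder
FROM mls.dbo.mls_unified_svo_tbl a (nolock)
WHERE a.city LIKE 'los angeles%'
  AND a.status IN (10, 20)
  AND a.statusdate >= DATEADD(day, -365, GETDATE());
```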

Expert Comment

by:cyberkiwi
ID: 33450880
The worst performance culprit is performing a function on a date column.
ALWAYS, always, ALWAYS (can I repeat it enough?) perform the function on the other side of the test, so that SQL Server can work out that [constant] value once and compare it against an index on the date column.

DATEDIFF(day, a.statusdate,getdate()) < 365    ---- bad
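
One way to move the arithmetic to the constant side is sketched below on a made-up table variable (@listing and its rows are hypothetical). One caveat: DATEDIFF(day, ...) counts calendar-day boundaries, so the strictly equivalent cutoff is midnight 364 days before today. A plain "365 days before now" cutoff is close but can admit up to one extra day of rows.

```sql
-- Hypothetical sample data: one recent row, one old row.
DECLARE @listing TABLE (mlsnum int, statusdate datetime);
INSERT INTO @listing VALUES (1, DATEADD(day, -10,  GETDATE()));
INSERT INTO @listing VALUES (2, DATEADD(day, -400, GETDATE()));

-- Cutoff worked out once: midnight, 364 days before today.
DECLARE @cutoff datetime;
SET @cutoff = DATEADD(day, DATEDIFF(day, 0, GETDATE()) - 364, 0);

-- Function wrapped around the column: has to be evaluated for every row.
SELECT mlsnum
FROM @listing
WHERE DATEDIFF(day, statusdate, GETDATE()) < 365;

-- Bare column compared with a precomputed value: returns the same row
-- (mlsnum 1) and leaves the column free to be matched against an index.
SELECT mlsnum
FROM @listing
WHERE statusdate >= @cutoff;
```

A table variable has no index on statusdate, so this only illustrates the shape of the predicate; the benefit shows up on the real table once an index on a.statusdate exists.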

Expert Comment

by:ewangoya
ID: 33450907
Right, cyberkiwi.
I always find it better to calculate the dates needed beforehand and use something like
statusdate >= xxx and statusdate <= yyy
This way I make full use of my indexes.

Author Closing Comment

by:dteshome
ID: 33451779
Just a general comment about your service;
Ingenious business model; a win-win-win (3 way) proposition