[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
  • Status: Solved
  • Priority: Medium
  • Security: Public
  • Views: 295
  • Last Modified:

Questions regarding SQL performance

Hi Experts,

I have the following questions re sql performance

1- When filtering for a column that is an int, and looking for values 0 or null, what is better for performance to write (columnname = 0 or columnname is null) or isnull(columnname,0)=0.

2- When having two or more conditions in the where clause, does it makes a difference which goes first?

3- Also in the where clause, when having multiple conditions from type where exists (select 1 from tablename where id = ...) and exists (select 1 from tablename where id = ...)  and so on, does it makes a difference which table goes first?

4- And on above example is it better to have where exists (select 1 ...)  or  select null?

5- When checking for a single value is it ok to to write columnnane in ('value') or its better to have columnname = 'value'?
0
bfuchs
Asked:
bfuchs
  • 5
  • 4
  • 3
6 Solutions
 
PortletPaulCommented:
Some of the answers to those questions will be opinion based, so others may disagree with what I say here.

1. This is the better form:  (columnname = 0 or columnname is null)
as a general rule it is better to avoid functions that operate on each row of the resultset.
but, in practice the choice between the 2 options would make no discernible difference to performance and no differnce in outcome.

2. No. (Well it "should not" make any difference)

3. see 2

4. I have debated this with others. I used to use "select 1" but now prefer "select null"
but neither make any difference. Even "select *" is the same result (or should be)
EXISTS really doesn't need to return any data and to an extent the select clause is redundant, but due to the rules of SQL it has to be present.

+edit:
regarding: 1

sargable: Search ARGument ABLE
e.g. see http://en.wikipedia.org/wiki/Sargable

Rules of thumb
Avoid applying functions on data values in a sql condition.
Avoid non-sargable predicates and replace them with sargable equivalents.
0
 
Scott PletcherSenior DBACommented:
1: ALWAYS (columnname = 0 or columnname is null)
    This does make a huge difference when the column is indexed.
    Indeed, the rule is NEVER use ISNULL() in a WHERE or JOIN clause.

2. Typically no.  SQL will rearrange the order in which it checks the conditions anyway, if it thinks that's better.

3. Perhaps.  Put the most likely match first, down to the least likely.

4. Not enough difference, if any, to worry about.

5. Either is fine, since they are processed the same.  An IN will get converted by SQL into multiple ORs anyway.
0
 
bfuchsAuthor Commented:
Hi PortletPaul,

What about question#5 (added later)?
The reason for my asking this is that my SQL statement gets build dynamically and if I can have columnname  in ('value') with no extra cost I would prefer that rather then having to determine how many values users have selected.

Some of the answers to those questions will be opinion based
Is there a way to verify these things by sql plan or other SQL utility?
0
Industry Leaders: We Want Your Opinion!

We value your feedback.

Take our survey and automatically be enter to win anyone of the following:
Yeti Cooler, Amazon eGift Card, and Movie eGift Card!

 
Scott PletcherSenior DBACommented:
Here's a quick script to verify point #1:

--create test table and load 1M test rows

IF OBJECT_ID('tempdb.dbo.#test_isnull') IS NOT NULL
    DROP TABLE #test_isnull

CREATE TABLE #test_isnull (
    test_value datetime NULL
    )
CREATE CLUSTERED INDEX test_isnull_cl on #test_isnull ( test_value )

;WITH
cteTally10 AS (
    SELECT 0 AS tally UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL
    SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0
),
cteTally100 AS (
    SELECT 1 AS tally
    FROM cteTally10 c1
    CROSS JOIN cteTally10 c2
),
cteTally10K AS (
    SELECT 1 AS tally
    FROM cteTally100 c1
    CROSS JOIN cteTally100 c2
),
cteTally1Mil AS (
    SELECT ROW_NUMBER() OVER(ORDER BY c1.tally) AS tally
    FROM cteTally100 c1
    CROSS JOIN cteTally10K c2
)
INSERT INTO #test_isnull ( test_value )
SELECT CASE WHEN t.tally % 100000 = 0 THEN NULL ELSE DATEADD(MINUTE, t.tally, '20000101') END
FROM cteTally1Mil t
SELECT 'Look at the "Messages" tab for query stats!' AS msg, COUNT(*) AS test_isnull_row_count FROM #test_isnull

--check stats for same WHERE condition, one using (IS NULL OR ...) the other ISNULL().
SET STATISTICS IO ON

SELECT *
FROM #test_isnull
WHERE test_value IS NULL OR test_value = '20000101'

SELECT *
FROM #test_isnull
WHERE ISNULL(test_value, '20000101') = '20000101'

SET STATISTICS IO OFF

IF OBJECT_ID('tempdb.dbo.#test_isnull') IS NOT NULL
    DROP TABLE #test_isnull
0
 
PortletPaulCommented:
Scott's point about Q1 & indexes is very valid, my bad.
0
 
PortletPaulCommented:
sorry I missed point 5

there is no difference.

IN ()  is really a "shortcut" and SQL will treat each member of the list as an equal predicate join by OR

i.e.

where [fieldx] IN (1)

is the same as

where [fieldx] = 1

I don't like using IN () for single values and see no benefit in doing it.

where [fieldx] IN (1,2)

is the same as

where ( [fieldx] = 1 OR [fieldx] = 2 )
0
 
bfuchsAuthor Commented:
So here goes the SQL team-:)

@ScottPletcher,
3. Perhaps.  Put the most likely match first, down to the least likely.
I see this is the topic you have different opinions..
Perhaps you have a script to verify that too?

Just to clarify, most likely match means the one that will narrow most the results?

Wouldn't it also depend on size of table's or other factors?
0
 
Scott PletcherSenior DBACommented:
It could.  But that gets so complex it'd be just about impossible for a person to accurately pre-determine it.

Coding in most-likely order is generally more known.  And is best if SQL executes them in the order written, which it's likely to do unless it can determine a specific advantage to running them differently.  A quick mathematical calc can verify that it's better to check for the most common conditions first.
0
 
PortletPaulCommented:
With respect to 3, it is not possible to guarantee an optimizer will perform the operations in the order presented to it by a sql script. You CAN influence the sequence (e.g. by using a derived table, or by using hints).

I recall being told many years ago that "the where clause prefers equals"
that is, do exact matches in preference to anything else

e.g.
where [string_field]= like "blah%" and [boolean_field]=1

would be better as:

where [boolean_field]=1 and [string_field]= like "blah%"

but I cannot guarantee the optimizer will agree with me

---
Performance optimization is a very large topic
0
 
bfuchsAuthor Commented:
Thank you guys, looking forward to work with you in my efforts to improve our database.
0
 
bfuchsAuthor Commented:
Hi SQL Experts,

Perhaps you can help me with the issue below, very likely the solution lies in a SQL\Database setting..

http://www.experts-exchange.com/Database/MS_Access/Q_28611342.html#a40609627

Thanks,
Ben
0
 
bfuchsAuthor Commented:
Hi Experts,

When you have a chance, please take a look at the following.

http://www.experts-exchange.com/questions/28671817/SQL-Query-question.html

Thanks,
Ben
0

Featured Post

Windows Server 2016: All you need to know

Learn about Hyper-V features that increase functionality and usability of Microsoft Windows Server 2016. Also, throughout this eBook, you’ll find some basic PowerShell examples that will help you leverage the scripts in your environments!

  • 5
  • 4
  • 3
Tackle projects and never again get stuck behind a technical roadblock.
Join Now