Solved

Questions regarding SQL performance

Posted on 2015-02-12
12
261 Views
Last Modified: 2015-05-12
Hi Experts,

I have the following questions re sql performance

1- When filtering for a column that is an int, and looking for values 0 or null, what is better for performance to write (columnname = 0 or columnname is null) or isnull(columnname,0)=0.

2- When having two or more conditions in the where clause, does it makes a difference which goes first?

3- Also in the where clause, when having multiple conditions from type where exists (select 1 from tablename where id = ...) and exists (select 1 from tablename where id = ...)  and so on, does it makes a difference which table goes first?

4- And on above example is it better to have where exists (select 1 ...)  or  select null?

5- When checking for a single value is it ok to to write columnnane in ('value') or its better to have columnname = 'value'?
0
Comment
Question by:bfuchs
  • 5
  • 4
  • 3
12 Comments
 
LVL 48

Accepted Solution

by:
PortletPaul earned 250 total points
Comment Utility
Some of the answers to those questions will be opinion based, so others may disagree with what I say here.

1. This is the better form:  (columnname = 0 or columnname is null)
as a general rule it is better to avoid functions that operate on each row of the resultset.
but, in practice the choice between the 2 options would make no discernible difference to performance and no differnce in outcome.

2. No. (Well it "should not" make any difference)

3. see 2

4. I have debated this with others. I used to use "select 1" but now prefer "select null"
but neither make any difference. Even "select *" is the same result (or should be)
EXISTS really doesn't need to return any data and to an extent the select clause is redundant, but due to the rules of SQL it has to be present.

+edit:
regarding: 1

sargable: Search ARGument ABLE
e.g. see http://en.wikipedia.org/wiki/Sargable

Rules of thumb
Avoid applying functions on data values in a sql condition.
Avoid non-sargable predicates and replace them with sargable equivalents.
0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 250 total points
Comment Utility
1: ALWAYS (columnname = 0 or columnname is null)
    This does make a huge difference when the column is indexed.
    Indeed, the rule is NEVER use ISNULL() in a WHERE or JOIN clause.

2. Typically no.  SQL will rearrange the order in which it checks the conditions anyway, if it thinks that's better.

3. Perhaps.  Put the most likely match first, down to the least likely.

4. Not enough difference, if any, to worry about.

5. Either is fine, since they are processed the same.  An IN will get converted by SQL into multiple ORs anyway.
0
 
LVL 3

Author Comment

by:bfuchs
Comment Utility
Hi PortletPaul,

What about question#5 (added later)?
The reason for my asking this is that my SQL statement gets build dynamically and if I can have columnname  in ('value') with no extra cost I would prefer that rather then having to determine how many values users have selected.

Some of the answers to those questions will be opinion based
Is there a way to verify these things by sql plan or other SQL utility?
0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 250 total points
Comment Utility
Here's a quick script to verify point #1:

--create test table and load 1M test rows

IF OBJECT_ID('tempdb.dbo.#test_isnull') IS NOT NULL
    DROP TABLE #test_isnull

CREATE TABLE #test_isnull (
    test_value datetime NULL
    )
CREATE CLUSTERED INDEX test_isnull_cl on #test_isnull ( test_value )

;WITH
cteTally10 AS (
    SELECT 0 AS tally UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL
    SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0
),
cteTally100 AS (
    SELECT 1 AS tally
    FROM cteTally10 c1
    CROSS JOIN cteTally10 c2
),
cteTally10K AS (
    SELECT 1 AS tally
    FROM cteTally100 c1
    CROSS JOIN cteTally100 c2
),
cteTally1Mil AS (
    SELECT ROW_NUMBER() OVER(ORDER BY c1.tally) AS tally
    FROM cteTally100 c1
    CROSS JOIN cteTally10K c2
)
INSERT INTO #test_isnull ( test_value )
SELECT CASE WHEN t.tally % 100000 = 0 THEN NULL ELSE DATEADD(MINUTE, t.tally, '20000101') END
FROM cteTally1Mil t
SELECT 'Look at the "Messages" tab for query stats!' AS msg, COUNT(*) AS test_isnull_row_count FROM #test_isnull

--check stats for same WHERE condition, one using (IS NULL OR ...) the other ISNULL().
SET STATISTICS IO ON

SELECT *
FROM #test_isnull
WHERE test_value IS NULL OR test_value = '20000101'

SELECT *
FROM #test_isnull
WHERE ISNULL(test_value, '20000101') = '20000101'

SET STATISTICS IO OFF

IF OBJECT_ID('tempdb.dbo.#test_isnull') IS NOT NULL
    DROP TABLE #test_isnull
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Scott's point about Q1 & indexes is very valid, my bad.
0
 
LVL 48

Assisted Solution

by:PortletPaul
PortletPaul earned 250 total points
Comment Utility
sorry I missed point 5

there is no difference.

IN ()  is really a "shortcut" and SQL will treat each member of the list as an equal predicate join by OR

i.e.

where [fieldx] IN (1)

is the same as

where [fieldx] = 1

I don't like using IN () for single values and see no benefit in doing it.

where [fieldx] IN (1,2)

is the same as

where ( [fieldx] = 1 OR [fieldx] = 2 )
0
How to improve team productivity

Quip adds documents, spreadsheets, and tasklists to your Slack experience
- Elevate ideas to Quip docs
- Share Quip docs in Slack
- Get notified of changes to your docs
- Available on iOS/Android/Desktop/Web
- Online/Offline

 
LVL 3

Author Comment

by:bfuchs
Comment Utility
So here goes the SQL team-:)

@ScottPletcher,
3. Perhaps.  Put the most likely match first, down to the least likely.
I see this is the topic you have different opinions..
Perhaps you have a script to verify that too?

Just to clarify, most likely match means the one that will narrow most the results?

Wouldn't it also depend on size of table's or other factors?
0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 250 total points
Comment Utility
It could.  But that gets so complex it'd be just about impossible for a person to accurately pre-determine it.

Coding in most-likely order is generally more known.  And is best if SQL executes them in the order written, which it's likely to do unless it can determine a specific advantage to running them differently.  A quick mathematical calc can verify that it's better to check for the most common conditions first.
0
 
LVL 48

Assisted Solution

by:PortletPaul
PortletPaul earned 250 total points
Comment Utility
With respect to 3, it is not possible to guarantee an optimizer will perform the operations in the order presented to it by a sql script. You CAN influence the sequence (e.g. by using a derived table, or by using hints).

I recall being told many years ago that "the where clause prefers equals"
that is, do exact matches in preference to anything else

e.g.
where [string_field]= like "blah%" and [boolean_field]=1

would be better as:

where [boolean_field]=1 and [string_field]= like "blah%"

but I cannot guarantee the optimizer will agree with me

---
Performance optimization is a very large topic
0
 
LVL 3

Author Closing Comment

by:bfuchs
Comment Utility
Thank you guys, looking forward to work with you in my efforts to improve our database.
0
 
LVL 3

Author Comment

by:bfuchs
Comment Utility
Hi SQL Experts,

Perhaps you can help me with the issue below, very likely the solution lies in a SQL\Database setting..

http://www.experts-exchange.com/Database/MS_Access/Q_28611342.html#a40609627

Thanks,
Ben
0
 
LVL 3

Author Comment

by:bfuchs
Comment Utility
Hi Experts,

When you have a chance, please take a look at the following.

http://www.experts-exchange.com/questions/28671817/SQL-Query-question.html

Thanks,
Ben
0

Featured Post

Threat Intelligence Starter Resources

Integrating threat intelligence can be challenging, and not all companies are ready. These resources can help you build awareness and prepare for defense.

Join & Write a Comment

Entering a date in Microsoft Access can be tricky. A typo can cause month and day to be shuffled, entering the day only causes an error, as does entering, say, day 31 in June. This article shows how an inputmask supported by code can help the user a…
Read about achieving the basic levels of HRIS security in the workplace.
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

10 Experts available now in Live!

Get 1:1 Help Now