SQL Query Parsing

pzozulka
How does SQL Server 2008 parse the following example? Does it first build all the relationships in the JOIN statement, and then start checking the conditions in the WHERE clause record by record?

Theory 1:
For example, it starts with record 1, and first checks if T1.Age > 25. If true, it checks the next condition in the WHERE clause. If it's false, it skips the rest of the conditions and moves on to the next record?

Theory 2:
Does it first retrieve all records which have T1.Age > 25? Next, does it check all records which have T2.Salary > 50000? And so on?

SELECT *
FROM Table1 T1 JOIN Table2 T2 ON T1.Id = T2.Id
WHERE T1.Age > 25
AND T2.Salary > 50000
AND T1.Disable = 0
AND T1.Id NOT IN (SELECT Id FROM T3)
Jim Horn, SQL Server Data Dude
Most Valuable Expert 2013
Author of the Year 2015
Commented:
SQL Server processes queries as a SET, meaning the entire output rowset, and not record-by-record.

From Microsoft's article on Order of Execution (requires a Microsoft account login; likely found in a hundred other places too):

1. FROM
2. ON
3. OUTER
4. WHERE
5. GROUP BY
6. CUBE | ROLLUP
7. HAVING
8. SELECT
9. DISTINCT
10. ORDER BY
11. TOP

So (doing some big summarizing here), it processes the FROM and JOIN tables first, then the WHERE, then does the GROUPing; everything in the SELECT comes late in the list, followed by ORDER BY.
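To make the list above concrete, here is a minimal sketch using the tables from the question. It only illustrates the logical processing order; the physical plan the optimizer picks can differ. The alias AgeNextYear is made up for the illustration.

```sql
-- Comments give each clause's position in the logical order listed above.
SELECT T1.Id, T1.Age + 1 AS AgeNextYear      -- 8. SELECT (alias is defined here)
FROM Table1 T1                               -- 1. FROM
JOIN Table2 T2 ON T1.Id = T2.Id              -- 2. ON
WHERE T1.Age > 25                            -- 4. WHERE
ORDER BY AgeNextYear;                        -- 10. ORDER BY (runs after SELECT, so it can see the alias)

-- By contrast, WHERE is logically processed before SELECT, so the alias
-- is not visible there and this statement is rejected as an invalid column name:
-- SELECT T1.Id, T1.Age + 1 AS AgeNextYear
-- FROM Table1 T1
-- WHERE AgeNextYear > 26;
```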

Author

Commented:
I'm only focused on #4, the WHERE clause. Is there a specific order in which conditions are checked in the WHERE clause, such as top to bottom (e.g. the 1st condition in the WHERE clause is checked first)?

I was told that this might be the case.
Jim Horn, SQL Server Data Dude
Most Valuable Expert 2013
Author of the Year 2015

Commented:
No.  The WHERE clause will process as a whole, from left to right, unless you include parentheses ( ) to force an explicit order of execution.

Having said that though, the answer could change based on the indexes you have on that table.  If one column in the WHERE is indexed and the others are not, it will probably start with that column first.

If you have the ability to create indexes on this table, to ensure the fastest execution consider creating a covering index that contains all columns in your SELECT and WHERE clause.
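A rough sketch of what such an index might look like for Table1 in the question. The index name is hypothetical, and the column choice assumes the query is narrowed to select only T1.Id from Table1; with SELECT * the index would have to carry every column to be covering.

```sql
-- Hypothetical index name. The equality column (Disable = 0) is placed first,
-- followed by the range column (Age > 25).
CREATE NONCLUSTERED INDEX IX_Table1_Disable_Age
ON Table1 (Disable, Age)
INCLUDE (Id);   -- non-key column stored at the leaf level so the join on Id needs no lookup
```

If Id is already the clustered index key, it is carried in every nonclustered index anyway and the INCLUDE is redundant. Whether the optimizer actually uses the index should be confirmed in the query plan.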

Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014
Commented:
>>  The WHERE clause will process as a whole, from left to right, unless you include parentheses ( ) to force an explicit order of execution. <<

I don't believe that's true, and you certainly can't rely on it.

SQL may decide to "short-circuit" and do certain comparisons first, or it may "pre-analyze" some.  You have no control over that, really; even parentheses only force SQL to use the order in determining whether a given expression is true or not, not necessarily the order in which to process the individual conditions.

Author

Commented:
But generally speaking, query parsing is done one row at a time, right?
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014
Commented:
No.  It's all parsed ("interpreted") as a unit.

SQL itself will ultimately have to test each row for the specified conditions, but it could do it in a lot of different orders and different ways.

UDB (IBM's DBMS) will completely rewrite parts of the logic for efficiency, changing conditions to functionally equivalent, but more efficient, conditions.

SQL Server doesn't do nearly as much of that -- at least not yet -- but you can't count on SQL doing anything in a specific order or in a particular way, except as already noted for CASE WHEN conditions.
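A small sketch of why this matters in practice. The table Staging and its column RawValue (varchar(9), holding a mix of digits and text) are hypothetical. As far as I understand the documented behaviour, CASE evaluates its WHEN conditions in order for simple scalar expressions like this one, which is what the second query leans on.

```sql
-- Risky: nothing guarantees the LIKE test is applied before the CAST,
-- so this can fail with a conversion error on non-numeric rows.
SELECT RawValue
FROM Staging
WHERE RawValue NOT LIKE '%[^0-9]%'
  AND CAST(RawValue AS int) > 5;

-- Safer: the CAST sits inside the THEN branch, so it is only reached
-- for rows that have already passed the digits-only check.
SELECT RawValue
FROM Staging
WHERE CASE
        WHEN RawValue <> '' AND RawValue NOT LIKE '%[^0-9]%'
        THEN CAST(RawValue AS int)
      END > 5;   -- rows failing the check yield NULL, which is filtered out
```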

Author

Commented:
Sorry that's what I meant.

SQL itself will ultimately have to test each row for the specified conditions

I understand the SQL query (the actual code) will be interpreted as a unit. What I meant was, once the query is analyzed, it will be optimized by the query optimizer (e.g. WHERE conditions will be reordered for improved efficiency), but at the end of the day, once that's done, the records (rows) themselves are being checked one by one, right?
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014
Commented:
Ultimately, presumably so.  But you'll need such a thing in a query plan, for example.

With RDBMSs, you need to think in terms of actions on sets of rows, not on an individual row by row basis.
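As an illustration of the difference, here is a sketch that gives a 5% raise to everyone in Table2 earning under 50000, written both ways. The raise itself is an invented scenario and Id is assumed to be an int; only the table and column names come from the question.

```sql
-- Set-based: one statement describes the whole set of rows to change,
-- and the engine decides how to find and update them.
UPDATE Table2
SET Salary = Salary * 1.05
WHERE Salary < 50000;

-- Row-by-row alternative: a cursor that issues one UPDATE per row.
-- It produces the same result but leaves the optimizer nothing to work with.
DECLARE @Id int;
DECLARE c CURSOR LOCAL STATIC FORWARD_ONLY FOR   -- STATIC: iterate over a snapshot of the matching Ids
    SELECT Id FROM Table2 WHERE Salary < 50000;
OPEN c;
FETCH NEXT FROM c INTO @Id;
WHILE @@FETCH_STATUS = 0
BEGIN
    UPDATE Table2 SET Salary = Salary * 1.05 WHERE Id = @Id;
    FETCH NEXT FROM c INTO @Id;
END;
CLOSE c;
DEALLOCATE c;
```

The two versions are alternatives, not steps to run in sequence. The cursor version also assumes Id is unique in Table2.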

Author

Commented:
Can you please elaborate further on what you mean by:
think in terms of actions on sets of rows
I've been writing SQL queries for the past 5 years, but this is the first time where I have to write them with performance in mind. This means, instead of simply thinking about getting results, I now have to think in terms of how the results are retrieved.
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
CORRECTION:
>> But you'll need such a thing in a query plan, for example. <<

But you'll never see such a thing [row by row comparisons of table columns] in a query plan, for example.


Yes, you have to make sure you don't write the SQL code such that it forces poor performance.  For that you must look at the query plan.  If it contains too many scans, esp. full table scans and/or too many bad row count estimates on large tables and/or too many other problems, such as implicit conversions or key columns not being used when they should be, you have to adjust the SQL to correct those issues.
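One way to inspect these things for the query in the question, sketched for SQL Server Management Studio or sqlcmd (GO is a batch separator understood by those tools, not a T-SQL statement):

```sql
-- Report logical reads and CPU/elapsed time per statement in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
GO

SELECT *
FROM Table1 T1 JOIN Table2 T2 ON T1.Id = T2.Id
WHERE T1.Age > 25
AND T2.Salary > 50000
AND T1.Disable = 0
AND T1.Id NOT IN (SELECT Id FROM T3);
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO

-- Return the estimated plan as XML without executing the query.
-- SET SHOWPLAN_XML must be the only statement in its batch.
SET SHOWPLAN_XML ON;
GO

SELECT *
FROM Table1 T1 JOIN Table2 T2 ON T1.Id = T2.Id
WHERE T1.Age > 25;
GO

SET SHOWPLAN_XML OFF;
GO
```

In the plan, scan operators on large tables, big gaps between estimated and actual row counts, and CONVERT_IMPLICIT in predicates are the kinds of problems described above.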
