Potential Issue with UPDATE FROM??

Doug Bishop asked:
I recall back in the SQL Server 2000 days, this was considered bad and inefficient coding. Does it still hold true:

UPDATE b
SET b.something = a.something
FROM table1 a
INNER JOIN table2 b
ON a.key = b.key

The claimed issue was that the UPDATE statement is updating table2, but table1 is listed first in the FROM clause, which could hurt performance; supposedly it should be written with the target table first:

UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a
ON a.key = b.key

Could there be performance issues in newer versions (e.g. 2012 and up) with this syntax? If so, can someone explain how the inefficiencies show themselves? I've looked at some query plans going both ways and they seem identical.
Russ Suter, Senior Software Developer

Commented:
Short answer: your second example is the better way to do it.

Longer answer: it depends a lot on how things are indexed. The first approach should work fine, but you're updating across the join, which tends to slow things down a bit. Plug both of those into SSMS and compare the execution plan for each. It's entirely possible that the current SQL optimizer can optimize both queries equally well. I'm tempted to believe, however, that the second UPDATE statement will somehow be more efficient.
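To actually compare the two, you can capture the estimated plan for both forms in one batch. A sketch, using the placeholder table and column names from the question (key is bracketed because KEY is a reserved word in T-SQL; substitute your real names):

```sql
-- Run in SSMS: with SHOWPLAN_XML on, statements are compiled but not
-- executed, and each batch returns its estimated plan as XML.
SET SHOWPLAN_XML ON;
GO
UPDATE b
SET b.something = a.something
FROM table1 a
INNER JOIN table2 b ON a.[key] = b.[key];
GO
UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a ON a.[key] = b.[key];
GO
SET SHOWPLAN_XML OFF;
GO
```

If the two XML plans are identical apart from the statement text, the optimizer has normalized away the join order, which is what the question's own tests suggested.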
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014
Commented:
It shouldn't make any difference. I would expect SQL Server to generate the same plan either way, particularly for a join this straightforward.
This depends highly on the number of rows in the tables and also on the key type.
If both keys are PKs and the number of rows in both tables is approximately equal, there is no difference in timing, but the execution plans will differ. One table must be scanned row by row, and rows from the second are fetched by index seeks; with equal row counts, SQL Server handles the JOIN in the order specified by the query author.

If the row counts of the two tables differ significantly but the keys are still PKs, the engine decides to scan the smaller table and seek just the few matching rows in the larger one, and the two execution plans end up the same.

If the keys are not PKs, the statement can update one row multiple times, which is better avoided. I did not test this possibility, but it would be a good exercise...
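That last risk is easy to demonstrate. A minimal sketch with hypothetical temp tables, where the join key is deliberately not unique in the source:

```sql
-- The single target row matches two source rows, so which value "wins"
-- is not deterministic -- it depends on the plan the engine chooses.
CREATE TABLE #src (k INT, something INT);          -- k is NOT unique here
CREATE TABLE #tgt (k INT PRIMARY KEY, something INT);

INSERT INTO #src VALUES (1, 10), (1, 20);          -- duplicate key 1
INSERT INTO #tgt VALUES (1, 0);

UPDATE b
SET b.something = a.something
FROM #tgt b
INNER JOIN #src a ON a.k = b.k;

-- #tgt.something is now 10 or 20, with no guarantee which.
SELECT * FROM #tgt;

DROP TABLE #src;
DROP TABLE #tgt;
```

Because UPDATE...FROM gives no error when a target row qualifies more than once, this kind of silent ambiguity is a stronger argument for care than any join-order concern.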

Raja Jegan R, SQL Server DBA & Architect, EE Solution Guide
Awarded 2009
Distinguished Expert 2018

Commented:
For more clarity, let's dissect your query a little bit:
UPDATE b -- this specifies the table to be updated
SET b.something = a.something -- since the table to be updated is specified above, the b. prefix is not required; SET something = a.something is sufficient
-- This is the JOIN query to select what records are to be updated based upon the joins
FROM table2 b
INNER JOIN table1 a
ON a.key = b.key


With an INNER JOIN, table1 INNER JOIN table2 and table2 INNER JOIN table1 are treated the same; the same execution plan will be generated.
SQL Server generates the optimal execution plan based on the number of records in the tables along with the latest statistics.
With an OUTER JOIN, this might vary.

So, to answer your question: from SQL Server 2005 onwards, there is no performance difference between the two table orders for an INNER JOIN.
Doug Bishop, Database Developer

Author

Commented:
I did some testing in the environment I have available and it does not appear to make a difference. I like order, and to me it makes sense to have the target of the update in the FROM clause. I've seen queries where the table actually being updated is about the 5th one referenced in a query that has about 10 joins. VERY confusing.

It could be that the optimizer selects a different plan based on statistics; it isn't perfect. I've seen it pick a plan in production, with up-to-date statistics, that totally SUCKED! By adding an index hint (shrug), I took the query from almost 10 minutes to less than 45 seconds, consistently. A full table scan of tens of millions of rows was changed to an index seek. Phooey on SQL Server. :-)
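For anyone who hasn't used one: a table hint forcing a specific index looks like the following sketch, again on the question's placeholder tables, with a hypothetical index name:

```sql
-- Force the optimizer to use a specific index on the target table
-- instead of whatever access path it would pick on its own.
-- IX_table2_key is a hypothetical nonclustered index on table2([key]).
UPDATE b
SET b.something = a.something
FROM table2 b WITH (INDEX (IX_table2_key))
INNER JOIN table1 a ON a.[key] = b.[key];
```

Hints like this trade flexibility for predictability: the plan stops adapting to changing statistics, so they're best treated as a last resort after confirming the optimizer's own choice is bad.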
Queries with 10 joins are always tricky, so an index hint or a different query structure is the standard way to optimize them.
Also, the cardinality estimator changed in SQL Server 2014, and I would guess it is still changing in newer versions. Some queries behave better under the new versions, some don't. You may decide to use the old estimator, or you may study the execution plans and optimize by adding hints, which is OK in most cases.
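If you want to try the old estimator for a single statement (rather than dropping the whole database's compatibility level), SQL Server 2016 SP1 and later accept a query-level hint; a sketch on the question's placeholder tables:

```sql
-- Compile just this statement under the pre-2014 (legacy)
-- cardinality estimator, leaving the database settings alone.
UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a ON a.[key] = b.[key]
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
```

That makes it cheap to A/B the two estimators on a problem query before deciding anything more drastic.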

I don't know what testing you did, but the simple query from your question can produce different execution plans under certain conditions; it just seems you don't believe it.
Doug Bishop, Database Developer

Author

Commented:
I'm not saying it can't produce different plans, I'm just saying that in my testing environment it didn't. SQL Server is not perfect. Even an actual query plan is an estimate, based on limited time to evaluate all potential scenarios, the complexity of the query, and the freshness of the statistics. If you are joining 30 tables and each table has 10 indexes and there are millions of rows of data, the combinations that could be considered are overwhelming.
It would be nice if you could set an option telling SQL Server not to limit the resources it uses to estimate a plan. Have a production-clone database with limited activity on it, so SQL Server can spend 30 minutes chugging through every conceivable combination and churn out a "true" compiled query plan that could be saved in the file system, not cached and thrown out as soon as space is needed.

When you have thousands of stored procedures running hourly, daily, weekly, monthly, quarterly, and yearly, as well as on-demand jobs against terabyte databases, what is the chance your weekly proc is still in cache?
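You can actually check that. A quick sketch against the plan-cache DMVs (the procedure name is a placeholder):

```sql
-- Is a given procedure's plan currently in the plan cache?
-- 'MyWeeklyProc' is a hypothetical procedure name.
SELECT p.usecounts, p.cacheobjtype, p.objtype, t.text
FROM sys.dm_exec_cached_plans AS p
CROSS APPLY sys.dm_exec_sql_text(p.plan_handle) AS t
WHERE t.text LIKE '%MyWeeklyProc%';
```

No rows back means the plan was evicted and the next run pays for a fresh compile.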
Scott Pletcher, Senior DBA
Most Valuable Expert 2018
Top Expert 2014

Commented:
Almost none.  But a week old plan might not be that good either, no matter how long it took to create, particularly from SQL Server's viewpoint.

Who's to say you didn't add or delete hundreds of thousands, millions (billions?) of rows from one or more of the tables? Or didn't re-cluster / re-index them? It's somewhat of a leap of faith to rely on a week-old plan.
