D B asked:

Potential Issue with UPDATE FROM??

I recall that back in the SQL Server 2000 days this was considered bad, inefficient coding. Does that still hold true:

UPDATE b
SET b.something = a.something
FROM table1 a
INNER JOIN table2 b
ON a.[key] = b.[key]

The supposed issue is that the UPDATE statement updates table2 while table1 is listed first in the FROM clause, which could hurt performance, so it should instead be written:

UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a
ON a.[key] = b.[key]

Could there be performance issues in newer versions (e.g. 2012 and up) with this syntax? If so, can someone explain how the inefficiencies show themselves? I've looked at some query plans going both ways and they seem identical.
Tags: Microsoft SQL Server, SQL

Russ Suter

Short answer: your second example is the better way to do it.

Longer answer: it depends a lot on how things are indexed. The first approach should work, but you're updating across the join, which tends to slow things down a bit. Plug both of those into SSMS and compare the execution plans. It's entirely possible that the current optimizer routines can optimize both queries equally well; I'm tempted to believe, however, that the second UPDATE statement will somehow be more efficient.
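
A quick sketch of that comparison (table and column names are the placeholders from the question, not a real schema; in SSMS you would also enable "Include Actual Execution Plan"):

-- Compare I/O and timing for the two UPDATE forms.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Form 1: source table listed first in the FROM clause.
UPDATE b
SET b.something = a.something
FROM table1 a
INNER JOIN table2 b ON a.[key] = b.[key];

-- Form 2: target table listed first in the FROM clause.
UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a ON a.[key] = b.[key];

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;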
Pavel Celba

This depends greatly on the number of rows in the tables and also on the key type.
If both keys are PKs and the two tables have approximately equal row counts, then there is no difference in timing, but the execution plans will differ: one table must be scanned row by row, while rows from the second are retrieved by seeks. When the row counts are equal, SQL Server handles the JOIN in the order specified by the query author.

If the row counts of the two tables differ significantly but the keys are still PKs, the SQL engine decides to scan the smaller table and seek just the few matching rows in the large table, and the two execution plans come out identical.

If the keys are not PKs, the join can result in multiple updates of one row, which is better to avoid. I did not test this possibility, but it would be a good exercise...
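
That last case is easy to demonstrate with a small, self-contained sketch (hypothetical table variables; when the join key is not unique in the source, the target row matches more than once and the value it ends up with depends on the plan, not the query text):

-- Hypothetical demo: duplicate join keys in the source.
DECLARE @src TABLE (k INT, v INT);
DECLARE @tgt TABLE (k INT, v INT);

INSERT INTO @src VALUES (1, 100), (1, 200); -- k = 1 appears twice
INSERT INTO @tgt VALUES (1, 0);

UPDATE t
SET t.v = s.v
FROM @tgt t
INNER JOIN @src s ON s.k = t.k;

-- Which of 100 or 200 the row keeps is not deterministic.
SELECT * FROM @tgt;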
Raja Jegan R

For more clarity, let's dissect your query a little bit:
UPDATE b -- This specifies the table to be updated.
SET b.something = a.something -- Since the table to be updated is specified above, the b. prefix is not required; SET something = a.something is sufficient.
-- The JOIN below selects which records are to be updated.
FROM table2 b
INNER JOIN table1 a
ON a.[key] = b.[key]

With an INNER JOIN, table1 INNER JOIN table2 and table2 INNER JOIN table1 are treated the same; the same execution plan will be generated. SQL Server generates the optimal execution plan based on the number of records in the tables along with the latest statistics.
For an OUTER JOIN, this might vary.

So, to answer your question: from SQL Server 2005 onwards, there is no performance difference between listing table1 or table2 first in an INNER JOIN.
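
If you want to verify that on your own system, one rough sketch is to run both forms and then compare their plan hashes in the cache DMVs; identical hashes mean the optimizer produced the same plan shape for both orderings (the LIKE filter below is a hypothetical way to find the two statements):

-- Compare compiled plans for the two statement variants.
SELECT qs.query_plan_hash, st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%something = a.something%'; -- hypothetical filter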
D B

ASKER
I did some testing in the environment I have available, and it does not appear to make a difference. I like order, and to me it makes sense to have the target of the update listed first in the FROM clause. I've seen queries where the table actually being updated is about the fifth one referenced in a query with about ten joins. VERY confusing.

The optimizer could still select a different plan based on statistics; it isn't perfect. I've seen it pick a plan in production, with up-to-date statistics, that totally SUCKED! By adding an index hint (shrug), the query went from almost 10 minutes to less than 45 seconds, consistently. A full table scan of tens of millions of rows was changed to an index seek. Phooey on SQL Server. :-)
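
For reference, that kind of hint is just a table hint on the query; a minimal sketch with made-up table and index names:

-- dbo.Orders and IX_Orders_CustomerID are illustrative names only.
SELECT o.OrderID, o.TotalDue
FROM dbo.Orders AS o WITH (INDEX (IX_Orders_CustomerID))
WHERE o.CustomerID = 42;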
Pavel Celba

Queries with 10 joins are always tricky, so an index hint or a different query structure is the standard way to optimize them.
Also, the cardinality estimator changed in SQL 2014, and I would guess it is still changing in newer versions. Some queries behave better under the new estimator, some don't. You may decide to use the old one, or you may study the execution plans and optimize by adding hints, which is OK in most cases.

I don't know what testing you did, but the simple query from your question can produce different execution plans under certain conditions, even if it seems you don't believe it.
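
For what it's worth, "using the old one" can be done per statement; a hedged sketch (USE HINT requires SQL Server 2016 SP1 or later; on 2014 the documented equivalent is OPTION (QUERYTRACEON 9481)):

-- Force the pre-2014 ("legacy") cardinality estimator for one statement.
UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a ON a.[key] = b.[key]
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));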
D B

ASKER
I'm not saying it can't produce different plans; I'm just saying that in my testing environment it didn't. SQL Server is not perfect. Even an actual query plan is an estimate, based on the limited time available to evaluate potential scenarios, the complexity of the query, and the freshness of the statistics. If you are joining 30 tables, each with 10 indexes, over millions of rows of data, the number of combinations that could be considered is overwhelming.
It would be nice if you could set an option telling SQL Server not to limit the resources it uses to estimate a plan. Have a production-clone database with limited activity on it, so SQL Server can spend 30 minutes chugging through every conceivable combination and churn out a "true" compiled query plan that could be saved in the file system, not cached and thrown out as soon as space is needed.
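
The closest built-in feature I know of is plan forcing via the Query Store (SQL Server 2016+): it pins a known plan rather than making the optimizer search exhaustively, but the forced plan is persisted in the user database and survives cache eviction and restarts. A minimal sketch (the IDs are placeholders you would look up first):

-- Enable the Query Store for the current database (2016+).
ALTER DATABASE CURRENT SET QUERY_STORE = ON;

-- Find the query and its captured plans.
SELECT q.query_id, p.plan_id, qt.query_sql_text
FROM sys.query_store_query AS q
JOIN sys.query_store_plan AS p ON p.query_id = q.query_id
JOIN sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id;

-- Pin the chosen plan (placeholder IDs).
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;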

When you have thousands of stored procedures running hourly, daily, weekly, monthly, quarterly, yearly, as well as on-demand jobs against terabyte databases, what is the chance your weekly proc is still in cache?
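
One rough way to check, for what it's worth (the procedure name is a placeholder):

-- Is a given stored procedure's plan still in cache?
SELECT cp.usecounts, cp.objtype, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Proc'
AND st.text LIKE '%MyWeeklyProc%'; -- placeholder name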
Scott Pletcher

Almost none. But a week-old plan might not be that good either, no matter how long it took to create, particularly from SQL Server's viewpoint.

Who's to say you didn't add or delete hundreds of thousands, millions (billions?) of rows from one or more of the tables? Or didn't re-cluster / re-index them? It's somewhat of a leap of faith to rely on a week-old plan.