Potential Issue with UPDATE FROM?

I recall that back in the SQL Server 2000 days, this was considered bad and inefficient coding. Does it still hold true for a query like this:

UPDATE b
SET b.something = a.something
FROM table1 a
INNER JOIN table2 b
ON a.[key] = b.[key]

The issue, as it was explained back then, is that the UPDATE statement is updating table2 while table1 is the first table in the FROM clause; that was said to hurt performance, and the query should supposedly be written:

UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a
ON a.[key] = b.[key]

Could there be performance issues with the first syntax in newer versions (e.g. 2012 and up)? If so, can someone explain how the inefficiencies show themselves? I've looked at some query plans going both ways and they seem identical.
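
For anyone who wants to reproduce this, here is a minimal setup along the lines of what I tested (the table and column names are illustrative; "key" is bracketed because KEY is a reserved word):

CREATE TABLE table1 ([key] INT PRIMARY KEY, something VARCHAR(50));
CREATE TABLE table2 ([key] INT PRIMARY KEY, something VARCHAR(50));

INSERT INTO table1 ([key], something) VALUES (1, 'a'), (2, 'b');
INSERT INTO table2 ([key], something) VALUES (1, NULL), (2, NULL);

-- Form 1: source table listed first in the FROM clause
UPDATE b
SET b.something = a.something
FROM table1 a
INNER JOIN table2 b ON a.[key] = b.[key];

-- Form 2: target table listed first in the FROM clause
UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a ON a.[key] = b.[key];
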
Asked by Doug Bishop (Database Developer)

Russ Suter (Senior Software Developer) commented:
Short answer: your second example is the better way to do it.

Longer answer: it depends a lot on how things are indexed. The first approach should work fine, but you're updating across the join, which tends to slow things down a bit. Plug both of those into SSMS and compare the execution plans. It's entirely possible that the current SQL optimizer can optimize both queries equally well. I'm tempted to believe, however, that the second UPDATE statement will somehow be more efficient.
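
If you want the raw plans rather than the graphical view, one way (a sketch, reusing the names from the question) is SHOWPLAN_XML, which returns the estimated plan without executing the statement:

SET SHOWPLAN_XML ON;
GO
-- Run each form in its own batch and save the XML it returns.
UPDATE b
SET b.something = a.something
FROM table1 a
INNER JOIN table2 b ON a.[key] = b.[key];
GO
SET SHOWPLAN_XML OFF;
GO

Repeat with the second form and diff the two XML documents.
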
Scott Pletcher (Senior DBA) commented:
It shouldn't make any difference. I would expect SQL Server to generate the same plan either way, particularly for a join this straightforward.

pcelba commented:
This depends heavily on the number of rows in the tables and also on the key type.

If both keys are PKs and the two tables have roughly equal row counts, there is no difference in timing, but the execution plans will differ: one table must be scanned row by row while matching rows in the second are fetched by index seeks. When the row counts are equal, SQL Server handles the JOIN in the order specified by the query author.

If the row counts differ significantly but the keys are still PKs, the engine chooses to scan the smaller table and seek just the few matching rows in the large table, and the two execution plans come out identical.

If the keys are not PKs, the query can update a single row multiple times, which is better avoided. I did not test this possibility, but it would be a good exercise...
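
Here is a sketch of that scenario (hypothetical tables; untested here, as noted):

-- Non-PK join key: one target row matches two source rows
CREATE TABLE src (k INT, val VARCHAR(10));
CREATE TABLE tgt (k INT, val VARCHAR(10));

INSERT INTO src (k, val) VALUES (1, 'first'), (1, 'second');
INSERT INTO tgt (k, val) VALUES (1, NULL);

-- Which source value ends up in tgt is not defined,
-- and no error or warning is raised.
UPDATE t
SET t.val = s.val
FROM tgt t
INNER JOIN src s ON s.k = t.k;

SELECT * FROM tgt; -- 'first' or 'second', depending on the plan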

Raja Jegan R (SQL Server DBA & Architect, EE Solution Guide) commented:
For more clarity, let's dissect your query a bit:

UPDATE b -- specifies the table (alias) to be updated
SET b.something = a.something -- since the target is named above, the "b." prefix is optional; SET something = a.something is sufficient
-- the JOIN below determines which records are updated
FROM table2 b
INNER JOIN table1 a
ON a.[key] = b.[key]

With an INNER JOIN, table1 INNER JOIN table2 and table2 INNER JOIN table1 are treated the same; the same execution plan will be generated. SQL Server builds the optimal execution plan based on the number of records in the tables along with the latest statistics. With an OUTER JOIN the order can matter, because it changes which table's unmatched rows are preserved.

So, to answer your question: from SQL Server 2005 onwards there is no performance difference between the two table orders for an INNER JOIN.
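
If you'd rather verify this than eyeball the graphical plans, one approach (a sketch; the LIKE filter is just a crude way to find the two statements in cache) is to run both forms and compare their plan hashes:

SELECT st.text, qs.query_plan_hash
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%INNER JOIN table%';

-- Identical query_plan_hash values mean the optimizer produced
-- the same plan shape for both statements.
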
Doug Bishop (Database Developer, Author) commented:
I did some testing in the environment I have available, and it does not appear to make a difference. I like order, and to me it makes sense to list the target of the update first in the FROM clause. I've seen queries where the table actually being updated is about the fifth one referenced in a query with about ten joins. VERY confusing.

The optimizer could still select a different plan based on statistics; it isn't perfect. I've seen it pick a plan in production, with up-to-date statistics, that totally SUCKED! By adding an index hint (shrug), the query went from almost 10 minutes to less than 45 seconds, consistently. A full table scan of tens of millions of rows was changed to an index seek. Phooey on SQL Server. :-)
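
The hint was along these lines (illustrative table and index names; the real ones were different):

UPDATE b
SET b.something = a.something
FROM table2 b WITH (INDEX (IX_table2_key)) -- force the index the optimizer skipped
INNER JOIN table1 a ON a.[key] = b.[key];
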
pcelba commented:
Queries with 10 joins are always tricky... so an index hint or a different query structure is the standard way to optimize them.
Also, the cardinality estimator changed in SQL 2014, and I would guess it is still changing in newer versions. Some queries behave better under the new estimator, some don't. You may decide to use the old one, or you may study the execution plans and optimize by adding hints, which is OK in most cases.
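
For reference, a single query can be pinned to the pre-2014 estimator without changing any database-wide setting (a sketch, reusing the question's tables):

UPDATE b
SET b.something = a.something
FROM table2 b
INNER JOIN table1 a ON a.[key] = b.[key]
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION')); -- SQL 2016 SP1 and later

-- On SQL 2014, OPTION (QUERYTRACEON 9481) has the same effect
-- (it requires sysadmin rights).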

I don't know what testing you did, but the simple query from your question can produce different execution plans under certain conditions; it seems you don't believe that.
Doug Bishop (Database Developer, Author) commented:
I'm not saying it can't produce different plans; I'm just saying that in my testing environment it didn't. SQL Server is not perfect. Even an actual query plan is an estimate, based on the limited time available to evaluate all potential scenarios, the complexity of the query, and the freshness of the statistics. If you are joining 30 tables, each with 10 indexes, and there are millions of rows of data, the number of combinations that could be evaluated is overwhelming.

It would be nice if you could set an option telling SQL Server not to limit the resources it uses to build a plan: point it at a production-clone database with limited activity, let it spend 30 minutes chugging through every conceivable combination, and have it churn out a "true" compiled query plan that could be saved in the file system rather than cached and thrown out as soon as space is needed.
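
For what it's worth, Query Store (SQL Server 2016 and later) comes close to that wish: it persists plans inside the database itself and lets you force one (a sketch; the IDs below are placeholders you would look up first):

ALTER DATABASE CURRENT SET QUERY_STORE = ON;

-- Find the query and its captured plans...
SELECT q.query_id, p.plan_id
FROM sys.query_store_query AS q
INNER JOIN sys.query_store_plan AS p ON p.query_id = q.query_id;

-- ...then pin the plan you trust:
EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7; -- placeholder IDs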

When you have thousands of stored procedures running hourly, daily, weekly, monthly, quarterly, and yearly, as well as on-demand jobs against terabyte databases, what is the chance your weekly proc is still in cache?
Scott Pletcher (Senior DBA) commented:
Almost none. But a week-old plan might not be that good either, no matter how long it took to create, particularly from SQL Server's viewpoint.

Who's to say you didn't add or delete hundreds of thousands, millions (billions?) of rows from one or more of the tables? Or didn't re-cluster or re-index them? It's somewhat of a leap of faith to rely on a week-old plan.