SQL query performance in high traffic environment

What is the best approach in high traffic environments with millions of records:

1) Use a temp table: SELECT a filtered-down subset of the data and insert it into the temp table, so that all subsequent work runs against a much smaller data set.

2) Avoid using temp tables because INSERTs are so expensive.
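To make option 1 concrete, the pattern would be something like this (table and column names are made up for illustration):

```sql
-- Stage a filtered subset in a temp table, then work only against the subset.
-- dbo.Orders, OrderDate, CustomerId, TotalAmount are hypothetical names.
SELECT OrderId, CustomerId, TotalAmount
INTO #RecentOrders
FROM dbo.Orders
WHERE OrderDate >= DATEADD(DAY, -30, GETDATE());  -- filter as early as possible

-- All further work touches only the much smaller subset.
SELECT CustomerId, SUM(TotalAmount) AS Total30Days
FROM #RecentOrders
GROUP BY CustomerId;

DROP TABLE #RecentOrders;
```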



lcohan (Database Analyst) commented:
In my opinion you need a temp table only if you must pull a subset of data from (very) large tables, as you said, AND then modify that subset in the stored procedure before returning a result set; or if you need to gather multiple data sets AND then aggregate them, instead of aggregating the massive data directly.

The important thing is to make sure your tempdb is optimized (please see the links below to help you do that), and keep in mind NOT to mix #temp tables and @table variables in the same stored procedure, as the result can be (most of the time, in my experience) disastrous from a performance point of view; #temp tables on their own work fine.
Another benefit of #temp tables is that they can hold large volumes (millions of rows, as you said), and you can add indexes on #temp tables for further use in JOINs/aggregations.

Use SSMS's "Display Estimated Execution Plan" or "Include Actual Execution Plan" options to find missing indexes and performance issues in the SQL code.
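A sketch of both points together: adding an index to a #temp table that is reused in a later JOIN, and turning on I/O statistics so the cost can be compared in the plan (all object names are hypothetical):

```sql
-- Show logical reads per table in the Messages tab, to compare approaches.
SET STATISTICS IO ON;

CREATE TABLE #ActiveCustomers
(
    CustomerId int NOT NULL PRIMARY KEY  -- clustered index for the join below
);

INSERT INTO #ActiveCustomers (CustomerId)
SELECT DISTINCT CustomerId
FROM dbo.Orders
WHERE OrderDate >= DATEADD(DAY, -30, GETDATE());

-- The indexed temp table keeps this join cheap even when reused repeatedly.
SELECT c.CustomerId, c.CustomerName
FROM dbo.Customers c
JOIN #ActiveCustomers a ON a.CustomerId = c.CustomerId;

DROP TABLE #ActiveCustomers;
```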


Scott Pletcher (Senior DBA) commented:
By far the most important thing is to make sure all tables have:

(1) the best clustered index, as indicated by missing index usage and index usage stats, and your own knowledge of table usage

(2) other nonclustered index(es), as indicated by missing index usage and index usage stats

After that, you can look at specific low-performing queries and determine how to tune each one specifically.
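For example, the missing-index DMVs can be queried directly to see what SQL Server itself has flagged (a standard pattern; treat the suggestions as hints to evaluate, not as gospel):

```sql
-- Rank missing-index suggestions by estimated overall impact.
SELECT TOP (20)
       d.statement          AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact    -- estimated % cost reduction if the index existed
FROM sys.dm_db_missing_index_details d
JOIN sys.dm_db_missing_index_groups g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats s
  ON s.group_handle = g.index_group_handle
ORDER BY s.user_seeks * s.avg_total_user_cost * s.avg_user_impact DESC;
```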
pzozulka (Author) commented:
Scott, my question was more in terms of future queries' architecture decisions, and whether it's better to work with smaller aggregates (#temp tables) or with the "real" tables directly to avoid expensive inserts.

Granted, having good indexes will improve performance no matter what. But I think that's beside the point and doesn't address my question (query architecture).

Scott Pletcher (Senior DBA) commented:
OK, as you wish. But without that foundation, your question is misguided, because you can't solve performance problems (except with bigger and bigger hardware, which is a half-solution at best). It's like trying to improve gas mileage while hauling a half-ton boulder. Good luck with it, though.
lcohan (Database Analyst) commented:
"Future queries' architecture decisions" should be made based on needs, right?
If you do not need to modify/aggregate subsets of data within a SQL code object, you can use a CTE or subqueries; just make sure to filter your data as much as possible, aside from having the good indexes. Use query plans to confirm there are no huge intermediate record sets, and eliminate all those "thick" lines.

If you need the opposite, do not be afraid to use them. For instance, in some large, longer-running jobs I put tens or even hundreds of millions of rows into temp tables, but very "short" records: one or maybe two columns at the most (an Id and something else). Just make sure you optimize your tempdb.
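The CTE approach would look something like this (a hypothetical equivalent of the temp-table example above; the subset is defined inline and never materialized, so there is no INSERT cost):

```sql
-- Same filter-first idea, but as a CTE: no temp table, no INSERT overhead.
WITH RecentOrders AS
(
    SELECT OrderId, CustomerId, TotalAmount
    FROM dbo.Orders
    WHERE OrderDate >= DATEADD(DAY, -30, GETDATE())
)
SELECT CustomerId, SUM(TotalAmount) AS Total30Days
FROM RecentOrders
GROUP BY CustomerId;
```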
pzozulka (Author) commented:
I'm confused. While indexes are likely the MOST important part of performance, isn't a badly written query the other part?

Also, I have no control over the indexes in our "real" tables. I have to work under the assumption that they are correct and optimized. The only indexes I have control over are the ones I can create on my temp tables (if I choose to go that route).
pzozulka (Author) commented:
lcohan, are CTEs the same as #temp tables in terms of performance? I always thought CTEs were weaker in performance, primarily because you can't assign a clustered index to them, and no primary keys are involved. I also try to avoid subqueries, because most of the subqueries I need tend to be "correlated" subqueries, which are very slow from my understanding.
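For reference, here is the kind of correlated subquery I mean, and the set-based rewrite I usually reach for instead (table names made up):

```sql
-- Correlated form: conceptually, the inner query runs once per outer row.
SELECT o.OrderId, o.TotalAmount
FROM dbo.Orders o
WHERE o.TotalAmount = (SELECT MAX(o2.TotalAmount)
                       FROM dbo.Orders o2
                       WHERE o2.CustomerId = o.CustomerId);

-- Set-based form: one aggregate pass over the table, then a join.
SELECT o.OrderId, o.TotalAmount
FROM dbo.Orders o
JOIN (SELECT CustomerId, MAX(TotalAmount) AS MaxAmount
      FROM dbo.Orders
      GROUP BY CustomerId) m
  ON m.CustomerId = o.CustomerId
 AND o.TotalAmount = m.MaxAmount;
```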
Scott Pletcher (Senior DBA) commented:
Yes, a poorly written query can hinder performance. But that can be easily corrected with some basic rules for writing queries.

On the other hand, poor indexing cannot be overcome no matter how well, or how cleverly, you write any query.
lcohan (Database Analyst) commented:
CTEs are not "weaker in performance primarily because you can't assign a clustered index on them," and you don't need primary keys on them either, in my opinion.
Subqueries are sometimes actually faster, but it all depends on how well your data is organized/indexed and how the code is written.

You can't just "pour butter over the entire bread" and say ALL #temp tables or NO #temp tables in my database architecture. Aside from that, compilation/recompilation may be a factor when using #temp tables, etc. There are lots of things to consider, but, as Scott said, the question is somewhat misguided, because #temp tables and "high traffic environments with millions of records" do not include or exclude each other. Good luck!
Scott Pletcher (Senior DBA) commented:
For certain uncommon types of complex tasks, particularly those with multiple joins or lookups, temp tables can still improve overall performance.

But a derived table or some type of subquery can often avoid temp tables entirely, and is preferable when it can, because the overhead of the additional INSERTs is avoided. Getting a query to efficiently produce the desired result set, either as a derived table or as a subquery, almost always depends on the indexing available on the underlying tables.
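A derived-table version of the earlier examples would be something like this (hypothetical object names); the optimizer sees the whole statement at once and can push the filter down into the base-table access:

```sql
-- Same result as staging in a temp table, but with no INSERT overhead:
-- the filtered aggregate is a derived table, inlined into one statement.
SELECT r.CustomerId, c.CustomerName, r.Total30Days
FROM (
    SELECT CustomerId, SUM(TotalAmount) AS Total30Days
    FROM dbo.Orders
    WHERE OrderDate >= DATEADD(DAY, -30, GETDATE())
    GROUP BY CustomerId
) AS r
JOIN dbo.Customers c ON c.CustomerId = r.CustomerId;
```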
Can you be a bit more clear on what the uses are/will be?

Are you needing a roll up of data for reporting/ad hoc querying?  Is it too slow running against the live data?