pzozulka
asked on
SQL query performance in high traffic environment
What is the best approach in high traffic environments with millions of records:
1) Make your query use a temp table to SELECT a filtered down subset of data and insert it into your temp table. This way you can now work with your temp table on a much smaller data set.
2) Avoid using temp tables because INSERTs are so expensive.
1) Make your query use a temp table to SELECT a filtered down subset of data and insert it into your temp table. This way you can now work with your temp table on a much smaller data set.
2) Avoid using temp tables because INSERTs are so expensive.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
OK, as you wish. But without that foundation, your q is misguided, because you can't solve performance problems (except with bigger and bigger hardware, which is a half-solution at best). It's like trying to improve gas mileage while hauling a half-ton boulder. Good luck with it, though.
" future queries' architecture decisions" should be made based on needs right?
If you do not need to modify/aggregate subsets of data in a SQL code object you can use CTE or subqueries just make sure to filter your data as much as possible aside of the good indexes - use query plans to make sure there are no huge intermediate recored sets - eliminate all those "thick" lines.
If you need the opposite do not be afraid to use them - for instance in some large and longer running jobs I use tens even hundreds of millions of rows in temp tables for that matter but very "short" records 1 maybe 2 columns - Id and something else at the most - just make sure you optimize your tempdb.
If you do not need to modify/aggregate subsets of data in a SQL code object you can use CTE or subqueries just make sure to filter your data as much as possible aside of the good indexes - use query plans to make sure there are no huge intermediate recored sets - eliminate all those "thick" lines.
If you need the opposite do not be afraid to use them - for instance in some large and longer running jobs I use tens even hundreds of millions of rows in temp tables for that matter but very "short" records 1 maybe 2 columns - Id and something else at the most - just make sure you optimize your tempdb.
ASKER
I'm confused. While indexes are likely the MOST import part of performance, isn't a crappy query the other part?
Also, I have no control over the indexes in our "real" tables. I have to work under the assumption that they are correct and optimized. The only indexes I have control over are the ones I can create on my temp tables (if I choose to go that route).
Also, I have no control over the indexes in our "real" tables. I have to work under the assumption that they are correct and optimized. The only indexes I have control over are the ones I can create on my temp tables (if I choose to go that route).
ASKER
lcohan, are CTE the same as #temp tables in terms of performance? I always thought that CTEs are weaker in performance primarily because you can't assign a clustered index on them, and no primary keys involved. I also try to avoid using subqueries as most of the time the subqueries I require tend to be "correlated" subqueries -- which are very slow from my understanding.
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
The CTE are not "weaker in performance primarily because you can't assign a clustered index on them" and you don't need PKeys on them either in my opinion.
Sub-queries are sometimes actually faster but it all depends how well your data is organized/indexed and code is written.
You can't just "poor butter over entire bread" and say ALL #tempdb tables or NONE #tempdb tables in my databases architecture. Aside that..compilation/recompil ation may be a factor using #temptables, ETC....lost of things to consider but...as Scott said the question is somewhat misguided because #temptables and "high traffic environments with millions of records" do not INCLUDE or EXCLUDE each other - good luck!
Sub-queries are sometimes actually faster but it all depends how well your data is organized/indexed and code is written.
You can't just "poor butter over entire bread" and say ALL #tempdb tables or NONE #tempdb tables in my databases architecture. Aside that..compilation/recompil
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Can you be a bit more clear on what the uses are/will be?
Are you needing a roll up of data for reporting/ad hoc querying? Is it too slow running against the live data?
Are you needing a roll up of data for reporting/ad hoc querying? Is it too slow running against the live data?
ASKER
Granted, having good indexes will improve performance no matter what. But I think that's besides the point and doesn't address my question (query architecture).