gdemaria
asked on
Temp table vs Database table for performance?
I have a procedure with many steps and lots of complex queries that I am trying to make run faster.
Calculations are made over 100 years of data. The math performed on any particular year's data is based, in part, on the results of the previous year's calculations (so it is sequential).
Since several different scenarios are run, the total number of rows of data is around 12,000. These rows are summed and averaged into a few records that are used in a report. The 12,000 records are no longer used after the summary tables are populated.
So, my question is: would I gain a *significant* amount of performance if I used temporary tables for the 12,000 rows of data? Currently I am using regular database tables.
I am thinking this may help, but before doing all the work I want to see what the experts say. It is harder to debug with temp tables. I also use WITH CTE ... to sum sequentially (a rough sketch of the pattern is below); are there any restrictions on using that with temp tables?
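For illustration, here is a rough sketch of the sequential pattern I mean, with placeholder table and column names (#YearlyCalc is assumed to already hold one row per scenario per year, with a RunningValue column to fill in, and the 1.05 factor is made up):

;WITH calc AS
(
    -- anchor: the first year of each scenario (years assumed numbered 1..100)
    SELECT Scenario, Yr, BaseAmount,
           CAST(BaseAmount AS decimal(18,4)) AS RunningValue
    FROM   #YearlyCalc
    WHERE  Yr = 1
    UNION ALL
    -- recursive step: each year's value depends on the previous year's result
    SELECT y.Scenario, y.Yr, y.BaseAmount,
           CAST(c.RunningValue * 1.05 + y.BaseAmount AS decimal(18,4))
    FROM   calc c
    JOIN   #YearlyCalc y
      ON   y.Scenario = c.Scenario
     AND   y.Yr       = c.Yr + 1
)
UPDATE t
SET    t.RunningValue = c.RunningValue
FROM   #YearlyCalc t
JOIN   calc c
  ON   c.Scenario = t.Scenario
 AND   c.Yr       = t.Yr
OPTION (MAXRECURSION 200);   -- comfortably above the 100 years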
"temp" as in # or @ ?
For 12k rows, yes, it would be a significant gain to use #tmp tables with a clustered key and indexes (a minimal shape is sketched below).
Around 100 rows or so is the point at which to stop using @ (in-memory) table variables.
Only under rare circumstances should you use regular tables for such a reporting function - they pollute the log and get unnecessarily involved in log shipping, remote sync, etc.
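As a minimal sketch of that shape (column names and types here are placeholders):

-- the detail rows live in tempdb for the duration of the session only
CREATE TABLE #tmpResults
(
    Scenario int           NOT NULL,
    Yr       int           NOT NULL,
    Amount   decimal(18,4) NOT NULL,
    PRIMARY KEY CLUSTERED (Scenario, Yr)   -- left unnamed so concurrent sessions cannot collide on the name
);

-- a nonclustered index to support whatever the summary queries filter or join on
CREATE NONCLUSTERED INDEX IX_tmpResults_Yr ON #tmpResults (Yr) INCLUDE (Amount);

-- ... populate, run the year-by-year calculations, then sum/average into the report tables ...

DROP TABLE #tmpResults;   -- optional; it is cleaned up automatically when the session ends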
CyberKiwi,
Are you saying that using #tmp is going to be quicker than the regular db tables he already has?
ASKER
> "temp" as in # or @ ?
Let me rephrase and ask: what do you think would be the best approach?
> Are you running Enterprise edition ?
No, standard
Could you briefly explain how/why SQL Partitioning or Archival would help in this situation?
> they pollute the log and get unnecessarily involved in log shipping, remote sync, etc.
I can see this. It actually takes longer to delete the old records than any other single operation does.
So you recommend #temporary tables? No logging would be done with those?
Hi there. If you know that the 12,000 rows are never going to grow beyond, let's say, 50,000 rows, then I would use a table variable (@tbl), as they are faster than temp tables (#tbl) when the row count is not too large (a rough declaration is sketched below).
Another approach might be to create a new database table and a SQL Server job to populate it with the calculated data the report uses, so that you only have to run the select to return the rows rather than redoing the calculations every time.
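For reference, a table-variable version might be declared like this (names and types are placeholders; note that, prior to SQL Server 2014, a table variable can only be indexed through its PRIMARY KEY or UNIQUE constraints):

DECLARE @tbl TABLE
(
    Scenario int           NOT NULL,
    Yr       int           NOT NULL,
    Amount   decimal(18,4) NOT NULL,
    PRIMARY KEY (Scenario, Yr)   -- doubles as the only index available on the table variable
);

-- populate and query @tbl exactly as you would the #temp table;
-- the optimizer assumes very few rows here, which is why the advice changes with row count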
ASKER
I should add one more thing to the mix.
Multiple users can run this report on different data at the same time, or as often as they like.
This is not a one-time run every twenty-four hours; it is an on-demand, while-you-wait, complex report on information selected by each user, and it needs to run as quickly as possible and display the results.
Does that change anyone's recommended solution?
You could start looking at SSAS cubes and some daily aggregation of the data, with some recalculation of the active portion to merge into pre-aggregated data. It would negate the requirement to keep scanning all available data for an entire year.
Having said that, the way you currently have it makes it even less desirable to use normal in-db tables, since you have to manage multi-user access (you could use the spid as a key to partition that table - sketched below). The #/@ tables are session-specific and won't have multi-user issues, but the select queries themselves should probably run in a session set to
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
to minimize locks on the tables.
That's all periphery, I still recommend using #temp tables and indexing the tables for fast access.
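If a shared permanent work table were kept instead, a session-keyed sketch might look like this (dbo.ReportWork, dbo.SourceData and their columns are hypothetical); with #temp tables none of this is needed, because the table is already private to the session:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;   -- accept dirty reads to reduce blocking

DELETE FROM dbo.ReportWork WHERE SessionId = @@SPID;   -- clear this session's previous rows

INSERT INTO dbo.ReportWork (SessionId, Scenario, Yr, Amount)
SELECT @@SPID, Scenario, Yr, Amount
FROM   dbo.SourceData;

-- ... run the calculations and summaries filtered on SessionId = @@SPID ...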
If you are really just trying to performance-tune a sproc, then you should be using #temp tables and adding the appropriate nonclustered indexes to that table, and possibly to other tables used in the query. Look at the joins you are using and create indexes based on those. You should also include the actual execution plan, and use SQL Profiler to see the reads, writes, and execution time of the sproc.
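One lightweight way to measure each change, alongside the execution plan (the procedure name and parameter below are placeholders):

SET STATISTICS IO ON;     -- logical/physical reads per table
SET STATISTICS TIME ON;   -- CPU and elapsed time

EXEC dbo.usp_RunReport @ReportID = 123;   -- hypothetical procedure and parameter

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;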
ASKER
> You could start looking at SSAS cubes and some daily aggregation of the data, with some recalculation of the active portion to merge into pre-aggregated data. It would negate the requirement to keep scanning all available data for an entire year.
The characteristics of the data would not allow for this type of processing as the combinations are infinite. It has to be done on-demand once the user enters a series of criteria and parameters.
> Having said that, the way you currently have it makes it even less desirable to use normal in-db tables, since you have to manage multi-user access (you could use the spid as a key to partition that table). The #/@ tables are session-specific and won't have multi-user issues, but the select queries themselves should probably run in a session set to
This sounds good. I have a "Report_ID" that will partition the table by the user's session (the user has to be logged in).
> then you should be using #temp tables and adding the appropriate nonclustered indexes
Sounds like everyone agrees #temp tables.
Thanks very much!
ASKER
Sorry, closing out some old questions