Temp table vs Database table for performance?

I have a procedure with many steps and lots of complex queries that I am trying to make run faster.

Calculations are made over 100 years of data. The math performed on any particular year's data is based, in part, on the results of calculations from the previous year (so it's sequential).

Since several different scenarios are run, the total number of rows of data is around 12,000.  These rows are summed and averaged into a few records that are used in a report.  The 12,000 records are no longer needed after the summary tables are populated.

So, my question is: would I gain a *significant* amount of improved performance if I used temporary tables for the 12,000 records of data?  Currently, I am using regular database tables.

I am thinking this may be helpful, but before doing all the work, I want to see what the experts say.  It is harder to debug with temp tables...  I also use WITH CTE... to sum sequentially; are there any restrictions on using that with temp tables?



gdemariaAsked:
 
cyberkiwiCommented:
> So you recommend #temporary tables?   No logging would be done with those?
There will be logging, but only in the tempdb.ldf file, not your regular database's log.  tempdb uses the simple recovery model, which is lighter than bulk-logged/full recovery, and tempdb is not involved in log shipping or replication.

This is how I read the question:
The report is crunching all data into about 12,000 records, which are further summarized into a few records for a report.
The temp storage is related to those 12,000 records.

Table variables can work, as long as you add a clustered index in the definition, e.g.

declare @t table (id int primary key clustered, col1 varchar(10))

Table variables are backed by physical disk when large amounts of data are involved, but other than the unique/clustered key, you cannot create indexes on them, which is a big minus if the table is reused multiple times and indexes would help.
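
By contrast, a #temp table for the same data might look something like this - a rough sketch with invented names - and it can take the extra nonclustered indexes a table variable cannot:

-- Hypothetical sketch: #temp equivalent of the @t example above
create table #t
(
    id   int not null primary key clustered,
    col1 varchar(10) null
);

-- extra indexes are allowed on #temp tables (not on table variables)
create nonclustered index IX_t_col1 on #t (col1);

-- #t lives in tempdb and is dropped automatically when the session
-- (or the stored procedure that created it) ends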
 
cyberkiwiCommented:
"temp" as in # or @ ?
 
HumpdyCommented:
You're better off using db tables,
but this sounds like a classic case where you need to use SQL partitioning.
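
For illustration only (the year boundaries, filegroup choice, and names below are made up), partitioning a table by year is set up roughly like this:

create partition function pfCalcYear (int)
as range right for values (1950, 2000, 2050);   -- hypothetical year boundaries

create partition scheme psCalcYear
as partition pfCalcYear all to ([PRIMARY]);      -- one filegroup, for simplicity

create table dbo.CalcResults
(
    CalcYear int   not null,
    Amount   money not null
) on psCalcYear (CalcYear);                      -- rows are spread across partitions by year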

Are you running Enterprise edition ?

If you can't do partitioning because you're on Standard edition, then I would look into archival, especially since the data is no longer used after the summary tables are populated.
 
cyberkiwiCommented:
For 12k rows, yes, it would be a significant gain to use #tmp tables with a clustered key and indexes.
Around 100 rows is the point at which to stop using @ in-memory table variables.

Only under rare circumstances should you use regular tables for such a reporting function - they pollute the log and get unnecessarily involved in log shipping, remote sync, etc.
 
HumpdyCommented:
CyberKiwi,
Are you saying that using #tmp is going to be quicker than using the existing db tables he already has?
 
gdemariaAuthor Commented:
> "temp" as in # or @ ?

Let me rephrase and ask: what do you think would be the best approach?

> Are you running Enterprise edition ?

No, standard

Could you briefly explain how/why SQL Partitioning or Archival would help in this situation?

> they pollute the log and get unnecessarily involved in log shipping, remote sync, etc.

I can see this.  It actually takes longer to delete the old records than any other single function

So you recommend #temporary tables?   No logging would be done with those?
 
BirdbusterCommented:
Hi there, if you know that the 12,000 rows are never going to grow beyond, let's say, 50,000 rows, then I would use a table variable (@tbl), as they are faster than temp tables (#tbl) when the row count is not too large.
Another approach might be to create a new database table and a SQL Server job to populate it with the calculated data that the report uses, so that you only have to run the select to return the rows and not re-do the calculations every time.
 
gdemariaAuthor Commented:
I should add one more thing to the mix.

Multiple users can run this report on different data at the same time, or as often as they like.

This is not a one-time run every twenty-four hours; it is an on-demand, while-you-wait complex report of information selected by each user, and it needs to run as quickly as possible and display the results.

Does that change anyone's recommended solution?
 
cyberkiwiCommented:
You could start looking at SSAS cubes and some daily aggregation of the data, with some recalculation of the active portion to merge into pre-aggregated data.  It would negate the requirement to keep scanning all available data for an entire year.

Having said that, as you currently have it, it is even less desirable to use normal in-db tables, since you have to manage multi-user access (you could use the spid as a key to partition that table).  The #/@ tables are session-specific and won't have multi-user issues, but the select queries themselves should probably run in a session set to

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

to minimize locks on the tables.

That's all peripheral; I still recommend using #temp tables and indexing them for fast access.
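
To make that concrete, here is a rough sketch of the per-session flow; every table, column, and parameter name below is invented for illustration:

set transaction isolation level read uncommitted;   -- minimize locks on the source tables

declare @ReportID int = 123;   -- stand-in for the procedure parameter identifying the user's run

create table #YearCalc
(
    CalcYear int   not null primary key clustered,
    Amount   money not null
);

insert into #YearCalc (CalcYear, Amount)
select s.CalcYear, sum(s.Amount)
from dbo.SourceData as s        -- hypothetical source table
where s.ReportID = @ReportID
group by s.CalcYear;

-- the year-by-year math and the final summary then run against #YearCalc,
-- which is private to this session and logged only in tempdb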
 
BirdbusterCommented:
If you are really just trying to performance-tune a sproc, then you should be using #temp tables and adding the appropriate nonclustered indexes to that table, and possibly to other tables used in the query.  Look at the joins you are using and create indexes based on those.  You should also include the actual execution plan, and use SQL Profiler to see the reads, writes, and execution time of the sproc.
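
For example (purely illustrative, with invented names), if the temp table is joined back to the scenario data on ScenarioID and CalcYear, an index that matches the join can help:

create table #ScenarioYear
(
    ScenarioID int   not null,
    CalcYear   int   not null,
    Amount     money not null
);

create nonclustered index IX_ScenarioYear_Join
    on #ScenarioYear (ScenarioID, CalcYear);   -- matches the join predicate used downstream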
 
gdemariaAuthor Commented:
> You could start looking at SSAS cubes and some daily aggregation of the data, with some recalculation of the active portion to merge into pre-aggregated data.  It would negate the requirement to keep scanning all available data for an entire year.

The characteristics of the data would not allow for this type of processing as the combinations are infinite.   It has to be done on-demand once the user enters a series of criteria and parameters.

> Having said that, as you currently have it, it is even less desirable to use normal in-db tables, since you have to manage multi-user access (you could use the spid as a key to partition that table).  The #/@ tables are session-specific and won't have multi-user issues, but the select queries themselves should probably run in a session set to

This sounds good.  I have a "Report_ID" which will partition the table by the user's session (the user has to be logged in).

> then you should be using #temp tables and adding the appropriate nonclustered indexes

Sounds like everyone agrees #temp tables.


Thanks very much!



 
gdemariaAuthor Commented:
Sorry, closing out some old questions
Question has a verified solution.
