Link to home
Start Free TrialLog in
Avatar of NickUA
NickUA

asked on

Temp Tables, Derived Tables, Table Joins, Inline Tables - Performance Question

I have a large table named tbl_df_weekly with about 100 million rows in it...  of which there is a field called "vid" that is an integer representing a vendor id or particular subset of data... so each one of our customers has a VID that identifies that row of data as associated iwth them - so 99% of our queries have where vid = x and etc...  We also use a ton of functions to generate custom tables and lists that we then query against... we're beginning to run into some performance issues, and we've started to rewrite some of our queries...  a lot of our functions return inline tables, and some are multi-line table functions...  whenever we join against an inline table, everything is very fast, if we have a function that returns a table, but is derived from using a multi-line function, the join is painstakingly slow at times...  Also, we've started trying #temp tables at the top of our queries, but our major hold up there is that we can't use temp tables in functions?!?!  Below i've pasted 3 queries, all of which do the same thing, but have different execution plans, and different run times...  Can someone explain the one that *should* run the fastest and why?  Our database isn't big enough for any huge time differences...  I'm also considering horizontal partitioning the large table based on the VID column, would this speed up these queries?  Also, what type of index should i put on the table?  Right now i have a Clustered Index on VID, WMWeek...  however most of our queries also use (in the select statement) fields: pq, ps, and sometimes oh, oo, it  ....   all of which are integer  fields relating to sales data...  

QUERY 1
----------------------------------------------------

declare @stores table
(
      sid int
)
insert into @stores
select sid from func_GenerateStoreList(0,0)

declare @inums table
(
      inum int
)
insert into @inums
select inum from func_GetItemList(2,0,0)

declare @wmweeks table
(
      wmweek int
)
insert into @wmweeks
select wmweek from func_GetWmWeekList(0,200603,0,0)

select top 10
      round(dbo.func_DivByZero(df.ps, df.pq),2) as avg$,
      sum(df.pq) as pq,
      sum(df.ps) as ps,
      count(df.pq) as storecount,
      RANK() over (order by sum(df.pq) desc) as pqrank,
      RANK() over (order by sum(df.ps) desc) as psrank
from
      tbl_df_weekly df,
      @stores s,
      @inums i,
      @wmweeks w
where
      df.vid = 2 and df.wmweek = w.wmweek and df.sid = s.sid  and df.inum = i.inum
group by
      round(dbo.func_DivByZero(df.ps, df.pq),2)
order by
      avg$ desc


----------------------
QUERY 2
---------------------

select top 10
      round(dbo.func_DivByZero(df.ps, df.pq),2) as avg$,
      sum(df.pq) as pq,
      sum(df.ps) as ps,
      count(df.pq) as storecount
from
      tbl_df_weekly df,
      func_GenerateStoreList(0,0) s,
      func_GetItemList(2,0,0) i,
      func_GetWmWeekList(0,200603,0,0) w
where
      df.vid = 2 and df.wmweek = w.wmweek and df.sid = s.sid  and df.inum = i.inum
group by
      round(dbo.func_DivByZero(df.ps, df.pq),2)
order by
      avg$ desc

---------------
QUERY 3
--------------------

select sid into #stores from (select * from func_GenerateStoreList(0,0)) as tbl
select inum into #inums from (Select * from func_GetItemList(2,0,0)) as tbl
select wmweek into #wmweeks from (Select * from func_GetWmWeekList(0,200603,0,0)) as tbl


select top 10
      round(dbo.func_DivByZero(df.ps, df.pq),2) as avg$,
      sum(df.pq) as pq,
      sum(df.ps) as ps,
      count(df.pq) as storecount
from
      tbl_df_weekly df,
      #stores s,
      #inums i,
      #wmweeks w
where
      df.vid = 2 and df.wmweek = w.wmweek and df.sid = s.sid  and df.inum = i.inum
group by
      round(dbo.func_DivByZero(df.ps, df.pq),2)
order by
      avg$ desc



--------------------

of the above 3 queries, query 3 appears to be running the fastest...  Query 2 appears to be the slowest, and Query 1 is all over the board, sometimes fast, sometimes slow...  The slow part appears to be on the JOIN...  tbl_df_weekly in this case returns about 500,000 rows...   func_GetStorelist about 3000 rows, func_getitemlist about 300 rows, func_getwmweeklist about 2 rows ...  

Again, i can't use #temp tables in UDFs, however i can use @temp tables, what is the theoretical perfomance difference/advantage between the # and @ type tables?  aren't they both stored in memory therefor should be pretty fast?  Also, why would the 2nd query that treats the UDFs as basically inline tables take the longest by far???  Shouldn't sql query it once and then do the join on it?  or is it doing it for every row?

SQL Server 2005.

Thanks,
Nick
ASKER CERTIFIED SOLUTION
Avatar of Aneesh
Aneesh
Flag of Canada image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
one correction
> You can't create indexes on table vars

You can't create non clustured indexes on table variables
Avatar of NickUA
NickUA

ASKER

how do i create an index on a table variable on the fly?
You can create a CI, by specifying the desired column as Primary Key

declare @tab Table (i int Primary Key, j int )