Link to home
Start Free TrialLog in
Avatar of jondanger
jondanger

asked on

A complicated, dynamic grouping, intensive TSQL query, where performance is very important.

Hi Experts,

I need to write a dynamic query that'll take some input variables and return a recordset.

I'm finding this immensely difficult, the few successes i've had, have been ridiculously inefficient when run against 1,000,000 rows of data.

I've tried building customDateAdd and customDateDiff functions, but the nature of the query makes them just too slow.

I really need some fresh eyes, from the real experts!

I've embedded some comments in the example code that'll hopefully explain things further, if anyone needs any clarification on anything, please let me know!

Many thanks
Jon
/* i need a SP that takes the following parmeters */
declare @First_Interval DateTime
declare @Last_Interval DateTime
declare @Intervals_To_Fetch Int
declare @Mins_Per_Interval Int
declare @Offset_From_Now Int
declare @Day_Start_Time DateTime
declare @Day_End_Time DateTime
 
/* Set some test Vars */
select @First_Interval = '2008-10-01T14:15:00.000', @Last_Interval  = '2008-10-06T15:45:00.000', @Mins_Per_Interval = 15, @Day_Start_Time = '1900-01-01T14:00:00.000', @Day_End_Time = '1900-01-01T16:00:00.000'
 
 
/* Create the Test table; jtable */
 
IF  NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[jtable7]') AND type in (N'U'))
BEGIN
CREATE TABLE [dbo].[jtable7]([jtime] [datetime] NOT NULL,[b] [smallint] NULL) ON [PRIMARY]
END
DELETE FROM jtable7
INSERT INTO jtable7  (jtime, b) 
/*||||||||||||-------------------------|||||-----|||||||||||||||||
||||||||||||||       jtime             |||||  b  |||||||||||||||||
||||||||||||||_________________________|||||_____|||||||||||||||*/
SELECT/*||||*/'2008-10-01T14:15:00.000',/**/  6  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-01T14:45:00.000',/**/  2  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-01T14:46:59.000',/**/  4  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-01T14:52:55.000',/**/  1  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-01T14:55:59.000',/**/  3  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-03T14:00:05.000',/**/  5  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-03T14:01:05.000',/**/  7  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-03T15:37:08.000',/**/ -5  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-03T15:43:08.000',/**/ -4  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-06T14:09:05.000',/**/  0  /*||||*/UNION ALL
/*|||||||||||/ The next values are rogue-        /||||||||||||||||
|||||||||||||/ values for testing purposes...    /||||||||||||||||
|||||||||||||/ They should be IGNORED by the     /||||||||||||||||
|||||||||||||/ query because they are either     /||||||||||||||||
|||||||||||||/   1) < @Day_Start_Time            /||||||||||||||||
|||||||||||||/   2) > @Day_End_Time              /||||||||||||||||
|||||||||||||/   3) < @First_Interval            /||||||||||||||||
|||||||||||||/   4) > @Last_Interval             /||||||||||||||||
|||||||||||||/   5) Outside mon-fri              /||||||||||||||*/
SELECT/*||||*/'2008-10-01T14:01:03.000',/**/ 15  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-03T13:59:59.000',/**/ 99  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-04T14:01:05.000',/**/ 98  /*||||*/UNION ALL
SELECT/*||||*/'2008-10-06T16:09:05.000',/**/ 97  /*||||*/
 
 
/* jtable7 has been created and we have inserted some test data */
 
/* I need the following recordset returned when the input variables are:
 
@First_Interval   =  '2008-10-01T14:15:00.000'
@Last_Interval    =  '2008-10-06T15:45:00.000'
@Mins_Per_Interval=  15
@Day_Start_Time   =  '1900-01-01T14:00:00.000'
@Day_End_Time     =  '1900-01-01T16:00:00.000'
 
____________________________________________________
||Interval                 ||  O ||  H ||  L ||  C ||
---------------------------||----||----||----||----|| /__There's no 14:00 as 
||2008-10-01 14:15:00.000  ||  6 ||  6 ||  6 ||  6 || \  @First_Interval is 14:15,
||2008-10-01 14:30:00.000  ||NULL||NULL||NULL||NULL||    
||2008-10-01 14:45:00.000  ||  2 ||  4 ||  1 ||  3 ||  
||2008-10-01 15:00:00.000  ||NULL||NULL||NULL||NULL||  <-- Wendesday
||2008-10-01 15:15:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-01 15:30:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-01 15:45:00.000  ||NULL||NULL||NULL||NULL||  
 
||2008-10-02 14:00:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-02 14:15:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-02 14:30:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-02 14:45:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-02 15:00:00.000  ||NULL||NULL||NULL||NULL||  <-- Thurssday
||2008-10-02 15:15:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-02 15:30:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-02 15:45:00.000  ||NULL||NULL||NULL||NULL||  
 
||2008-10-03 14:00:00.000  ||  5 ||  7 ||  5 ||  7 ||  
||2008-10-03 14:15:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-03 14:30:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-03 14:45:00.000  ||NULL||NULL||NULL||NULL||  <-- Friday
||2008-10-03 15:00:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-03 15:15:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-03 15:30:00.000  || -5 || -5 || -4 || -4 ||  
||2008-10-03 15:45:00.000  ||NULL||NULL||NULL||NULL||
 
||2008-10-06 14:00:00.000  ||  0 ||  0 ||  0 ||  0 ||  
||2008-10-06 14:15:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-06 14:30:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-06 14:45:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-06 15:00:00.000  ||NULL||NULL||NULL||NULL||  <-- Monday
||2008-10-06 15:15:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-06 15:30:00.000  ||NULL||NULL||NULL||NULL||  
||2008-10-06 15:45:00.000  ||NULL||NULL||NULL||NULL*/  

Open in new window

Avatar of Leo Eikelman
Leo Eikelman

I see you are using Sql Server 2005.  You should use the SQL Server 2005 tuning advisor utility against your SQL statement(s).  It usually gives you good advice on what changes need to be done to speed up the query.

You can find a tutorial here:

http://www.developer.com/db/article.php/3607186


Thank you.
Avatar of jondanger

ASKER

Thanks leikelman, i only have 2k5 Express, but i'll be sure to check it out.

In any case, jtable7 is very simple and the index is obvious. I believe the problems lives in the code i've been trying to write. (which still doesn't perform how i'd hoped)


ASKER CERTIFIED SOLUTION
Avatar of Leo Eikelman
Leo Eikelman

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Avatar of Mark Wills
Mark Wills
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Have you tried something like :


declare @First_Interval DateTime
declare @Last_Interval DateTime
declare @Intervals_To_Fetch Int
declare @Mins_Per_Interval Int
declare @Offset_From_Now Int
declare @Day_Start_Time varchar(8)
declare @Day_End_Time varchar(8)
declare @End_Time DateTime
 
set @First_Interval   =  '2008-10-01T14:15:00.000'
set @Last_Interval    =  '2008-10-06T15:45:00.000'
set @Mins_Per_Interval=  15
set @Day_Start_Time   =  '14:15:00'
set @Day_End_Time     =  '15:45:00'
 
declare @dates table (start_date datetime, end_date datetime)
 
while @First_Interval <= @last_interval
begin
      set @end_time = dateadd(mi,@mins_per_interval,@first_interval)
      if convert(varchar(8),@first_interval,108) between @day_start_time and @day_end_time
      begin
         insert @dates values (@first_interval,@end_time)
         set @first_interval = @end_time
      end
      else
      begin
         set @first_interval = convert(datetime, (convert(varchar,dateadd(dd,1,@first_interval),106) + ' ' + @day_start_time),106)
      end
 
end
 
select start_date, (select top 1 b from jtable7 where jtime between start_date and end_date order by jtime asc) as O,max(b) as H, min(b) as L, (select top 1 b from jtable7 where jtime between start_date and end_date order by jtime desc) as C
from @dates 
inner join jtable7 on jtime between start_date and end_date 
group by start_date, end_date

Open in new window

Need to add some more qualifiers to that such as an outer join so we have all the date times, and a check for week day etc etc. but though it might server as a good discussion point...

Hi mark_wills, yes you're correct, o = open, h = highest, l = lowest, c = closing.

When you say build a "calendar" table, do you mean build it as a temporary table upon each query execution for joining?  Sounds like a good idea, i'll investigate further.

>>What kind of date ranges (ie start and end) are likely to be involved ?
Typically, 40 intervals will be queried for, however it could be more than this, and never more than 100.
The number of intervals returned, can be calculated from First_Interval and Last_Interval / @Mins_Per_Interval, this could span any number of minutes worth of b data.
Again, typical use is, 15 min intervals looking at 2 days worth of data.
another example is 720 min intervals looking at 80 days worth of data.

Have a look at the sample code above. it builds a memory table, then uses it.