Link to home
Start Free TrialLog in
Avatar of sdom100
sdom100

asked on

Can anyone optimise this SQL please

Hi,
I am a newbie to coding T-SQL for SQL Server 2005  and have written the following code. However although it runs <1 second on a 200 record set it takes >11minutes (and counting) to run on a 100,000 set. My final set will be >11million  so i really need to optimise it.
Can you help please?

/////////////
explanation of code
I have a number of incidents per device. I need to run a sliding window of 1 hr over all incdients per device. I want to get back a result set of the 4th FK_AddressID 's involved in a set of 4 incidents in any given hour. I thus use 4 columns to track the previous 4 values so can do a difference between the first and last per record per device to see if <= 1hr
///////////////

SELECT IDOfEAC2_Fault
      ,FK_AddressID
      ,StartTime
      ,ClosedTime
      ,EventATypeID2=2
FROM
(
SELECT DISTINCT FK_FaultHASCID AS IDOfEAC2_Fault

FROM
(
SELECT
      t4.StartTime AS StartTime4,
      t3.StartTime AS StartTime3,
      t2.StartTime AS StartTime2,
      t1.StartTime AS StartTime1,
      t1.FK_AddressID,
      t1.FK_FaultHASCID,
      DATEDIFF(ss,t4.StartTime, t1.StartTime) AS span,
      (CASE WHEN (DATEDIFF(ss,t4.StartTime, t1.StartTime) <=3600) THEN 1 ELSE 0 END) AS Type2_Outage
FROM Processing_FaultDurations t4
JOIN Processing_FaultDurations t3
on t4.StartTime<t3.StartTime
JOIN Processing_FaultDurations t2
on t3.StartTime< t2.StartTime
JOIN Processing_FaultDurations t1
on t2.StartTime < t1.StartTime
AND t1.Duration<300
AND t2.Duration<300
AND t3.Duration<300
AND t4.Duration<300
AND t1.FK_AddressID = t2.FK_AddressID
AND t2.FK_AddressID = t3.FK_AddressID
AND t3.FK_AddressID = t4.FK_AddressID
) t1
WHERE Type2_Outage =1
) retrieval
join Processing_FaultDurations
on IDOfEAC2_Fault = Processing_FaultDurations.FK_FaultHASCID
SOLUTION
Avatar of Crag
Crag
Flag of United Kingdom of Great Britain and Northern Ireland image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of sdom100
sdom100

ASKER

Thanks I'm trying it now.
Does the code itself look like a good approach ?
SOLUTION
Avatar of Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3]
Flag of Luxembourg image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of sdom100

ASKER

Thankyou to you both for your solutions.
The key difference between your solutions as i see it is that angelll has done
WHERE t4.Duration<300
whereas answer_me has put this criteria in the first join

Which is the best approach - does it matter?
ps. running answer_me's solution still is taking forever (3 mins and counting) on an 8000 row set in Processing_FaultDurations is this what you would expect ?

Machine = new 2x quad core, 4GB RAM, 76GB Raid drive for db

Thanks,
Shaun
how about u try the one below...
the one below helps mssql to create a temporary index that speeds things up.
i suggest you also do what Crag suggested to speed it even more..

if you want more details on what it does u can ask

SELECT IDOfEAC2_Fault
      ,FK_AddressID
      ,StartTime
      ,ClosedTime
      ,EventATypeID2=2
FROM
(
SELECT DISTINCT t4.FK_FaultHASCID AS IDOfEAC2_Fault
FROM Processing_FaultDurations t1 JOIN Processing_FaultDurations t4 on t1.FK_AddressID=t4.FK_AddressID and DATEDIFF(ss,t4.StartTime, t1.StartTime) <=3600 and t1.duration<300 and t4.duration<300
  and (select count(FK_AddressID) from Processing_FaultDurations t where t.StartTime>t1.StartTime and t.StartTime<t4.StartTime and t.duration<300 and t.FK_AddressID = t1.FK_AddressID)=2
) retrieval
join Processing_FaultDurations
on IDOfEAC2_Fault = Processing_FaultDurations.FK_FaultHASCID
Avatar of sdom100

ASKER

kemanetzis, thanks for your solution.
I can see how it is woriking but it returns (when run on a small test set) far more records than it should do.
As i've got some more fundamental sql problems to solve (see new post) i'm going to focus on these for the rest of the day and sort out optimisations tomorrow.

Thanks to all for your help, i'll dish up the points tomorrow to allow anyone else to pop in wtih comments.
S.
Thanks,
 
try also this
You will notice the correction of <=3600 to  >=-3600  so you get the 1 hour you want since i was getting more results than i should with the old check.

SELECT IDOfEAC2_Fault
      ,FK_AddressID
      ,StartTime
      ,ClosedTime
      ,EventATypeID2=2
FROM
(
SELECT DISTINCT t1.FK_FaultHASCID AS IDOfEAC2_Fault, t1.starttime as tt --, t4.starttime as tt4, (select count(FK_AddressID) from Processing_FaultDurations t where t.StartTime>t1.StartTime and t.StartTime<t4.StartTime and t.duration<300 and t.FK_AddressID = t1.FK_AddressID) as cc, DATEDIFF(ss,t4.StartTime, t1.StartTime) as dd
FROM
Processing_FaultDurations t1 JOIN Processing_FaultDurations t4
on t1.FK_AddressID=t4.FK_AddressID
and t4.StartTime>t1.StartTime
and (DATEDIFF(ss,t4.StartTime, t1.StartTime) >=-3600)
and t1.duration<300
and t4.duration<300
and (select count(FK_AddressID) from Processing_FaultDurations t where t.StartTime>t1.StartTime and t.StartTime<t4.StartTime and t.duration<300 and t.FK_AddressID = t1.FK_AddressID)=2
) retrieval
join Processing_FaultDurations
on IDOfEAC2_Fault = Processing_FaultDurations.FK_FaultHASCID and tt = Processing_FaultDurations.starttime
Avatar of sdom100

ASKER

Thanks,
That now returns what I would expect and I can see how it could be more optimised than some of the other suggestions.

I have a follow on question (which i'm happy to post as a new question) - just say if you would prefer.
I need to (as another output) retrieve a list of all the FK_FaultHASCID 's that are involved in one of these result sets.
For example, your solutinos returns 2 rows
IDOfEAC2_Fault column with id of 7 where 7 is the fourth fault in a series of 4,5,6,7
and
IDOfEAC2_Fault column with id of 20 where 20 is the fourth fault in a series of 17,18,19,20
I would like to get back a result set of 4,5,6,7,17,18,19,20
- how would i do this ? (happy to post as new question - just say so)
ASKER CERTIFIED SOLUTION
Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial