[Okta Webinar] Learn how to a build a cloud-first strategyRegister Now

x
?
Solved

Can anyone optimise this SQL please

Posted on 2007-10-17
10
Medium Priority
?
150 Views
Last Modified: 2010-03-20
Hi,
I am a newbie to coding T-SQL for SQL Server 2005  and have written the following code. However although it runs <1 second on a 200 record set it takes >11minutes (and counting) to run on a 100,000 set. My final set will be >11million  so i really need to optimise it.
Can you help please?

/////////////
explanation of code
I have a number of incidents per device. I need to run a sliding window of 1 hr over all incdients per device. I want to get back a result set of the 4th FK_AddressID 's involved in a set of 4 incidents in any given hour. I thus use 4 columns to track the previous 4 values so can do a difference between the first and last per record per device to see if <= 1hr
///////////////

SELECT IDOfEAC2_Fault
      ,FK_AddressID
      ,StartTime
      ,ClosedTime
      ,EventATypeID2=2
FROM
(
SELECT DISTINCT FK_FaultHASCID AS IDOfEAC2_Fault

FROM
(
SELECT
      t4.StartTime AS StartTime4,
      t3.StartTime AS StartTime3,
      t2.StartTime AS StartTime2,
      t1.StartTime AS StartTime1,
      t1.FK_AddressID,
      t1.FK_FaultHASCID,
      DATEDIFF(ss,t4.StartTime, t1.StartTime) AS span,
      (CASE WHEN (DATEDIFF(ss,t4.StartTime, t1.StartTime) <=3600) THEN 1 ELSE 0 END) AS Type2_Outage
FROM Processing_FaultDurations t4
JOIN Processing_FaultDurations t3
on t4.StartTime<t3.StartTime
JOIN Processing_FaultDurations t2
on t3.StartTime< t2.StartTime
JOIN Processing_FaultDurations t1
on t2.StartTime < t1.StartTime
AND t1.Duration<300
AND t2.Duration<300
AND t3.Duration<300
AND t4.Duration<300
AND t1.FK_AddressID = t2.FK_AddressID
AND t2.FK_AddressID = t3.FK_AddressID
AND t3.FK_AddressID = t4.FK_AddressID
) t1
WHERE Type2_Outage =1
) retrieval
join Processing_FaultDurations
on IDOfEAC2_Fault = Processing_FaultDurations.FK_FaultHASCID
0
Comment
Question by:sdom100
10 Comments
 
LVL 5

Assisted Solution

by:Crag
Crag earned 200 total points
ID: 20092134
Have you processed this statement using the DB Engine Tuning Advisor.
It may suggest some indexes which improve the performance.
0
 

Author Comment

by:sdom100
ID: 20092250
Thanks I'm trying it now.
Does the code itself look like a good approach ?
0
 
LVL 143

Assisted Solution

by:Guy Hengel [angelIII / a3]
Guy Hengel [angelIII / a3] earned 400 total points
ID: 20092342
not really, because:

JOIN Processing_FaultDurations t3
on t4.StartTime<t3.StartTime
JOIN Processing_FaultDurations t2
on t3.StartTime< t2.StartTime
JOIN Processing_FaultDurations t1
on t2.StartTime < t1.StartTime

is selfjoining with a cartesian product. I assume you want t4 to be the earliest, and t3...t1 to be the ones after that?
you need to modify somehow, starting with this:

FROM Processing_FaultDurations t4
JOIN Processing_FaultDurations t3
 on t3.FK_AddressID = t4.FK_AddressID
 and t4.StartTime < t3.StartTime
AND t3.Duration<300
JOIN Processing_FaultDurations t2
on  t2.FK_AddressID = t3.FK_AddressID
and t3.StartTime< t2.StartTime
AND t2.Duration<300
JOIN Processing_FaultDurations t1
on t1.FK_AddressID = t2.FK_AddressID
and t2.StartTime < t1.StartTime
AND t1.Duration<300
WHERE t4.Duration<300

0
NFR key for Veeam Backup for Microsoft Office 365

Veeam is happy to provide a free NFR license (for 1 year, up to 10 users). This license allows for the non‑production use of Veeam Backup for Microsoft Office 365 in your home lab without any feature limitations.

 
LVL 10

Assisted Solution

by:answer_me
answer_me earned 400 total points
ID: 20092354
Try this:
SELECT IDOfEAC2_Fault      ,FK_AddressID      ,StartTime      ,ClosedTime      ,EventATypeID2=2
FROM
(
      SELECT DISTINCT FK_FaultHASCID AS IDOfEAC2_Fault
      FROM
      (
            SELECT        t4.StartTime AS StartTime4,      
                        t3.StartTime AS StartTime3,      
                        t2.StartTime AS StartTime2,      
                        t1.StartTime AS StartTime1,      
                        t1.FK_AddressID,
                        t1.FK_FaultHASCID,      
                        DATEDIFF(ss,t4.StartTime, t1.StartTime) AS span,      
                        (CASE WHEN (DATEDIFF(ss,t4.StartTime, t1.StartTime) <=3600) THEN 1 ELSE 0 END) AS Type2_Outage
            FROM Processing_FaultDurations t4
            JOIN Processing_FaultDurations t3 on t3.FK_AddressID = t4.FK_AddressID AND t3.Duration<300
            AND t4.Duration<300 And t4.StartTime<t3.StartTime
            JOIN Processing_FaultDurations t2 on t2.FK_AddressID = t3.FK_AddressID And t2.Duration<300 And t3.StartTime< t2.StartTime
            JOIN Processing_FaultDurations t1 on t1.FK_AddressID = t2.FK_AddressID AND t1.Duration<300 And t2.StartTime < t1.StartTime            
      ) t1
      WHERE Type2_Outage =1
) retrieval
join Processing_FaultDurations
on IDOfEAC2_Fault = Processing_FaultDurations.FK_FaultHASCID
0
 

Author Comment

by:sdom100
ID: 20092508
Thankyou to you both for your solutions.
The key difference between your solutions as i see it is that angelll has done
WHERE t4.Duration<300
whereas answer_me has put this criteria in the first join

Which is the best approach - does it matter?
ps. running answer_me's solution still is taking forever (3 mins and counting) on an 8000 row set in Processing_FaultDurations is this what you would expect ?

Machine = new 2x quad core, 4GB RAM, 76GB Raid drive for db

Thanks,
Shaun
0
 
LVL 5

Expert Comment

by:kemanetzis
ID: 20092578
how about u try the one below...
the one below helps mssql to create a temporary index that speeds things up.
i suggest you also do what Crag suggested to speed it even more..

if you want more details on what it does u can ask

SELECT IDOfEAC2_Fault
      ,FK_AddressID
      ,StartTime
      ,ClosedTime
      ,EventATypeID2=2
FROM
(
SELECT DISTINCT t4.FK_FaultHASCID AS IDOfEAC2_Fault
FROM Processing_FaultDurations t1 JOIN Processing_FaultDurations t4 on t1.FK_AddressID=t4.FK_AddressID and DATEDIFF(ss,t4.StartTime, t1.StartTime) <=3600 and t1.duration<300 and t4.duration<300
  and (select count(FK_AddressID) from Processing_FaultDurations t where t.StartTime>t1.StartTime and t.StartTime<t4.StartTime and t.duration<300 and t.FK_AddressID = t1.FK_AddressID)=2
) retrieval
join Processing_FaultDurations
on IDOfEAC2_Fault = Processing_FaultDurations.FK_FaultHASCID
0
 

Author Comment

by:sdom100
ID: 20093231
kemanetzis, thanks for your solution.
I can see how it is woriking but it returns (when run on a small test set) far more records than it should do.
As i've got some more fundamental sql problems to solve (see new post) i'm going to focus on these for the rest of the day and sort out optimisations tomorrow.

Thanks to all for your help, i'll dish up the points tomorrow to allow anyone else to pop in wtih comments.
S.
0
 
LVL 5

Expert Comment

by:kemanetzis
ID: 20094078
Thanks,
 
try also this
You will notice the correction of <=3600 to  >=-3600  so you get the 1 hour you want since i was getting more results than i should with the old check.

SELECT IDOfEAC2_Fault
      ,FK_AddressID
      ,StartTime
      ,ClosedTime
      ,EventATypeID2=2
FROM
(
SELECT DISTINCT t1.FK_FaultHASCID AS IDOfEAC2_Fault, t1.starttime as tt --, t4.starttime as tt4, (select count(FK_AddressID) from Processing_FaultDurations t where t.StartTime>t1.StartTime and t.StartTime<t4.StartTime and t.duration<300 and t.FK_AddressID = t1.FK_AddressID) as cc, DATEDIFF(ss,t4.StartTime, t1.StartTime) as dd
FROM
Processing_FaultDurations t1 JOIN Processing_FaultDurations t4
on t1.FK_AddressID=t4.FK_AddressID
and t4.StartTime>t1.StartTime
and (DATEDIFF(ss,t4.StartTime, t1.StartTime) >=-3600)
and t1.duration<300
and t4.duration<300
and (select count(FK_AddressID) from Processing_FaultDurations t where t.StartTime>t1.StartTime and t.StartTime<t4.StartTime and t.duration<300 and t.FK_AddressID = t1.FK_AddressID)=2
) retrieval
join Processing_FaultDurations
on IDOfEAC2_Fault = Processing_FaultDurations.FK_FaultHASCID and tt = Processing_FaultDurations.starttime
0
 

Author Comment

by:sdom100
ID: 20094215
Thanks,
That now returns what I would expect and I can see how it could be more optimised than some of the other suggestions.

I have a follow on question (which i'm happy to post as a new question) - just say if you would prefer.
I need to (as another output) retrieve a list of all the FK_FaultHASCID 's that are involved in one of these result sets.
For example, your solutinos returns 2 rows
IDOfEAC2_Fault column with id of 7 where 7 is the fourth fault in a series of 4,5,6,7
and
IDOfEAC2_Fault column with id of 20 where 20 is the fourth fault in a series of 17,18,19,20
I would like to get back a result set of 4,5,6,7,17,18,19,20
- how would i do this ? (happy to post as new question - just say so)
0
 
LVL 5

Accepted Solution

by:
kemanetzis earned 1000 total points
ID: 20099181
the one below does what you ask in your last comment.
No need to make a second question.

SELECT DISTINCT PF.FK_FaultHASCID as IDOfEAC2_Fault
      ,PF.FK_AddressID
      ,StartTime
      ,ClosedTime
      ,EventATypeID2=2
FROM
(
SELECT DISTINCT t1.FK_AddressID, t1.starttime as tt1, t4.starttime as tt4
FROM Processing_FaultDurations t1 JOIN Processing_FaultDurations t4
ON t1.FK_AddressID=t4.FK_AddressID
AND t4.StartTime>t1.StartTime
AND (DATEDIFF(ss,t4.StartTime, t1.StartTime) >=-3600)
AND t1.duration<300
AND t4.duration<300
AND (SELECT count(FK_AddressID) FROM Processing_FaultDurations t WHERE t.StartTime>t1.StartTime and t.StartTime<t4.StartTime and t.duration<300 and t.FK_AddressID = t1.FK_AddressID)=2
) R
JOIN Processing_FaultDurations PF
ON R.FK_AddressID=PF.FK_AddressID AND PF.starttime>=R.tt1 AND PF.starttime<=R.tt4 AND PF.Duration<300

--ORDER BY PF.FK_AddressID, starttime
0

Featured Post

Free Backup Tool for VMware and Hyper-V

Restore full virtual machine or individual guest files from 19 common file systems directly from the backup file. Schedule VM backups with PowerShell scripts. Set desired time, lean back and let the script to notify you via email upon completion.  

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

One of the most important things in an application is the query performance. This article intends to give you good tips to improve the performance of your queries.
This shares a stored procedure to retrieve permissions for a given user on the current database or across all databases on a server.
Whether it be Exchange Server Crash Issues, Dirty Shutdown Errors or Failed to mount error, Stellar Phoenix Mailbox Exchange Recovery has always got your back. With the help of its easy to understand user interface and 3 simple steps recovery proced…
With just a little bit of  SQL and VBA, many doors open to cool things like synchronize a list box to display data relevant to other information on a form.  If you have never written code or looked at an SQL statement before, no problem! ...  give i…

872 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question