dannygonzalez09
asked on
Roll up data using the time difference between contiguous rows
Hi
I am trying to find the time difference between contiguous rows and if its with in a certain interval roll them up as 1 outage
In the below example RowId time diff between Endtime of RowId -135 and StartTime of RowId -136 is greater than 10 mins and same between RowId - 136 & 137.. so they'll be treated as separate outages but time difference between the contiguous rows from 137 upto 157 is with 10 mins, in this case roll it up ...the result should be as below
There are may other columns involved... i'm hoping that i'll be able to get the rest of the issue resolved if i can roll these up together ...
So far, I've tried to use CTE with RowNum function to join the consecutive rows and get the time diff but it isn't working out..
For Ex:
RowId Id StartTime EndTime
135 8526 9/1/2013 17:35:07 9/1/2013 17:38:36
136 8526 9/1/2013 18:47:18 9/1/2013 18:49:07
137 8526 9/2/2013 14:08:12 9/2/2013 14:08:15
138 8526 9/2/2013 14:10:33 9/2/2013 14:10:44
139 8526 9/2/2013 14:12:25 9/2/2013 14:12:34
140 8526 9/2/2013 14:12:38 9/2/2013 14:13:00
141 8526 9/2/2013 14:13:11 9/2/2013 14:13:12
142 8526 9/2/2013 14:13:13 9/2/2013 14:13:15
143 8526 9/2/2013 14:13:31 9/2/2013 14:14:37
144 8526 9/2/2013 14:14:44 9/2/2013 14:15:14
145 8526 9/2/2013 14:15:18 9/2/2013 14:15:30
146 8526 9/2/2013 14:15:36 9/2/2013 14:15:43
147 8526 9/2/2013 14:15:48 9/2/2013 14:16:02
148 8526 9/2/2013 14:16:06 9/2/2013 14:16:15
149 8526 9/2/2013 14:18:53 9/2/2013 14:18:55
150 8526 9/2/2013 14:19:04 9/2/2013 14:19:07
151 8526 9/2/2013 14:19:15 9/2/2013 14:19:20
152 8526 9/2/2013 14:19:24 9/2/2013 14:19:25
153 8526 9/2/2013 14:19:36 9/2/2013 14:19:49
154 8526 9/2/2013 14:19:54 9/2/2013 14:19:58
155 8526 9/2/2013 14:20:02 9/2/2013 14:20:21
156 8526 9/2/2013 14:20:33 9/2/2013 14:20:42
157 8526 9/2/2013 14:27:13 9/2/2013 14:27:22
158 8526 9/2/2013 16:34:10 9/2/2013 16:35:23
Output:
135 8526 9/1/2013 17:35:07 9/1/2013 17:38:36
136 8526 9/1/2013 18:47:18 9/1/2013 18:49:07
137 8526 9/2/2013 14:08:12 9/2/2013 14:27:22
158 8526 9/2/2013 16:34:10 9/2/2013 16:35:23
I am trying to find the time difference between contiguous rows and if its with in a certain interval roll them up as 1 outage
In the below example RowId time diff between Endtime of RowId -135 and StartTime of RowId -136 is greater than 10 mins and same between RowId - 136 & 137.. so they'll be treated as separate outages but time difference between the contiguous rows from 137 upto 157 is with 10 mins, in this case roll it up ...the result should be as below
There are may other columns involved... i'm hoping that i'll be able to get the rest of the issue resolved if i can roll these up together ...
So far, I've tried to use CTE with RowNum function to join the consecutive rows and get the time diff but it isn't working out..
For Ex:
RowId Id StartTime EndTime
135 8526 9/1/2013 17:35:07 9/1/2013 17:38:36
136 8526 9/1/2013 18:47:18 9/1/2013 18:49:07
137 8526 9/2/2013 14:08:12 9/2/2013 14:08:15
138 8526 9/2/2013 14:10:33 9/2/2013 14:10:44
139 8526 9/2/2013 14:12:25 9/2/2013 14:12:34
140 8526 9/2/2013 14:12:38 9/2/2013 14:13:00
141 8526 9/2/2013 14:13:11 9/2/2013 14:13:12
142 8526 9/2/2013 14:13:13 9/2/2013 14:13:15
143 8526 9/2/2013 14:13:31 9/2/2013 14:14:37
144 8526 9/2/2013 14:14:44 9/2/2013 14:15:14
145 8526 9/2/2013 14:15:18 9/2/2013 14:15:30
146 8526 9/2/2013 14:15:36 9/2/2013 14:15:43
147 8526 9/2/2013 14:15:48 9/2/2013 14:16:02
148 8526 9/2/2013 14:16:06 9/2/2013 14:16:15
149 8526 9/2/2013 14:18:53 9/2/2013 14:18:55
150 8526 9/2/2013 14:19:04 9/2/2013 14:19:07
151 8526 9/2/2013 14:19:15 9/2/2013 14:19:20
152 8526 9/2/2013 14:19:24 9/2/2013 14:19:25
153 8526 9/2/2013 14:19:36 9/2/2013 14:19:49
154 8526 9/2/2013 14:19:54 9/2/2013 14:19:58
155 8526 9/2/2013 14:20:02 9/2/2013 14:20:21
156 8526 9/2/2013 14:20:33 9/2/2013 14:20:42
157 8526 9/2/2013 14:27:13 9/2/2013 14:27:22
158 8526 9/2/2013 16:34:10 9/2/2013 16:35:23
Output:
135 8526 9/1/2013 17:35:07 9/1/2013 17:38:36
136 8526 9/1/2013 18:47:18 9/1/2013 18:49:07
137 8526 9/2/2013 14:08:12 9/2/2013 14:27:22
158 8526 9/2/2013 16:34:10 9/2/2013 16:35:23
ASKER
I'm not sure if you misunderstood my question.... I want to get all the values but combine only those where the time difference between end time and start time of the contiguous row is less than 10 mins
Define 'combine'. If I'm reading this correctly, since we're not assigning new ID values based on whether time_variance > 10 or not, the solution should still work.
I think you're going to have to use a cursor to accomplish this if the cte method isn't working. Just cycle through the records ordered appropriately. Compare the current record with the previous (save the pertinent values to variables) and either discard or insert the current record in a temp table.
ASKER
well, i was doing the same with CTE but wasn't able to get the correct results... I'm sure i am missing something in the logic... Can you help me with the code
Thanks
Thanks
Can you post some schema information or the cte code you were trying?
ASKER
Here's the code i have so far.... the output is incorrect as its not getting the correct values for the 1st and the last time intervals .. the code gets the Min and Max RowId from which i'm to get the time stamps in the next step
WITH CTE AS
(
SELECT *,(Row_Number() OVER (PARTITION BY ModelKey ORDER BY StartTime)) as RowNum
--, CASE WHEN TotalSeconds > 600 THEN 0 ELSE 1 END as SameOutage
FROM SampleData
WHERE ModelKey = 8526
)
,CTERowNum AS
(
SELECT X.*, ABS(DATEDIFF(SS,EndTime,NxtStartTime)) as Diff
, CASE WHEN ABS(DATEDIFF(SS,EndTime,NxtStartTime)) < 600 THEN 1 ELSE 0 END as SameOutage
FROM
(
SELECT C.RowId,C.ModelKey, C.StartTime, C.EndTime,C.RowNum ,C2.StartTime as NxtStartTime, C2.EndTime as NxtEndTime,C2.RowNum as NxtRowNum
FROM CTE C
LEFT JOIN CTE C2
ON C.ModelKey = C2.ModelKey AND C.RowNum = C2.RowNum-1 and C.EndTime < C2.StartTime
)X
)
SELECT MIN(RowId) as Id,ModelKey, MIN(StartTime) as StartTime, Min(TotalSec),
CASE WHEN MIN(TotalSec) > 600 THEN MAX(EndTime) ELSE MAX(NxtEndTime) END as NxtEnd, MIN(MinRow) as MinRow, MAX(MaxRow) as MaxRow
FROM
(
SELECT RowId,ModelKey,StartTime,EndTime,NxtStartTime,NxtEndTime, ABS(DATEDIFF(SS,NxtStartTime,EndTime)) as TotalSec
, (select TOP 1 RowId FROM CTERowNum C2 WHERE C2.ModelKey = C.ModelKey and isnull(C2.SameOutage,0)<>1 and C2.RowNum < C.RowNum Order by RowId desc) as MinRow
, (select TOP 1 RowId FROM CTERowNum C2 WHERE C2.ModelKey = C.ModelKey and isnull(C2.SameOutage,0)<>1 and C2.RowNum > C.RowNum Order by RowId asc)-1 as MaxRow
FROM CTERowNum C
)X
GROUP BY ModelKey,MinRow,MaxRow
Order by StartTime
Sample.xlsx
I'm working on a cursor solution but it's going to take me a while to get it written and tested.
dannygonzalez09 - Let me know if you intend to read my article or not, as it uses CTE's and looks similar to what you have posted above.
ASKER
Thanks BriCrowe
@Jimhorn - Yes, i looked at your article as soon as you posted it.... it basically does the same thing that i was working on upto an extent, but i couldn't make mine work yet and i am stuck..
Sorry, but it din't help me much
@Jimhorn - Yes, i looked at your article as soon as you posted it.... it basically does the same thing that i was working on upto an extent, but i couldn't make mine work yet and i am stuck..
Sorry, but it din't help me much
Okay try this...change the tablename "myEvent" to your tablename
Let me know if you have any problems. I tested it against a sample set with a second ID value.
Let me know if you have any problems. I tested it against a sample set with a second ID value.
SET NOCOUNT ON;
DECLARE @ID0 INT,
@StartTime0 DATETIME,
@EndTime0 DATETIME,
@IDPrev INT = NULL,
@StartTimePrev DATETIME = NULL,
@EndTimePrev DATETIME = NULL,
@IDCurr INT = NULL,
@StartTimeCurr DATETIME = NULL,
@EndTimeCurr DATETIME = NULL,
@Interval INT, --Interval Threshold in seconds
@RowCount INT
DECLARE @CondensedEvents TABLE
(
ID INT,
StartTime DATETIME,
EndTime DATETIME
--Include any other columns here
)
SELECT @Interval = 600 --10 minutes
DECLARE crsEvent CURSOR FOR
SELECT ID, StartTime, EndTime
FROM myEvent
ORDER BY ID, EndTime
OPEN crsEvent
FETCH NEXT FROM crsEvent
INTO @ID0, @StartTime0, @EndTime0
WHILE @@FETCH_STATUS = 0
BEGIN
IF @IDCurr IS NULL
BEGIN
PRINT 'No Current Record'
FETCH NEXT FROM crsEvent
INTO @IDCurr, @StartTimeCurr, @EndTimeCurr
END
PRINT '0 - ID[' + ISNULL(CAST(@ID0 AS VARCHAR), '') + '] StartTime[' + ISNULL(CONVERT(VARCHAR, @StartTime0, 120), '') + '] EndTime[' + ISNULL(CONVERT(VARCHAR, @EndTime0, 120), '') + ']'
PRINT 'P - ID[' + ISNULL(CAST(@IDPrev AS VARCHAR), '') + '] StartTime[' + ISNULL(CONVERT(VARCHAR, @StartTimePrev, 120), '') + '] EndTime[' + ISNULL(CONVERT(VARCHAR, @EndTimePrev, 120), '') + ']'
PRINT 'C - ID[' + ISNULL(CAST(@IDCurr AS VARCHAR), '') + '] StartTime[' + ISNULL(CONVERT(VARCHAR, @StartTimeCurr, 120), '') + '] EndTime[' + ISNULL(CONVERT(VARCHAR, @EndTimeCurr, 120), '') + ']'
IF @ID0 = @IDCurr
BEGIN
--PRINT 'Same ID Value: ' + CAST(@IDCurr AS VARCHAR)
IF DATEDIFF(SECOND, @EndTime0, @EndTimeCurr) > @Interval
BEGIN
PRINT 'Interval Exceeded[' + CAST(DATEDIFF(SECOND, @EndTime0, @EndTimeCurr) AS VARCHAR) + ']'
--Record the condensed record
PRINT 'Inserting record matching: ID[' + CAST(ISNULL(@IDPrev, @IDCurr) AS VARCHAR) +
'] StartTime[' + CONVERT(VARCHAR, ISNULL(@StartTimePrev, @StartTimeCurr), 120) +
'] EndTime[' + CONVERT(VARCHAR, ISNULL(@EndTimePrev, @EndTimeCurr), 120) + ']'
INSERT INTO @CondensedEvents
(
ID,
StartTime,
EndTime
)
SELECT ID, StartTime, EndTime --Other columns
FROM myEvent
WHERE ID = ISNULL(@IDPrev, @IDCurr)
AND StartTime = ISNULL(@StartTimePrev, @StartTimeCurr)
AND EndTime = ISNULL(@EndTimePrev, @EndTimeCurr)
SET @RowCount = @@ROWCOUNT
IF @RowCount = 0 --This shouldn't happen but I had a bug I had to track down
BEGIN
PRINT 'NO MATCHING RECORD FOUND'
END
SELECT @ID0 = @IDCurr,
@StartTime0 = @StartTimeCurr,
@EndTime0 = @EndTimeCurr
END
ELSE --Still within range of interval
BEGIN
PRINT 'Interval NOT Exceeded'
SELECT @IDPrev = @IDCurr,
@StartTimePrev = @StartTimeCurr,
@EndTimePrev = @EndTimeCurr
END
END
ELSE --Reset everything for the new ID value
BEGIN
SELECT @ID0 = @IDCurr,
@StartTime0 = @StartTimeCurr,
@EndTime0 = @EndTimeCurr,
@IDPrev = NULL, @StartTimePrev = NULL, @EndTimePrev = NULL,
@IDCurr = NULL, @StartTimeCurr = NULL, @EndTimeCurr = NULL
END
--Get the next record
FETCH NEXT FROM crsEvent
INTO @IDCurr, @StartTimeCurr, @EndTimeCurr
PRINT ''
END
CLOSE crsEvent
DEALLOCATE crsEvent
SELECT * FROM myEvent WHERE ID = 8526
SELECT * FROM @CondensedEvents
ASKER
Thanks BriCrowe
I ran the script but looks like its not getting what i need (probably because i messed up something)
I've found another useful link and seems to almost get the result set i need... i still need to change it to calculate the time difference using the Start and Endtime (of the next row) instead of using start times
http://sqlmag.com/t-sql/grouping-web-site-hits-sessions
I ran the script but looks like its not getting what i need (probably because i messed up something)
I've found another useful link and seems to almost get the result set i need... i still need to change it to calculate the time difference using the Start and Endtime (of the next row) instead of using start times
http://sqlmag.com/t-sql/grouping-web-site-hits-sessions
I see what I was doing wrong...I'll make the adjustments tomorrow.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thanks BriCrowe
Perfect, exactly what i needed...
I was also able to modify the script (from the link i posted yesterday) to get all the output i needed, currently testing its performance which so far has been good
I'll compare the performance on both and use the best, will post a comment soon if i need more help on your SQL
Thanks again
Perfect, exactly what i needed...
I was also able to modify the script (from the link i posted yesterday) to get all the output i needed, currently testing its performance which so far has been good
I'll compare the performance on both and use the best, will post a comment soon if i need more help on your SQL
Thanks again
SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Thank you BriCrowe and Paul
No problem, thanks for the assist. Cheers, Paul.
ASKER
BriCrowe, I've been using your script which worked like a charm but new data is messing up the output...can you please help
I've opened another question here
https://www.experts-exchange.com/questions/28280456/Roll-up-data-using-the-time-difference-between-contiguous-rows-2.html
Thanks
I've opened another question here
https://www.experts-exchange.com/questions/28280456/Roll-up-data-using-the-time-difference-between-contiguous-rows-2.html
Thanks
Feel free to copy this code into your SSMS, run to verify it works, then change the DATEDIFF line so that it does minutes instead of days. Then, edit the final SELECT so that it shows only rows where the time_variance > 10.
Enjoy.