w00k
asked on
SQL Server running total
Hello experts,
I have a talbe called cases. Sample of this table looks like this:
caseID startDate endDate
143 2011-07-01 12:20:12.317 2011-07-06 09:29:11.750
222 2011-07-01 12:57:17.490 2011-07-03 10:58:04.523
323 2011-07-01 14:31:50.290 2011-07-02 14:35:10.063
491 2011-07-01 15:30:16.060 2011-07-07 15:10:09.483
505 2011-07-01 16:29:14.030 2011-07-01 17:05:08.231
I'd like to know the easiest way for me to calculate a running total by day from this table. I'm thinking I would first need to take the min(startDate) and max(endDate) to figure out my range of days, but aside from that I'm not sure what should be done next.
If someone can provide a udf to do this that would be great. If not, the tsql code would be ok too. For the sample data above, this is what I'd like the running total to look like:
date running_total
07/01/2011 5
07/02/2011 4
07/03/2011 3
07/04/2011 2
07/05/2011 2
07/06/2011 2
07/07/2011 1
07/08/2011 0
Please let me know if I haven't explained this well. I can provide more details if needed.
Thanks in advance for your help.
I have a talbe called cases. Sample of this table looks like this:
caseID startDate endDate
143 2011-07-01 12:20:12.317 2011-07-06 09:29:11.750
222 2011-07-01 12:57:17.490 2011-07-03 10:58:04.523
323 2011-07-01 14:31:50.290 2011-07-02 14:35:10.063
491 2011-07-01 15:30:16.060 2011-07-07 15:10:09.483
505 2011-07-01 16:29:14.030 2011-07-01 17:05:08.231
I'd like to know the easiest way for me to calculate a running total by day from this table. I'm thinking I would first need to take the min(startDate) and max(endDate) to figure out my range of days, but aside from that I'm not sure what should be done next.
If someone can provide a udf to do this that would be great. If not, the tsql code would be ok too. For the sample data above, this is what I'd like the running total to look like:
date running_total
07/01/2011 5
07/02/2011 4
07/03/2011 3
07/04/2011 2
07/05/2011 2
07/06/2011 2
07/07/2011 1
07/08/2011 0
Please let me know if I haven't explained this well. I can provide more details if needed.
Thanks in advance for your help.
What is the biggest gap in start and end dates possible? Just trying to determine the boundaries we may have for certain solutions that have limitations.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Another version, assuming you already have a "calendar table" of all the dates for which you want output in the query:
SELECT DATEADD(day, DATEDIFF(day, 0, startDate), 0) AS startDate,
DATEADD(day, DATEDIFF(day, 0, andDate), 0) AS endDate
INTO #cases
FROM cases
SELECT r.dt, COUNT(c.startDate) AS Cnt
FROM tblCalendar r LEFT JOIN
#cases c ON r.dt >= c.startDate AND r.dt <= c.endDate
GROUP BY r.dt
ORDER BY r.dt
DROP TABLE #cases
It appears Patrick was thinking the same thing. I was asking about range to see if recursion or use of built in numbers tables like master..spt_values may come in handy if you did not have your own calendar table already. Here is what the recursion would look like. It has its limits thought if your ranges can be large. For the simple case in the question it works like this:
;WITH cte(caseID, dt) AS
(
/* Get all the start dates; recursive anchor. */
SELECT caseID, CONVERT(DATETIME, DATEDIFF(day, 0, startDate))
FROM your_table
UNION ALL
/* Recursive query. */
SELECT c.caseID, DATEADD(day, 1, p.dt)
FROM your_table c
JOIN cte p ON p.caseID = c.caseID
WHERE c.endDate >= DATEADD(day, 1, p.dt)
)
SELECT dt as [date], COUNT(caseID) AS [running_total]
FROM cte
GROUP BY dt
;
ASKER
Thanks for the quick reply guys. I'll give both of these a try as soon as I get home...give me about an hour.
Thanks
Thanks
Cool. By the way, I think Patrick's already handles this properly, but the CTE approach I showed above does not add in the extra day 7/8 with 0.
You can either:
Or alter the CTE:
You can either:
;WITH cte(caseID, dt) AS
(
/* Get all the start dates; recursive anchor. */
SELECT caseID, CONVERT(DATETIME, DATEDIFF(day, 0, startDate))
FROM your_table
UNION ALL
/* Recursive query. */
SELECT c.caseID, DATEADD(day, 1, p.dt)
FROM your_table c
JOIN cte p ON p.caseID = c.caseID
WHERE c.endDate >= DATEADD(day, 1, p.dt)
)
SELECT dt as [date], COUNT(caseID) AS [running_total]
FROM cte
GROUP BY dt
/* Add in final zero row. */
UNION SELECT DATEADD(day, 1, MAX(dt)), 0 FROM cte
;
Or alter the CTE:
;WITH cte(caseID, dt, dt_end) AS
(
/* Get all the start dates; recursive anchor. */
SELECT caseID, CONVERT(DATETIME, DATEDIFF(day, 0, startDate)), endDate
FROM your_table
UNION ALL
/* Recursive query. */
SELECT c.caseID, DATEADD(day, 1, p.dt), c.endDate
FROM your_table c
JOIN cte p ON p.caseID = c.caseID
WHERE c.endDate >= p.dt
)
SELECT dt as [date]
, COUNT(CASE WHEN dt_end >= dt THEN caseID END) AS [running_total]
FROM cte
GROUP BY dt
;
ASKER
Thanks you for your help Matthewspatrick. Worked perfectly. I already had a calendar table so I was able to use your solution without any problems.
If i am understanding your display correctly, that does not appear to be a running total at all. It appears to be a simple COUNT() of the number of rows for a given day regardless of timestamp, is that correct? Or is it how many rows are still active over time, so on 7/1 all 5, then 7/2 only 4, and so on based on an explosion of rows based on range denoted by startDate and endDate?
Kevin