Solved

Can you think of a way not to use a cursor here?

Posted on 2011-03-25
12
279 Views
Last Modified: 2012-06-21
Here's a good logic challenge:

I have to come up with figures for room occupancy based on timetables data.

For most rooms this is easy. No two events can be timetabled at once so you just add up the all booking durations and subtract that from the available hours.

It gets tricky when it comes to the rooms where we do allow concurrent bookings. Unfortunately it's extra important these are included in the report because they are usually large, expensive teaching workshops where optimal usage is particularly relevant.

I am sure code for this has been written many times before.
As far as I can see what I am essentially trying to do is just find which blocks stretches of time have no activity in them. The most distilled procedural description of this is as follows:

For each day in each room in each week sort all bookings by start time.
Step through checking on each row how much time elapsed between the start time and the latest end time so far

It's that "so far" bit that has lost me the hope of doing this without a cursor. If bookings were of uniform length I could rely on the start time sort effectively sorting the end times and therefore I could always just compare to the previous row - something which can be done set-based.
The fact is that the the big room booking which reaches way out until the end of the day may also have been the very first booking of the day so comparing Booking Four starting at midday with the previous Booking Three which ended at 11:30 will count half an hour unused time which isn't the case.

The only alternative to this cursor is a very overcomplicated set of operations comparing All with All, to batch together clusters of bookings then identify the outer limits of these clusters. Finding the overlaps is easy but whittling down and grouping and categorising the types of overlaps, using different approaches for instance if 3 events are concurrent to when 3 events are in a cluster but one overlaps with the other 2 but those 2 don't overlap with each other.

However I might have missed a trick and be wrong about the necessary unwieldiness of the above. If there's some good catch-all set-based logic somebody has used for something similar your help would be greatly appreciated. I've not used cursors before and I don't want to slog through learning it just to find going through a 69000 row temp table (that's just for the rooms with concurrent booking) is just too stupidly slow even for an overnight procedure.

Many Thanks,

Adam
0
Comment
Question by:MISServices
  • 4
  • 4
  • 3
  • +1
12 Comments
 
LVL 4

Accepted Solution

by:
davehilditch earned 125 total points
Comment Utility
Can you provide some sample data and a sample table with the problem in it?

Here's what I'm thinking based on what you said so far:

Create a table called tbAvailableBookingHours which contains fifteen minute slots in it to cover the entire day - you could then join your rooms table to this table then left join to your booking table and then count the nulls.

0
 
LVL 4

Expert Comment

by:davehilditch
Comment Utility
Incidentally, cursors aren't all bad - I know people get religious about them, but for 69,000 rows if you use a fast_forward read_only cursor it should be fast enough to complete in a minute or two.  If the code ends up easier to use/read/maintain then it might be the better option.
0
 
LVL 75

Expert Comment

by:Anthony Perkins
Comment Utility
It really depends on your tables; without seeing your tables it is impossible to know for sure.  If they are well designed (by this I mean fully normalized) than a CURSOR should not be necessary, if not it may be your only hope.  
0
 
LVL 18

Assisted Solution

by:deighton
deighton earned 125 total points
Comment Utility
For a simple test table
CREATE TABLE [dbo].[Bookings](
	[StartTime] [datetime] NOT NULL,
	[EndTime] [datetime] NOT NULL
) ON [PRIMARY]

Open in new window



the following can count in minutes the total occupancy, counting any overlap as 1.  It calculates to the nearest minute
CTE - SQL SERVER 2005 and higher


DECLARE @PERIOD_START as DATETIME;
DECLARE @PERIOD_END as DATETIME;

SET @PERIOD_START = '01-01-2010';
SET @PERIOD_END = '01-01-2010 23:59';


WITH TIMERANGE AS 
	(
		SELECT @PERIOD_START AS Theminute 
		UNION ALL
		SELECT dateadd(minute,1,Theminute) FROM TIMERANGE WHERE Theminute<@PERIOD_END
	)
	SELECT COUNT(*) FROM TIMERANGE WHERE EXISTS(SELECT NULL FROM BOOKINGS B WHERE TIMERANGE.THEMINUTE >= B.StartTime AND TIMERANGE.THEMINUTE < B.EndTime)
	OPTION (MAXRECURSION 0);

Open in new window

0
 
LVL 18

Expert Comment

by:deighton
Comment Utility
so to explain my code above, the CTE creates 'TIMERANGE' which is effectively one record per minute of the range from start to end, then the query uses an 'exists' to find the 'minutes' where there was an occupancy (so multiple occupancy is only counted once, but zero occupancy is not counted).  Then the count of all the minutes (exclusive of the end, if you ended occupancy at 10:45 then 10:45 does not count) -  Is the duration in minutes where there was occupancy.

It worked in my test, you would tweak the exists clause to select occupancy for different rooms etc.  My table is very simple, having only occupancy start and end times - for illustration
0
 

Author Comment

by:MISServices
Comment Utility
Thanks guys,

acperkins, I'm probably missing what you're saying but I can't see how or why normalisation would lead us to a data structure that would package up and put neat labels on event clusters (or non-events) according to one particular definition (room timetable rather than teacher or student or course). Please do outline which structure would help with this if I've missed the point because we might be able to use it.

DaveHilditch, I kicked myself the second I saw your solution. Always the obvious stuff, eh?  That table would need about 750,000 rows but that's probably not too much to run quick.

Deighton, your solution appears beautiful and robust (and using SQL functionality I should have know more about) but that need for a tweak on the WHERE clause would be 22,000 tweaks for each room- day of the year. I've no doubt you could pull out the bag some way of doing that efficiently but 15 minute slots are enough resolution for me so I'll see how performant Dave's mere mortal approach is and get back to you.

Thanks!
0
Better Security Awareness With Threat Intelligence

See how one of the leading financial services organizations uses Recorded Future as part of a holistic threat intelligence program to promote security awareness and proactively and efficiently identify threats.

 

Author Comment

by:MISServices
Comment Utility
Not that I would dare call you a mere mortal Dave. That'd be me. Wouldn't want striking down
0
 
LVL 4

Expert Comment

by:davehilditch
Comment Utility
lol, how are you calculating your 750,000 row expectation?  Post your final bit of code back here and i'll recommend indexes if you like to make it fast.
0
 
LVL 18

Expert Comment

by:deighton
Comment Utility
if you had another field called 'RoomId', then the following groups by all rooms

DECLARE @PERIOD_START as DATETIME;
DECLARE @PERIOD_END as DATETIME;

SET @PERIOD_START = '01-01-2010';
SET @PERIOD_END = '01-10-2010 23:59';


WITH TIMERANGE AS 
	(
		SELECT @PERIOD_START AS Theminute 
		UNION ALL
		SELECT dateadd(minute,1,Theminute) FROM TIMERANGE WHERE Theminute<@PERIOD_END
	)
	SELECT ROOMID, 
                     (SELECT COUNT(*) FROM TIMERANGE
                          WHERE EXISTS(SELECT NULL FROM BOOKINGS B WHERE TIMERANGE.THEMINUTE >= B.StartTime AND TIMERANGE.THEMINUTE < B.EndTime AND B.RoomId = B1.Roomid)) as MinutesOccupancy 
    FROM BOOKINGS B1 group by B1.roomid
	OPTION (MAXRECURSION 0);

Open in new window

0
 
LVL 18

Expert Comment

by:deighton
Comment Utility
BTW - it could work in 15 minute slots, how is your data defined?  Did you use time, or have you given 15 minute slots their own id?  
0
 

Author Comment

by:MISServices
Comment Utility
Looks like Dave's approach runs in under 30 seconds.

I'll split the points because I took serious inspiration from deighton's code in creating the base timerange table using an int instead of datetime.

Thanks loads!
0
 

Author Comment

by:MISServices
Comment Utility
To answer those other points I didn't refresh to see a moment ago,

deighton, I have little doubt your approach modified to 15 min slots and tweaked to group for each room on each teaching day of the year would be just as quick, definitely more compact and now I think of it it's probably as good a time as any to get my head round how that WITH statement works rather than WHILE loop (yes I am a novice). My only concern is that the more advanced code I use the less likely it can be unpicked by a colleague later down the line. Maybe a poor excuse but I will try to get round to compacting it to your form.

The data isn't defined in 15 minute slots but the number of occasions in which something is not timetabled on the 15 minute mark is truly negligible.

Dave, the reason for 750,000 rows is that it's 32 15 minute slots (8 hour day) for each affected room for each teaching day of the year.


DECLARE  @15MinTimeRange table (StartMin int) 
    
DECLARE @StartMin int

SET @StartMin = 9 /*o'clock*/ * 60 /*minutes*/

WHILE @StartMin < 17 /*o'clock*/ * 60 /*minutes*/

BEGIN
	INSERT INTO @15MinTimeRange
	SELECT @StartMin
	SET @StartMin = @StartMin + 15
END


--------------
DROP TABLE #AvailSlots
SELECT DISTINCT FullYrWeekNumber, DoW, r_ID, StartMin

INTO #AvailSlots
FROM #SampleActs A
CROSS JOIN @15MinTimeRange Ran
---------------

SELECT
r_id,
(select r_reference from [UNITE-SQL\Ulive].Ulive.dbo.CAPD_room r with (nolock) where inn.r_id = r.r_id) RoomName, 
1 - ((CAST(SUM(NullCount) AS FLOAT) * 15) / (COUNT(*) * (8 /*hours*/ * 60 /*minutes*/))) Perc_Occupied

FROM

	(SELECT Av.FullYrWeekNumber, 
	Av.DoW, 
	Av.r_ID,
	Count(*) NullCount


	FROM #AvailSlots Av
	left join #SampleActs Act on

	av.FullYrWeekNumber = Act.fullYrWeekNumber
	and av.DoW = Act.DoW
	and av.r_ID = act.r_id
	and Act.Start <= Av.StartMin 
	and Act.[End] > Av.StartMin

	WHERE Act.R_id is null

	GROUP BY av.FullYrWeekNumber, av.DoW, av.r_ID) INN

GROUP BY R_ID

Open in new window



I've realised that code's not quite right in that it shows the occupancy of all days in rooms that at least have something activity. It should include all the blank space of the days without a single activity but that's no big deal for me to worry about and I don't want to delay replying any further.

Don't worry about indexes because even if it makes it 10 times faster that's 9 extra seconds our server really doesn't mind overnight (possibly a bad attitude but efficiency is a real luxury with our workload so I'm picking my battles).

This has been my first EE Q and considering how good you've been I might come back with another related question I'd thought I was just going to compromise on.

Ta!
0

Featured Post

How to run any project with ease

Manage projects of all sizes how you want. Great for personal to-do lists, project milestones, team priorities and launch plans.
- Combine task lists, docs, spreadsheets, and chat in one
- View and edit from mobile/offline
- Cut down on emails

Join & Write a Comment

Suggested Solutions

Title # Comments Views Activity
c# code 19 57
Ranking Based On Value 3 28
Classic ASP application Will support SQL 2014 5 30
Complex SQL 10 32
This article explains all about SQL Server Piecemeal Restore with examples in step by step manner.
Load balancing is the method of dividing the total amount of work performed by one computer between two or more computers. Its aim is to get more work done in the same amount of time, ensuring that all the users get served faster.
This videos aims to give the viewer a basic demonstration of how a user can query current session information by using the SYS_CONTEXT function
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

17 Experts available now in Live!

Get 1:1 Help Now