I currently have a VBA code loop that does this for me (so technically I have a working solution) but I'm hoping that there is an SQL solution that might be faster. I have about 1 million client records. They have a ServiceStartDate and a ServiceEndDate. The ServiceEndDate will be null if the service is still active. What I need is a recordset that contains 1 row for every day since around 1/1/09 when the company started showing the number of accounts active that day by UtilityID. Nice to have would also be the number of accounts added that day and accounts dropped that day. So input would be.
UtilityID, Acct #, ServiceStartDate, ServiceEndDate
1, abc, 1/1/9, 1/5/9
1, aaa, 1/3/9
1, aab, 1/3/9,
output would be
UtilityID, CalendarDate, Active, Added, Dropped
1, 1/1/9, 1,1,0
1, 1/2/9, 1, 0, 0
1, 1/3/9, 3, 2, 0
1, 1/4/9, 3, 0, 0
1, 1/5/9, 2,0, 1
1, 1/6/9, 2, 0, 0
....
So just taking UtilityID 1 for 3 accounts over 6 days, this is what I end up with. Every day, I have a count of the number of active accounts. Added and dropped if you can calculate it at the same time would be good but otherwise, it is pretty easy to calculate by summing the recordset by start and end dates so I can add that later. It is counting by day that is the problem.
I have tried this by joining to a table of days. There are about 3500 days in this time frame but a Cartesian Product of 1,000,000 x 3500 is huge and way too slow. I am hoping there is some slick T-SQL syntax that can generate these intermediate records. The Cartesian Product method works fine for smaller recordsets but fails for one of this size. The current code loop does the process one day at a time.