Solved

Elegant solution for a difficult SQL query

Posted on 2014-07-18
23
238 Views
Last Modified: 2014-07-19
Our software company has roughly 1500 customers.  Our Customers table contains purchase_date, which is the date the customer originally purchased our product.  I want to return a result set of rows for each year and month from the original purchase_date until now, containing the total number of customers as at the end of the month.  What makes it difficult is that in some months there were no sales, however I still want to have a row returned for those months (containing the same customer_count as the prior month).  Can anyone suggest an approach to solving this problem in a single SQL query?  Thanks, Pete
0
Comment
Question by:petekipe
  • 7
  • 5
  • 4
  • +4
23 Comments
 
LVL 65

Expert Comment

by:Jim Horn
ID: 40204330
>I want to return a result set of rows for each year and month from the original purchase_date until now, containing the total number of customers as at the end of the month.
SELECT YEAR(purchase_date), MONTH(purchase_date), SUM(amount), COUNT(customer_id)
FROM customers
GROUP BY YEAR(purchase_date), MONTH(purchase_date)
ORDER BY YEAR(purchase_date), MONTH(purchase_date)

Open in new window

>What makes it difficult is that in some months there were no sales, however I still want to have a row returned for those months
In that case you're going to have to have a table that holds every year-month combination in the range you want, then add this line between FROM and GROUP BY
SELECT YEAR(c.purchase_date), MONTH(c.purchase_date), SUM(c.amount), COUNT(c.customer_id)
FROM months m
     LEFT JOIN customers c ON m.year = YEAR(c.purchase_date) AND m.month = MONTH(c.purchase_date) 
GROUP BY YEAR(c.purchase_date), MONTH(c.purchase_date)
ORDER BY YEAR(c.purchase_date), MONTH(c.purchase_date)

Open in new window

 
>(containing the same customer_count as the prior month)
Not certain how to pull that off, but the requirement sounds wrong.  It is highly unlikely that this can be done in a single query, unless you have SQL 2012 and can build a CASE block that uses the LAG keyword to grab the previous value.  Something like <total air code>
... CASE WHEN SUM(c.amount = 0 then LAG(COUNT(c.customer_id)) ELSE COUNT(c.customer_id) END ...

Open in new window

0
 
LVL 26

Expert Comment

by:Shaun Kline
ID: 40204363
If you are using SQL Server 2005+, you can use a CTE to generate your "months" table. Adjust the date in the e4 line to your purposes.
;WITH e1(n) AS
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 10
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
e3(n) AS (SELECT 1 FROM e2 CROSS JOIN e2 AS c), -- 100*100
e4(n) AS (SELECT DATEADD(MONTH, (ROW_NUMBER() OVER (ORDER BY n)) -1, '1/1/2010') FROM e3)
SELECT YEAR(e4.n), MONTH(e4.n)
FROM e4
where e4.n < GETDATE()

Open in new window


One caveat about Jim's last line of code: LAG is only available in 2012+. (I had to look it up as I'd never seen it.)
0
 
LVL 21

Expert Comment

by:Randy Poole
ID: 40204368
Why would you want it to contain a count of the previous month?  Since this does not have a running total is should show zero or is there a reason?
0
 

Author Comment

by:petekipe
ID: 40204395
Thanks Jim, for your quick response.  You've shown me essentially what I was doing before, although your code is a lot simpler than mine.  But I'm still faced with the problem of creating rows in months where no purchase activity exists.  I may have to stick with my old approach, which was to reprocess the result set in VB code, outputting an array.  I've always thought that approach was a little messy, but if it's the only way I can get what I need, then that's the answer.  The ultimate purpose of this need is to feed a Chart object that shows sales growth over time, and can then be used to project future sales using a "least squares" method.  Thanks again!
0
 
LVL 21

Expert Comment

by:Randy Poole
ID: 40204408
You can probably combine their solutions as:
;WITH e1(n) AS
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 10
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
e3(n) AS (SELECT 1 FROM e2 CROSS JOIN e2 AS c), -- 100*100
e4(n) AS (SELECT DATEADD(MONTH, (ROW_NUMBER() OVER (ORDER BY n)) -1, '1/1/2010') FROM e3)
SELECT YEAR(e4.n), MONTH(e4.n)
FROM e4
where e4.n < GETDATE()
SELECT YEAR(e4.n), MONTH(e4.n), SUM(c.amount), COUNT(c.customer_id)
FROM e4 
     LEFT JOIN customers c ON YEAR(e4.n) = YEAR(c.purchase_date) AND MONTH(e4.n) = MONTH(c.purchase_date) 
GROUP BY YEAR(c.purchase_date), MONTH(c.purchase_date)
ORDER BY YEAR(c.purchase_date), MONTH(c.purchase_date)

Open in new window

0
 
LVL 65

Expert Comment

by:Jim Horn
ID: 40204416
>But I'm still faced with the problem of creating rows in months where no purchase activity exists.
The only (OWN-LEE) way to pull this off in T-SQL would be to create a table with all valid months, then LEFT JOIN from there so that all rows in that table are returned.  Otherwise, SQL Server does not have 'include every month in this query, whether there is data returned or not' functionality.

If it helps, although it's kind of a tangent, I have an article on How to build you own SQL Calendar table that does this at the day level.  A table at the month level would obviously be a lot easier.
0
 

Author Comment

by:petekipe
ID: 40204423
Randy, I tried your last suggestion, however I'm getting "e4.n could not be bound" error on all references in the final SELECT statement.  The following is an updated version of your code, with actual table and field names:

;WITH e1(n) AS
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 10
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
e3(n) AS (SELECT 1 FROM e2 CROSS JOIN e2 AS c), -- 100*100
e4(n) AS (SELECT DATEADD(MONTH, (ROW_NUMBER() OVER (ORDER BY n)) -1, '1/1/2005') FROM e3)
SELECT YEAR(e4.n), MONTH(e4.n)
FROM e4
WHERE e4.n < GETDATE()
SELECT YEAR(e4.n), MONTH(e4.n), COUNT(c.pk_station)
FROM e4
     LEFT JOIN Station c ON YEAR(e4.n) = YEAR(c.purchased_date) AND MONTH(e4.n) = MONTH(c.purchased_date)
GROUP BY YEAR(c.purchased_date), MONTH(c.purchased_date)
ORDER BY YEAR(c.purchased_date), MONTH(c.purchased_date)

Can you tell me how to fix the error?
0
 
LVL 35

Assisted Solution

by:Robert Schutt
Robert Schutt earned 125 total points
ID: 40204425
I see 2 other options:
1) using a recursive CTE;
2) using an existing table to create the extra rows for 'empty' months (the well known "spt_values trick"). The problem with these might be that they are not all that elegant or maybe even efficient, but with 1500 records this is not a big worry I feel.

option 1:
;WITH CTE AS (
	SELECT YEAR(MIN(purchase_date)) * 100 + MONTH(MIN(purchase_date)) AS Period
	FROM Customers
	UNION ALL
	SELECT CASE WHEN Period % 100 = 12 THEN Period + 89 ELSE Period + 1 END
	FROM CTE
	WHERE Period < YEAR(GETDATE()) * 100 + MONTH(GETDATE())
)
SELECT Period, COUNT(purchase_date)
FROM CTE
LEFT JOIN Customers ON YEAR(purchase_date) * 100 + MONTH(purchase_date) <= Period
GROUP BY Period
ORDER BY Period
OPTION (MAXRECURSION 1200) -- 100 years...

Open in new window

option 2:
SELECT CONVERT(varchar(7), DATEADD(m, v.number, DATEADD(D, -DATEPART(D, (SELECT MIN(purchase_date) FROM Customers)), CAST((SELECT MIN(purchase_date) FROM Customers) AS DATE))), 126) AS [Year + Month], COUNT(purchase_date) AS [Cumulative Count]
FROM master..spt_values v 
LEFT JOIN Customers on purchase_date < DATEADD(m, v.number, DATEADD(D, 1-DATEPART(D, (SELECT MIN(purchase_date) FROM Customers)), CAST((SELECT MIN(purchase_date) FROM Customers) AS DATE)))
WHERE v.type='p' AND v.number > 0 AND v.number <= DATEDIFF(m, (SELECT MIN(purchase_date) FROM Customers), GETDATE()) + 1
GROUP BY v.number

Open in new window

0
 
LVL 21

Expert Comment

by:Randy Poole
ID: 40204458
remove this line:
SELECT YEAR(e4.n), MONTH(e4.n)
FROM e4
WHERE e4.n < GETDATE()

;WITH e1(n) AS
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 10
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
e3(n) AS (SELECT 1 FROM e2 CROSS JOIN e2 AS c), -- 100*100
e4(n) AS (SELECT DATEADD(MONTH, (ROW_NUMBER() OVER (ORDER BY n)) -1, '1/1/2005') FROM e3)
SELECT YEAR(e4.n), MONTH(e4.n), COUNT(c.pk_station)
FROM e4 
     LEFT JOIN Station c ON YEAR(e4.n) = YEAR(c.purchased_date) AND MONTH(e4.n) = MONTH(c.purchased_date)  where e4.n<GetDate() 
GROUP BY YEAR(c.purchased_date), MONTH(c.purchased_date)
ORDER BY YEAR(c.purchased_date), MONTH(c.purchased_date)

Open in new window

0
 

Author Comment

by:petekipe
ID: 40204493
Robert, option 2 solves the problem.  I made changes to actual table and field names, and added one additional test to the last WHERE clause.  Here's my final code:

SELECT CONVERT(varchar(7), DATEADD(m, v.number, DATEADD(D, -DATEPART(D, (SELECT MIN(purchased_date) FROM Station)), CAST((SELECT MIN(purchased_date) FROM Station) AS DATE))), 126) AS [Year + Month], COUNT(purchased_date) AS [Cumulative Count]
FROM master..spt_values v 
LEFT JOIN Station on purchased_date < DATEADD(m, v.number, DATEADD(D, 1-DATEPART(D, (SELECT MIN(purchased_date) FROM Station)), CAST((SELECT MIN(purchased_date) FROM Station) AS DATE)))
WHERE purchased = 'True' AND v.type='p' AND v.number > 0 AND v.number <= DATEDIFF(m, (SELECT MIN(purchased_date) FROM Station), GETDATE()) + 1
GROUP BY v.number

Open in new window

One question:  What should I change to return the year and month as separate columns?
0
 
LVL 35

Expert Comment

by:Robert Schutt
ID: 40204515
How about:
SELECT YEAR(DATEADD(m, v.number, DATEADD(D, -DATEPART(D, (SELECT MIN(purchased_date) FROM Station)), CAST((SELECT MIN(purchased_date) FROM Station) AS DATE)))) AS [Year],
	MONTH(DATEADD(m, v.number, DATEADD(D, -DATEPART(D, (SELECT MIN(purchased_date) FROM Station)), CAST((SELECT MIN(purchased_date) FROM Station) AS DATE)))) AS [Month],
	COUNT(purchased_date) AS [Cumulative Count]
FROM master..spt_values v 
LEFT JOIN Station on purchased_date < DATEADD(m, v.number, DATEADD(D, 1-DATEPART(D, (SELECT MIN(purchased_date) FROM Station)), CAST((SELECT MIN(purchased_date) FROM Station) AS DATE)))
WHERE purchased = 'True' AND v.type='p' AND v.number > 0 AND v.number <= DATEDIFF(m, (SELECT MIN(purchased_date) FROM Station), GETDATE()) + 1
GROUP BY v.number

Open in new window

0
Complete VMware vSphere® ESX(i) & Hyper-V Backup

Capture your entire system, including the host, with patented disk imaging integrated with VMware VADP / Microsoft VSS and RCT. RTOs is as low as 15 seconds with Acronis Active Restore™. You can enjoy unlimited P2V/V2V migrations from any source (even from a different hypervisor)

 
LVL 48

Expert Comment

by:PortletPaul
ID: 40204530
You can't create month records that don't exist just by using your sales data alone. It needs help, and a left join.

If you are doing this type of analysis regularly then creating a calendar table does produce the simplest result overall (see Jim's article as a sample for doing this). Just make the calendar table for sufficient years so you don't need to worry about it until you retire.

If however you are only doing this infrequently, and you are probably only doing this for one (or just a few) years, then generating a small range of months really isn't hard or costly using a recursive CTE such as option 2 by Robert Schutt. It doesn't have to be for 100 years.

here's another sample
declare @BeginsAt as datetime = '2012-01-12'
declare @numMonths as int = 36

set @BeginsAt  = dateadd(month,datediff(month,0,@BeginsAt),0) -- force to 1st of Month

;with MonthRange (id, StartAt, StopAt)
as (
    select 1 as id, @BeginsAt, dateadd(month,1,@BeginsAt)
    union all
    select (id + 1) , dateadd(month,1,StartAt) , dateadd(month,1,StopAt)
    from MonthRange
    where (id + 1) <= @numMonths
    )
select
      MonthRange.StartAt
      ...
from MonthRange
left join YourData as d on ( d.a_date_field >= StartAt and d.a_date_field < StopAt )
group by
      MonthRange.StartAt

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40204566
just adjust the first 2 parameters to suit
declare @BeginsAt as datetime = '20120101' -- use YYYMMDD if you can
declare @numMonths as int = 36

set @BeginsAt  = dateadd(month,datediff(month,0,@BeginsAt),0) -- force to 1st of Month

;with MonthRange (id, StartAt, StopAt)
as (
    select 1 as id, @BeginsAt, dateadd(month,1,@BeginsAt)
    union all
    select (id + 1) , dateadd(month,1,StartAt) , dateadd(month,1,StopAt)
    from MonthRange
    where (id + 1) <= @numMonths
    )
select
        YEAR(MonthRange.StartAt) as Yr
      , MONTH(MonthRange.StartAt) as Mnth
      , COUNT(purchased_date) AS [Cumulative Count]
from MonthRange
LEFT JOIN Station as d on ( d.purchased_date >= StartAt and d.purchased_date < StopAt )
group by
        YEAR(MonthRange.StartAt) as Yr
      , MONTH(MonthRange.StartAt) as Mnth
;

Open in new window

0
 
LVL 69

Assisted Solution

by:ScottPletcher
ScottPletcher earned 125 total points
ID: 40204921
Please try code below.  Adjust the starting and ending months as needed.

Comments on the methods used:
1) The CROSS JOIN method is vastly more efficient at generating rows vs. a recursive query (as conclusively demonstrated by Itzik Ben-Gan).
2) Using a system table(s), which you (a) don't control, (b) can't guarantee will be present or worse, complete, in the future, and (c) which the user might not even have access to is an awful coding practice.
3) A pre-built tally table clustered on the tally value is OK, but it's still much more overhead than an in-line CROSS JOIN.


DECLARE @starting_month datetime
DECLARE @ending_month datetime

SELECT @starting_month = DATEADD(MONTH, DATEDIFF(MONTH, 0, MIN(purchase_date)), 0)
FROM dbo.customers
SET @ending_month = DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()) + 1, 0)

;WITH
cteTally10 AS (
    SELECT 0 AS tally UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
    SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
cteTally100 AS (
    SELECT 1 AS tally
    FROM cteTally10 c1
    CROSS JOIN cteTally10 c2
),
cteTally10K AS (
    SELECT ROW_NUMBER() OVER(ORDER BY c1.tally) - 1 AS tally
    FROM cteTally100 c1
    CROSS JOIN cteTally100 c2
)
SELECT
    DATEADD(MONTH, t.tally, @starting_month) AS Month,
    COUNT(*) AS Customer_Count
FROM cteTally10K t
INNER JOIN dbo.customers c ON
    c.purchase_date < DATEADD(MONTH, t.tally + 1, @starting_month)
WHERE
    t.tally BETWEEN 0 AND DATEDIFF(MONTH, @starting_month, @ending_month)
GROUP BY
    DATEADD(MONTH, t.tally, @starting_month)
ORDER BY
    Month
0
 

Author Comment

by:petekipe
ID: 40205248
To all,

Many thanks for all the thoughtful suggestions and comments.  This has turned out to be a very interesting thread, even to a less-than-proficient SQL coder like me.

I tried all of the examples that I could get to work in my environment (SQL Server 2014 Express), and so far, option 2 by Robert Schutt has been the easiest for me to implement.  I realize that there may be potential drawbacks from his approach, but my app will run exclusively in my company's environment, against a single database, and it's working fine.  Response time is extremely good, given that I have only a small number of rows involved.

The only additional thing I would wish for would be to have the result set begin at month 1 (January) of the first purchased_date year.  I have the result set filling a DataTable, which is bound to a Chart control on a form.  I'd like to be able to set the X-axis grid to 12, separating the chart into years, with each year starting in January. Can anyone show me how to modify the code below to accomplish that?

SELECT CONVERT(varchar(7), DATEADD(m, v.number, DATEADD(D, -DATEPART(D, (SELECT MIN(purchased_date)
FROM Station)), CAST((SELECT MIN(purchased_date) FROM Station) AS DATE))), 126) AS [Year + Month], COUNT(purchased_date) AS [Cumulative Count]
FROM master..spt_values v
LEFT JOIN Station on purchased_date < DATEADD(m, v.number, DATEADD(D, 1-DATEPART(D, (SELECT MIN(purchased_date) FROM Station)), CAST((SELECT MIN(purchased_date) FROM Station) AS DATE)))
WHERE purchased = 'True' AND v.type='p' AND v.number > 0 AND v.number <= DATEDIFF(m, (SELECT MIN(purchased_date) FROM Station), GETDATE()) + 1
GROUP BY v.number

Open in new window

Many thanks in advance to all!
0
 
LVL 35

Expert Comment

by:Robert Schutt
ID: 40205404
I had the change the structure a little bit for that, the count is now a subquery:
SELECT CONVERT(varchar(7), DATEADD(m, v.number-1, CONVERT(DATE, CONVERT(char(4), (SELECT MIN(purchased_date) FROM Station), 126) + '-01-01', 126))) AS [Year + Month]
, (SELECT COUNT(1) FROM Station WHERE purchased = 'True' AND purchased_date < DATEADD(m, v.number, CONVERT(DATE, CONVERT(char(4), (SELECT MIN(purchased_date) FROM Station), 126) + '-01-01', 126))) AS [Cumulative Count]
FROM master..spt_values v
WHERE v.type='p' AND v.number > 0 AND v.number <= DATEDIFF(M, CONVERT(DATE, CONVERT(char(4), (SELECT MIN(purchased_date) FROM Station), 126) + '-01-01', 126), GETDATE()) + 1
GROUP BY v.number
ORDER BY v.number

Open in new window

0
 
LVL 48

Assisted Solution

by:PortletPaul
PortletPaul earned 250 total points
ID: 40205870
"expected results" oh how I miss them.

Instead of words, a small sample of what you would like in tabular format would "speak volumes".
Yr	Mnth	PerMnth	CumInYr	CumCum
2011	1	10	10	10
2011	2	10	20	20
2011	3	10	30	30
2011	4	10	40	40
2011	5	10	50	50
2011	6	10	60	60
2011	7	10	70	70
2011	8	10	80	80
2011	9	10	90	90
2011	10	10	100	100
2011	11	10	110	110
2011	12	10	120	120
2012	1	10	10	130
2012	2	10	20	140
2012	3	10	30	150
2012	4	10	40	160
2012	5	10	50	170
2012	6	10	60	180
2012	7	10	70	190
2012	8	10	80	200
2012	9	10	90	210
2012	10	10	100	220
2012	11	10	110	230
2012	12	10	120	240

Open in new window

So, do you want
 PerMnth counts?
 CumInYr counts?
 CumCum counts?

So far most scripts have produced CumCum.

I would suggest you really need PerMnth then the rest can be easily derived (many graphing tools will do it for you)

Below is a re-work of Scott Pletchers that will start at Jan of the earliest year and finish at December of the current year, suitable for SQL Server 2012. Note that this now uses datefromparts() and it requires a left join instead of inner because of the Jan-01-MinYear start point.

DECLARE @starting_month datetime
DECLARE @ending_month datetime

SET @starting_month = (select MIN(purchased_date) FROM dbo.station)
SET @starting_month = datefromparts(year(@starting_month),1,1)
SET @ending_month = datefromparts(year(getdate()),12,31)


;WITH
cteTally10 AS (
    SELECT 0 AS tally UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
    SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
cteTally100 AS (
    SELECT 1 AS tally
    FROM cteTally10 c1
    CROSS JOIN cteTally10 c2
),
cteTally10K AS (
    SELECT ROW_NUMBER() OVER(ORDER BY c1.tally) - 1 AS tally
    FROM cteTally100 c1
    CROSS JOIN cteTally100 c2
)
SELECT
      Year(DATEADD(MONTH, t.tally, @starting_month))  AS Yr
    , MONTH(DATEADD(MONTH, t.tally, @starting_month)) AS Mnth
    , COUNT(c.purchased_date)                         AS Customer_Count
FROM cteTally10K t
      LEFT JOIN dbo.station c
                  ON c.purchased_date < DATEADD(MONTH, t.tally + 1, @starting_month)
WHERE t.tally BETWEEN 0 AND DATEDIFF(MONTH, @starting_month, @ending_month)
GROUP BY
      Year(DATEADD(MONTH, t.tally, @starting_month))
    , MONTH(DATEADD(MONTH, t.tally, @starting_month))
ORDER BY
      Yr, Mnth

Open in new window

0
 

Author Comment

by:petekipe
ID: 40205919
Paul, your last solution is great, but the results don't start at month 1 of the earliest year.  They do include all remaining months in the current year, however that's not really necessary.  Here's your latest output run against my database:
2004	12	2
2005	1	3
2005	2	3
2005	3	3
2005	4	3
2005	5	4
2005	6	4
2005	7	4
2005	8	4
2005	9	5
2005	10	5
2005	11	7
2005	12	7
2006	1	7
2006	2	7
2006	3	7
2006	4	7
2006	5	7
2006	6	7
2006	7	7
2006	8	7
2006	9	7
2006	10	9
2006	11	11
2006	12	16
2007	1	17
2007	2	19
2007	3	19
2007	4	19
2007	5	19
2007	6	21
2007	7	23
2007	8	23
2007	9	23
2007	10	24
2007	11	24
2007	12	26
2008	1	30
2008	2	37
2008	3	42
2008	4	47
2008	5	51
2008	6	53
2008	7	54
2008	8	56
2008	9	57
2008	10	59
2008	11	64
2008	12	70
2009	1	77
2009	2	81
2009	3	84
2009	4	86
2009	5	86
2009	6	87
2009	7	91
2009	8	91
2009	9	92
2009	10	95
2009	11	97
2009	12	100
2010	1	103
2010	2	103
2010	3	114
2010	4	120
2010	5	129
2010	6	147
2010	7	161
2010	8	185
2010	9	195
2010	10	203
2010	11	209
2010	12	216
2011	1	218
2011	2	226
2011	3	234
2011	4	239
2011	5	246
2011	6	251
2011	7	257
2011	8	263
2011	9	277
2011	10	286
2011	11	289
2011	12	292
2012	1	297
2012	2	305
2012	3	315
2012	4	332
2012	5	355
2012	6	363
2012	7	368
2012	8	372
2012	9	384
2012	10	389
2012	11	398
2012	12	402
2013	1	416
2013	2	426
2013	3	435
2013	4	438
2013	5	447
2013	6	459
2013	7	486
2013	8	489
2013	9	494
2013	10	506
2013	11	511
2013	12	511
2014	1	516
2014	2	519
2014	3	527
2014	4	537
2014	5	545
2014	6	551
2014	7	551
2014	8	551
2014	9	551
2014	10	551
2014	11	551
2014	12	551

Open in new window

What I'm looking for is:
2004-01	0
2004-02	0
2004-03	0
2004-04	0
2004-05	0
2004-06	0
2004-07	0
2004-08	0
2004-09	0
2004-10	0
2004-11	0
2004-12	2
2005-01	3
2005-02	3
2005-03	3
2005-04	3
2005-05	4
2005-06	4
2005-07	4
2005-08	4
2005-09	5
2005-10	5
2005-11	7
2005-12	7
2006-01	7
2006-02	7
2006-03	7
2006-04	7
2006-05	7
2006-06	7
2006-07	7
2006-08	7
2006-09	7
2006-10	9
2006-11	11
2006-12	16
2007-01	17
2007-02	19
2007-03	19
2007-04	19
2007-05	19
2007-06	21
2007-07	23
2007-08	23
2007-09	23
2007-10	24
2007-11	24
2007-12	26
2008-01	30
2008-02	37
2008-03	42
2008-04	47
2008-05	51
2008-06	53
2008-07	54
2008-08	56
2008-09	57
2008-10	59
2008-11	64
2008-12	70
2009-01	77
2009-02	81
2009-03	84
2009-04	86
2009-05	86
2009-06	87
2009-07	91
2009-08	91
2009-09	92
2009-10	95
2009-11	97
2009-12	100
2010-01	103
2010-02	103
2010-03	114
2010-04	120
2010-05	129
2010-06	147
2010-07	161
2010-08	185
2010-09	195
2010-10	203
2010-11	209
2010-12	216
2011-01	218
2011-02	226
2011-03	234
2011-04	239
2011-05	246
2011-06	251
2011-07	257
2011-08	263
2011-09	277
2011-10	286
2011-11	289
2011-12	292
2012-01	297
2012-02	305
2012-03	315
2012-04	332
2012-05	355
2012-06	363
2012-07	368
2012-08	372
2012-09	384
2012-10	389
2012-11	398
2012-12	402
2013-01	416
2013-02	426
2013-03	435
2013-04	438
2013-05	447
2013-06	459
2013-07	486
2013-08	489
2013-09	494
2013-10	506
2013-11	511
2013-12	511
2014-01	516
2014-02	519
2014-03	527
2014-04	537
2014-05	545
2014-06	551
2014-07	551

Open in new window

BTW, at some point earlier today, I decided that Robert's original solution showing yyyy-mm as a single column was easier to work with in my application, so I reverted back to it.  And for the record, our first two sales were in December of 2004.  Here's the output I've been doing all this for:NumberOfCustomersByMonth.bmpThe grid lines got screwed up when I resized the screen grab, but I'm sure you get the gist of it.

I would definitely like to go with a solution that doesn't use tricks or gimmicks, but I also don't want to continue to bug the group when I have something that essentially works.  Everyone here has been incredibly helpful, and I sincerely appreciate it!
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 40205945
could you try again?

and/or see: http://sqlfiddle.com/#!6/01ce5/1

this has 2004-12-11 as the minimum date, but the result starts at 2004-01-01
that's what this bit does:

DECLARE @starting_month datetime
DECLARE @ending_month datetime

SET @starting_month = (select MIN(purchased_date) FROM dbo.station)
SET @starting_month = datefromparts(year(@starting_month),1,1)
SET @ending_month = datefromparts(year(getdate()),12,31)

& make sure it is using a LEFT join (Scott's original was an inner join).

I presume you can return the separate columns back to a single column
0
 

Author Comment

by:petekipe
ID: 40206302
Paul,

When I ran your sqlfiddle code I got a result set that started at month 1 of the oldest year, and ended at month 12 of the current year. Another minor point is that to get the correct counts, I need to include a test for "purchased = 'True'", because of customers that may have cancelled after originally purchasing. (We keep the original purchase date for references purposes, but change "purchased" to False. Also, as I mentioned before, I've decided I want to return year and month as a single column in yyyy-mm format.

So I made some changes -- I'll post the code below. After the changes, my ending point is where I want it to be -- the current month, and my counts are correct per the above. But the result set starts at the month of the first purchase, not month 1 of that year. Here's the code:
DECLARE @starting_month datetime
DECLARE @ending_month datetime

SET @starting_month = (select MIN(purchased_date) FROM dbo.station WHERE purchased = 'True')
SET @starting_month = datefromparts(year(@starting_month),1,1)
SET @ending_month = eomonth(getdate())

;WITH
cteTally10 AS (
    SELECT 0 AS tally UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
    SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
),
cteTally100 AS (
    SELECT 1 AS tally
    FROM cteTally10 c1
    CROSS JOIN cteTally10 c2
),
cteTally10K AS (
    SELECT ROW_NUMBER() OVER(ORDER BY c1.tally) - 1 AS tally
    FROM cteTally100 c1
    CROSS JOIN cteTally100 c2
)
SELECT
      CONCAT(Year(DATEADD(MONTH, t.tally, @starting_month)),'-',
      FORMAT(MONTH(DATEADD(MONTH, t.tally, @starting_month)), '00')) AS YrMnth
    , COUNT(c.purchased_date)                         AS Customer_Count
FROM cteTally10K t
      LEFT JOIN dbo.station c
                  ON c.purchased_date < DATEADD(MONTH, t.tally + 1, @starting_month)
WHERE t.tally BETWEEN 0 AND DATEDIFF(MONTH, @starting_month, @ending_month) AND purchased = 'True'
GROUP BY
      Year(DATEADD(MONTH, t.tally, @starting_month))
    , MONTH(DATEADD(MONTH, t.tally, @starting_month))
ORDER BY
      YrMnth

Open in new window

...and here's the output:
2004-12	2
2005-01	3
2005-02	3
2005-03	3
2005-04	3
2005-05	4
2005-06	4
2005-07	4
2005-08	4
2005-09	5
2005-10	5
2005-11	7
2005-12	7
2006-01	7
2006-02	7
2006-03	7
2006-04	7
2006-05	7
2006-06	7
2006-07	7
2006-08	7
2006-09	7
2006-10	9
2006-11	11
2006-12	16
2007-01	17
2007-02	19
2007-03	19
2007-04	19
2007-05	19
2007-06	21
2007-07	23
2007-08	23
2007-09	23
2007-10	24
2007-11	24
2007-12	26
2008-01	30
2008-02	37
2008-03	42
2008-04	47
2008-05	51
2008-06	53
2008-07	54
2008-08	56
2008-09	57
2008-10	59
2008-11	64
2008-12	70
2009-01	77
2009-02	81
2009-03	84
2009-04	86
2009-05	86
2009-06	87
2009-07	91
2009-08	91
2009-09	92
2009-10	95
2009-11	97
2009-12	100
2010-01	103
2010-02	103
2010-03	114
2010-04	120
2010-05	129
2010-06	147
2010-07	161
2010-08	185
2010-09	195
2010-10	203
2010-11	209
2010-12	216
2011-01	218
2011-02	226
2011-03	234
2011-04	239
2011-05	246
2011-06	251
2011-07	257
2011-08	263
2011-09	277
2011-10	286
2011-11	289
2011-12	292
2012-01	297
2012-02	305
2012-03	315
2012-04	332
2012-05	355
2012-06	363
2012-07	368
2012-08	372
2012-09	384
2012-10	389
2012-11	398
2012-12	402
2013-01	416
2013-02	426
2013-03	435
2013-04	438
2013-05	447
2013-06	459
2013-07	486
2013-08	489
2013-09	494
2013-10	506
2013-11	511
2013-12	511
2014-01	516
2014-02	519
2014-03	527
2014-04	537
2014-05	545
2014-06	551
2014-07	551

Open in new window

Can you tell me what I've disturbed?
0
 
LVL 35

Expert Comment

by:Robert Schutt
ID: 40206376
Try putting the added criterium (AND purchased = 'True') 1 line up, in the join instead of the where.
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 250 total points
ID: 40206767
Robert is correct, and I would do it that way also.

ANY where condition you now place on dbo.station could have the effect of causing an inner join.

This is true of any outer join.

An alternative approach is this:

WHERE t.tally BETWEEN 0 AND DATEDIFF(MONTH, @starting_month, @ending_month)
AND ( c.purchased = 'True' OR c.purchased IS NULL )

You have to allow for the left join to produce a NULL (no joined record) so that the months prior to the first purchase can be listed.
0
 

Author Comment

by:petekipe
ID: 40206895
Paul and Robert, It's in and working perfectly. Thank you both again for all of your help. As an old time GUI guy, I have newfound respect for you SQL propeller-heads...
0

Featured Post

IT, Stop Being Called Into Every Meeting

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

Join & Write a Comment

In this article—a derivative of my DaytaBase.org blog post (http://daytabase.org/2011/06/18/what-week-is-it/)—I will explore a few different perspectives on which week today's date falls within using Microsoft SQL Server. First, to frame this stu…
I wrote this interesting script that really help me find jobs or procedures when working in a huge environment. I could I have written it as a Procedure but then I would have to have it on each machine or have a link to a server-related search that …
Familiarize people with the process of retrieving data from SQL Server using an Access pass-thru query. Microsoft Access is a very powerful client/server development tool. One of the ways that you can retrieve data from a SQL Server is by using a pa…
Via a live example, show how to set up a backup for SQL Server using a Maintenance Plan and how to schedule the job into SQL Server Agent.

743 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

11 Experts available now in Live!

Get 1:1 Help Now