rwaterz
asked on
How do I pull data based on consecutive purchase criteria
Hi,
We would like to find customers who have made 10 consecutive year purchases with us. The date range wouldn't be limited. Meaning, we are looking between the years of 2000 and 2015, so not limited to the past 10 years only.
A 10 year consecutive purchase would mean the patron purchased tickets each year between 2002-2011 for example. If a patron purchased each year between 2002-2007 (5 years), skipped 2008 and purchased each year between 2009-2013, (total of 11 years in length but two sets of 5 years) we would not want to include that person.
I hope that makes sense. Thoughts?
We would like to find customers who have made 10 consecutive year purchases with us. The date range wouldn't be limited. Meaning, we are looking between the years of 2000 and 2015, so not limited to the past 10 years only.
A 10 year consecutive purchase would mean the patron purchased tickets each year between 2002-2011 for example. If a patron purchased each year between 2002-2007 (5 years), skipped 2008 and purchased each year between 2009-2013, (total of 11 years in length but two sets of 5 years) we would not want to include that person.
I hope that makes sense. Thoughts?
This is going to require more than air code so please put some time into providing schema scripts, sample data, and expected output.
ASKER
Hi Brian - Thanks for the quick reply, happy to provide more info. Small query below has two tables and image directly below that is sample data it provides; LT_SUB_HIST which has the customer ticket purchase info and TR_SEASON which I am filtering on.
Desired output would be simple:
Notice in the sample data a field named "location". This field has (mainly) three possible records: "New", "Renew", or "Recapture {fiscal year}". New means absolute first year as a subscriber. Renew means they were either "New" or "Recapture" the year prior. Recapture{fiscal year} means they renewed but not directly from the year prior, thus the fiscal year of their last activity is present.
After writing this I realize it will be incredibly unlikely that anyone who was a recapture at any point won't fit the 10 consecutive year data pull were looking for.
select
sh.*
,s.id
,s.description
,s.type
,s.start_dt
,s.end_dt
,s.fyear
,s.inactive
from
LT_SUB_HIST sh
join TR_SEASON s on s.id = sh.season
where
s.type = 1
Desired output would be simple:
Notice in the sample data a field named "location". This field has (mainly) three possible records: "New", "Renew", or "Recapture {fiscal year}". New means absolute first year as a subscriber. Renew means they were either "New" or "Recapture" the year prior. Recapture{fiscal year} means they renewed but not directly from the year prior, thus the fiscal year of their last activity is present.
After writing this I realize it will be incredibly unlikely that anyone who was a recapture at any point won't fit the 10 consecutive year data pull were looking for.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
I have a pretty good understanding of what you are trying to accomplish. What would make this go faster is some useful sample data. A picture of data isn't very helpful. If we are working off the same data then we can bounce back and forth working toward a solution without worrying about the data being the issue.
Obviously don't include anything proprietary or private.
http://www.sqlservercentral.com/articles/Best+Practices/61537/
Obviously don't include anything proprietary or private.
http://www.sqlservercentral.com/articles/Best+Practices/61537/
ASKER
Scott - the generalized format comes extremely close to what I am looking for. There are some slight adjustments I would need to make but this looks like it will work quite well. I am testing this now, more to follow.
ASKER
Scott,
I am not sure I understand the "tally" table or its purpose. Can you explain?
I am not sure I understand the "tally" table or its purpose. Can you explain?
That was just to generate sample data for me to test with.
A tally table is literally simply a table of sequential numbers from 0 to (typically) 1 million.
This is a remarkably useful device for processing many things, including adjusting dates, duplicating rows, finding substrings within a long string, etc..
A tally table is literally simply a table of sequential numbers from 0 to (typically) 1 million.
This is a remarkably useful device for processing many things, including adjusting dates, duplicating rows, finding substrings within a long string, etc..
CREATE TABLE dbo.tally (
tally int NOT NULL,
CONSTRAINT tally__PK PRIMARY KEY CLUSTERED ( tally )
)
INSERT INTO dbo.tally VALUES(0);
;WITH
cteTally10 AS (
SELECT 0 AS tally UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL
SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0
),
cteTally100 AS (
SELECT 1 AS tally
FROM cteTally10 c1
CROSS JOIN cteTally10 c2
),
cteTally10K AS (
SELECT 1 AS tally
FROM cteTally100 c1
CROSS JOIN cteTally100 c2
),
cteTally1Mil AS (
SELECT ROW_NUMBER() OVER(ORDER BY c1.tally) AS tally
FROM cteTally100 c1
CROSS JOIN cteTally10K c2
)
INSERT INTO dbo.tally
SELECT tally
FROM cteTally1Mil
ORDER BY tally
ASKER
Excellent Scott, thank you!
You're welcome! But the original idea/technique was Itzik Ben-Gan's. Btw, ANY/ALL of his books on SQL/TSQL are awesome.
ASKER
That's really great, thanks for sharing!