knamc99
asked on
Date Ranges/Case Statement
I had posted this before and the solution I got back worked great (CTE's and Recursion) if I only ran it for a member at a time. With dates going back to 1994 and out as far as 2199 (which I adjusted using a case statement) it still was creating 16 billion rows for the date ranges which is not feasible for our databases. I need to find a solution using a Case statement or another process that doesn't chew up all the memory or the tempdb.
Again here is my Current case statement and what I am getting as results, along with the sample data:
SELECT ENROLL.MBR_NBR,
CASE
WHEN PCP.PCP_EFCTV_DT < ENROLL.ENR_START THEN ENROLL.ENR_START
WHEN PCP.PCP_EFCTV_DT IS NULL THEN ENROLL.ENR_START
WHEN PCP.PCP_EFCTV_DT > ENROLL.ENR_END THEN ENROLL.ENR_START
ELSE PCP.PCP_EFCTV_DT
END AS 'EffectiveDate',
CASE
WHEN PCP.PCP_TRMNTN_DT > ENROLL.ENR_END THEN ENROLL.ENR_END
WHEN PCP.PCP_TRMNTN_DT IS NULL THEN ENROLL.ENR_END
WHEN PCP.PCP_TRMNTN_DT < ENROLL.ENR_START THEN ENROLL.ENR_END
ELSE PCP.PCP_TRMNTN_DT
END AS 'TerminationDate',
CASE
WHEN PCP.[PCP_PROV_NBR] = 'NOPCP' then ''
ELSE PCP.[PCP_PROV_NBR]
END as PCP_PROV_NBR
FROM dbo.Blue_Raw_MCaid_Member_ Enrollment ENROLL left join dbo.Blue_Raw_MCaid_Member_ Enrollment _PCP PCP ON ENROLL.MBR_NBR=PCP.MBR_NBR
AND (PCP.PCP_EFCTV_DT BETWEEN ENROLL.ENR_START
AND (CASE WHEN ENROLL.ENR_END = '8888-12-31'
THEN '9999-12-31'
ELSE ENROLL.ENR_END
END)
OR PCP.PCP_TRMNTN_DT BETWEEN ENROLL.ENR_START
AND (CASE WHEN ENROLL.ENR_END = '8888-12-31'
THEN '9999-12-31'
ELSE ENROLL.ENR_END
END)
OR (PCP.PCP_EFCTV_DT BETWEEN ENROLL.ENR_END
AND (CASE WHEN ENROLL.ENR_END = '8888-12-31'
THEN '9999-12-31'
ELSE ENROLL.ENR_END
END)
OR (PCP.PCP_EFCTV_DT < ENROLL.ENR_START and PCP.PCP_TRMNTN_DT > ENROLL.ENR_END)
OR PCP.PCP_EFCTV_DT IS NULL))
Result Set: Unique Members is 2.7 million . I need to be able to load this data to a table in the database which I can do using SSIS.
RESULT SET FROM QUERY
MBR_NBR EffectiveDate TerminationDate PCP_PROV_NBR Outcome
123456 1/1/2012 9/30/2013 NULL Good Row
123456 11/1/2013 1/31/2014 NULL This row is missing
123456 2/1/2014 9/30/2015 ABCD Good Row
123456 10/1/2015 10/31/2015 ABCD Good Row
123456 11/1/2015 12/31/8888 WXYZ Good Row
Here is the test Data
Blue_Raw_MCaid_Member_Enro llment (This table has 7.5 million rows)
MEM_NBR ENR_START ENR_END PCP_PROV_NBR
123456 1/1/2012 9/30/2013 UNKNOWN
123456 11/1/2013 9/30/2015 UNKNOWN
123456 10/1/2015 12/31/8888 UNKNOWN
Blue_Raw_MCaid_Member_Enro llment_PCP (This table has 5.8 million rows)
MBR_NBR PCP_PROV_NBR PCP_EFCTV_DT PCP_TRMNTN_DT
123456 ABCD 2/1/2014 10/31/2015
123456 WXYZ 11/1/2015 12/31/8888
Again here is my Current case statement and what I am getting as results, along with the sample data:
SELECT ENROLL.MBR_NBR,
CASE
WHEN PCP.PCP_EFCTV_DT < ENROLL.ENR_START THEN ENROLL.ENR_START
WHEN PCP.PCP_EFCTV_DT IS NULL THEN ENROLL.ENR_START
WHEN PCP.PCP_EFCTV_DT > ENROLL.ENR_END THEN ENROLL.ENR_START
ELSE PCP.PCP_EFCTV_DT
END AS 'EffectiveDate',
CASE
WHEN PCP.PCP_TRMNTN_DT > ENROLL.ENR_END THEN ENROLL.ENR_END
WHEN PCP.PCP_TRMNTN_DT IS NULL THEN ENROLL.ENR_END
WHEN PCP.PCP_TRMNTN_DT < ENROLL.ENR_START THEN ENROLL.ENR_END
ELSE PCP.PCP_TRMNTN_DT
END AS 'TerminationDate',
CASE
WHEN PCP.[PCP_PROV_NBR] = 'NOPCP' then ''
ELSE PCP.[PCP_PROV_NBR]
END as PCP_PROV_NBR
FROM dbo.Blue_Raw_MCaid_Member_
AND (PCP.PCP_EFCTV_DT BETWEEN ENROLL.ENR_START
AND (CASE WHEN ENROLL.ENR_END = '8888-12-31'
THEN '9999-12-31'
ELSE ENROLL.ENR_END
END)
OR PCP.PCP_TRMNTN_DT BETWEEN ENROLL.ENR_START
AND (CASE WHEN ENROLL.ENR_END = '8888-12-31'
THEN '9999-12-31'
ELSE ENROLL.ENR_END
END)
OR (PCP.PCP_EFCTV_DT BETWEEN ENROLL.ENR_END
AND (CASE WHEN ENROLL.ENR_END = '8888-12-31'
THEN '9999-12-31'
ELSE ENROLL.ENR_END
END)
OR (PCP.PCP_EFCTV_DT < ENROLL.ENR_START and PCP.PCP_TRMNTN_DT > ENROLL.ENR_END)
OR PCP.PCP_EFCTV_DT IS NULL))
Result Set: Unique Members is 2.7 million . I need to be able to load this data to a table in the database which I can do using SSIS.
RESULT SET FROM QUERY
MBR_NBR EffectiveDate TerminationDate PCP_PROV_NBR Outcome
123456 1/1/2012 9/30/2013 NULL Good Row
123456 11/1/2013 1/31/2014 NULL This row is missing
123456 2/1/2014 9/30/2015 ABCD Good Row
123456 10/1/2015 10/31/2015 ABCD Good Row
123456 11/1/2015 12/31/8888 WXYZ Good Row
Here is the test Data
Blue_Raw_MCaid_Member_Enro
MEM_NBR ENR_START ENR_END PCP_PROV_NBR
123456 1/1/2012 9/30/2013 UNKNOWN
123456 11/1/2013 9/30/2015 UNKNOWN
123456 10/1/2015 12/31/8888 UNKNOWN
Blue_Raw_MCaid_Member_Enro
MBR_NBR PCP_PROV_NBR PCP_EFCTV_DT PCP_TRMNTN_DT
123456 ABCD 2/1/2014 10/31/2015
123456 WXYZ 11/1/2015 12/31/8888
ASKER
no, I am changing the case to 2016, and when I do the final load I will case them back
Ok. Thanks.
So at this point I'm unclear what you need. Are you looking for greater efficiency but the same result? If not that, what?
Is this is a one off activity or something you need regularly?
Btw
Is there an expected result from your sample data?
So at this point I'm unclear what you need. Are you looking for greater efficiency but the same result? If not that, what?
Is this is a one off activity or something you need regularly?
Btw
Is there an expected result from your sample data?
ASKER
This something I will need to run monthly. using case runs effeciently, using the CTE and Recusion is not.
Here is the expected result:
RESULT SET FROM QUERY
MBR_NBR EffectiveDate TerminationDate PCP_PROV_NBR Outcome
123456 1/1/2012 9/30/2013 NULL Good Row
123456 11/1/2013 1/31/2014 NULL This row is missing
123456 2/1/2014 9/30/2015 ABCD Good Row
123456 10/1/2015 10/31/2015 ABCD Good Row
123456 11/1/2015 12/31/8888 WXYZ Good Row
The second line is missing from my query, this is what I need help to get.
Here is the expected result:
RESULT SET FROM QUERY
MBR_NBR EffectiveDate TerminationDate PCP_PROV_NBR Outcome
123456 1/1/2012 9/30/2013 NULL Good Row
123456 11/1/2013 1/31/2014 NULL This row is missing
123456 2/1/2014 9/30/2015 ABCD Good Row
123456 10/1/2015 10/31/2015 ABCD Good Row
123456 11/1/2015 12/31/8888 WXYZ Good Row
The second line is missing from my query, this is what I need help to get.
OK. I remember contributing to the earlier questions where I could not get the recursive approach to work, and instead proposed a different approach to building the daily range that would would with less overhead. I suggest you look back at that question as that reply was more complete than this one.
Basically you simply need a table of dates (one row per date).
You could build it recursively and store it
You could build it non-recursively and store it
You could build it recursively and not retain it
You could build it non-recursively not retain it
The point is you could if you have the ability to store it build that table of dates independently and just re-use it when you need the query.
Here is A way to build the date range
Basically you simply need a table of dates (one row per date).
You could build it recursively and store it
You could build it non-recursively and store it
You could build it recursively and not retain it
You could build it non-recursively not retain it
The point is you could if you have the ability to store it build that table of dates independently and just re-use it when you need the query.
Here is A way to build the date range
;WITH
mdates as (
select min(dt1) min_dt, max(dt2) max_dt
from (
select
min(ENR_START) dt1
, max(case when ENR_END < '8888-12-31' then ENR_END end) dt2
from Blue_Raw_MCaid_Member_Enrollment
union all
select
min(PCP_EFCTV_DT)
, max(case when PCP_TRMNTN_DT < '8888-12-31' then PCP_TRMNTN_DT end)
from Blue_Raw_MCaid_Member_Enrollment_PCP
) x
)
, cteDigits AS (
SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
)
, cteTally AS (
SELECT
n1s.digit
+ n10s.digit * 10
+ n100s.digit * 100 /* add more like this as needed */
+ n1000s.digit * 1000 /* add more like this as needed */
+ n10000s.digit * 10000 /* add more like this as needed */
AS num
FROM cteDigits n1s
CROSS JOIN cteDigits n10s
CROSS JOIN cteDigits n100s /* add more like this as needed */
CROSS JOIN cteDigits n1000s /* add more like this as needed */
CROSS JOIN cteDigits n10000s /* add more like this as needed */
)
select
e.MBR_NBR
from (
select
dateadd(day,c.num,d.min_dt) dt, c.num
from mdates d
inner join cteTally as c on dateadd(day,c.num,d.min_dt) <= d.max_dt
) r
left join Blue_Raw_MCaid_Member_Enrollment e on r.dt between e.ENR_START and e.ENR_END
left join Blue_Raw_MCaid_Member_Enrollment_PCP p on r.dt between p.PCP_EFCTV_DT and p.PCP_TRMNTN_DT
and e.MBR_NBR = p.MBR_NBR
That query is NOT tested I'm afraid, nor regrettably do I have time for much more at the moment
By the way if you did retain a table of dates use a clustered index on the date column. That should aid efficiency for your query.
ASKER
Yes I did use your solution without the recursion, but the date range table is creating 16 billion rows!
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
Use this table of dates instead of the mdates table?
ASKER
I ran the Estimated Query plan on this SQL above and here are the estimated rows. See attached file
Estimated-Row-Size.docx
Estimated-Row-Size.docx
ASKER
The mdates query is putting min_dt at '01/01/1994, the max_dt is '12/31/8888'. that is what is wanting to create billions of rows. How do I put the current year date to create the dates, but this query wants to create billions of rows:
select
e.MBR_NBR
from (
select
dateadd(day,c.num,d.min_dt ) dt, c.num
from mdates d
inner join cteTally as c on dateadd(day,c.num,d.min_dt ) <= d.max_dt
) r
left join Blue_Raw_MCaid_Member_Enro llment e on r.dt between e.ENR_START and e.ENR_END
left join Blue_Raw_MCaid_Member_Enro llment_PCP p on r.dt between p.PCP_EFCTV_DT and p.PCP_TRMNTN_DT
and e.MBR_NBR = p.MBR_NBR
select
e.MBR_NBR
from (
select
dateadd(day,c.num,d.min_dt
from mdates d
inner join cteTally as c on dateadd(day,c.num,d.min_dt
) r
left join Blue_Raw_MCaid_Member_Enro
left join Blue_Raw_MCaid_Member_Enro
and e.MBR_NBR = p.MBR_NBR
sigh...
it was a mistiake for me to add the last 2 rows of code which were meant as EXAMPLES ONLY
to make your existng query faster, remove the recursion and replace it with a real table that is indexed properly. That table does not need to extend to 12/31/8888 or the year 9999, it just needs to extend to a maximum of dates after you exclude 12/31/8888
I suggest you probably only need a a decade or so into the future and you could add an extra year after Jan 1 2017 etc.
MY suggestion is ONLY about creating the table of dates.
Sorry if I have not been absolutely clear. Hopefully I am now.
it was a mistiake for me to add the last 2 rows of code which were meant as EXAMPLES ONLY
to make your existng query faster, remove the recursion and replace it with a real table that is indexed properly. That table does not need to extend to 12/31/8888 or the year 9999, it just needs to extend to a maximum of dates after you exclude 12/31/8888
I suggest you probably only need a a decade or so into the future and you could add an extra year after Jan 1 2017 etc.
MY suggestion is ONLY about creating the table of dates.
Sorry if I have not been absolutely clear. Hopefully I am now.
ASKER
Created the table of dates and works great! Thank you
Excellent, that is good news. Thank you.
9999-12-31 is a few thousand years away and my guess any memberships now current won't be exercised at that date