Link to home
Start Free TrialLog in
Avatar of knamc99
knamc99

asked on

Date Ranges/Case Statement

I had posted this before and the solution I got back worked great (CTE's and Recursion) if I only ran it for a member at a time. With dates going back to 1994 and out as far as 2199 (which I adjusted using a case statement) it still was creating 16 billion rows for the date ranges which is not feasible for our databases.  I need to find a solution using a Case statement or another process that doesn't chew up all the memory or the tempdb.

Again here is my Current case statement and what I am getting as results, along with the sample data:

SELECT ENROLL.MBR_NBR,
            CASE
        WHEN PCP.PCP_EFCTV_DT < ENROLL.ENR_START THEN ENROLL.ENR_START
        WHEN PCP.PCP_EFCTV_DT IS NULL THEN ENROLL.ENR_START
        WHEN PCP.PCP_EFCTV_DT > ENROLL.ENR_END THEN ENROLL.ENR_START
        ELSE PCP.PCP_EFCTV_DT
    END AS 'EffectiveDate',                                                                                
    CASE
        WHEN PCP.PCP_TRMNTN_DT > ENROLL.ENR_END THEN ENROLL.ENR_END
        WHEN PCP.PCP_TRMNTN_DT IS NULL THEN ENROLL.ENR_END
        WHEN PCP.PCP_TRMNTN_DT < ENROLL.ENR_START THEN ENROLL.ENR_END
        ELSE PCP.PCP_TRMNTN_DT
    END AS 'TerminationDate',
      CASE
         WHEN PCP.[PCP_PROV_NBR] = 'NOPCP' then ''
       ELSE PCP.[PCP_PROV_NBR]
       END as PCP_PROV_NBR
                              
                                                                                                                                                  
FROM dbo.Blue_Raw_MCaid_Member_Enrollment  ENROLL left join dbo.Blue_Raw_MCaid_Member_Enrollment_PCP  PCP ON ENROLL.MBR_NBR=PCP.MBR_NBR
                                          AND (PCP.PCP_EFCTV_DT  BETWEEN ENROLL.ENR_START
                                          AND (CASE WHEN  ENROLL.ENR_END = '8888-12-31'
                                                        THEN '9999-12-31'  
                                                        ELSE ENROLL.ENR_END  
                                                        END)
                                          OR PCP.PCP_TRMNTN_DT  BETWEEN ENROLL.ENR_START
                                          AND (CASE WHEN ENROLL.ENR_END  = '8888-12-31'
                                                        THEN '9999-12-31'
                                                        ELSE ENROLL.ENR_END  
                                                        END)
                                          OR (PCP.PCP_EFCTV_DT  BETWEEN ENROLL.ENR_END
                                          AND (CASE WHEN ENROLL.ENR_END  = '8888-12-31'
                                                        THEN '9999-12-31'
                                                        ELSE ENROLL.ENR_END  
                                                        END)
                                          OR (PCP.PCP_EFCTV_DT  < ENROLL.ENR_START  and PCP.PCP_TRMNTN_DT  > ENROLL.ENR_END)                                          
                                                                        OR PCP.PCP_EFCTV_DT  IS NULL))



Result Set:  Unique Members is 2.7 million .  I need to be able to load this data to a table in the database which I can do using SSIS.

RESULT SET FROM QUERY                        
MBR_NBR      EffectiveDate         TerminationDate      PCP_PROV_NBR         Outcome
123456              1/1/2012               9/30/2013                       NULL                     Good Row
123456              11/1/2013             1/31/2014                       NULL                     This row is missing
123456               2/1/2014               9/30/2015                       ABCD                     Good Row
123456              10/1/2015             10/31/2015                     ABCD                     Good Row
123456              11/1/2015              12/31/8888                    WXYZ                     Good Row

Here is the test Data


Blue_Raw_MCaid_Member_Enrollment  (This table has 7.5 million rows)

MEM_NBR      ENR_START      ENR_END      PCP_PROV_NBR
123456               1/1/2012      9/30/2013      UNKNOWN
123456               11/1/2013      9/30/2015      UNKNOWN
123456               10/1/2015      12/31/8888      UNKNOWN

Blue_Raw_MCaid_Member_Enrollment_PCP   (This table has 5.8 million rows)


MBR_NBR      PCP_PROV_NBR      PCP_EFCTV_DT      PCP_TRMNTN_DT
123456                      ABCD               2/1/2014              10/31/2015
123456                      WXYZ             11/1/2015              12/31/8888
Avatar of PortletPaul
PortletPaul
Flag of Australia image

Do dates actually NEED to reach  12/31/8888  or beyond?

9999-12-31 is a few thousand years away and my guess any memberships now current won't be exercised at that date
Avatar of knamc99
knamc99

ASKER

no,  I am changing the case to 2016, and when I do the final load I will case them back
Ok. Thanks.

So at this point I'm unclear what you need. Are you looking for greater efficiency but the same result? If not that, what?

Is this is a one off activity or something you need regularly?

Btw
Is there an expected result from your sample data?
Avatar of knamc99

ASKER

This something I will need to run monthly.  using case runs effeciently,  using the CTE and Recusion is not.  

Here is the expected result:

RESULT SET FROM QUERY                        
MBR_NBR      EffectiveDate         TerminationDate      PCP_PROV_NBR         Outcome
123456              1/1/2012               9/30/2013                       NULL                     Good Row
123456              11/1/2013             1/31/2014                       NULL                     This row is missing
123456               2/1/2014               9/30/2015                       ABCD                     Good Row
123456              10/1/2015             10/31/2015                     ABCD                     Good Row
123456              11/1/2015              12/31/8888                    WXYZ                     Good Row

The second line is missing from my query,  this is what I need help to get.
OK. I remember contributing to the earlier questions where I could not get the recursive approach to work, and instead proposed a different approach to building the daily range that would would with less overhead. I suggest you look back at that question as that reply was more complete than this one.

Basically you simply need a table of dates (one row per date).
You could build it recursively and store it
You could build it non-recursively and store it
You could build it recursively and not retain it
You could build it non-recursively not retain it

The point is you could if you have the ability to store it build that table of dates independently and just re-use it when you need the query.

Here is A way to build the date range
;WITH
  mdates as (
        select min(dt1) min_dt, max(dt2) max_dt 
        from (
              select
                      min(ENR_START) dt1
                    , max(case when ENR_END < '8888-12-31' then ENR_END end)   dt2
              from Blue_Raw_MCaid_Member_Enrollment
              union all
              select
                      min(PCP_EFCTV_DT) 
                    , max(case when PCP_TRMNTN_DT < '8888-12-31' then PCP_TRMNTN_DT end)   
              from Blue_Raw_MCaid_Member_Enrollment_PCP
              ) x
    )
, cteDigits AS (
      SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
      SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
      )
, cteTally AS (
      SELECT 
              n1s.digit 
            + n10s.digit * 10
            + n100s.digit * 100  /* add more like this as needed */
            + n1000s.digit * 1000  /* add more like this as needed */
            + n10000s.digit * 10000  /* add more like this as needed */
            AS num
      FROM cteDigits n1s
      CROSS JOIN cteDigits n10s
      CROSS JOIN cteDigits n100s /* add more like this as needed */
      CROSS JOIN cteDigits n1000s /* add more like this as needed */
      CROSS JOIN cteDigits n10000s /* add more like this as needed */
      )
select
       e.MBR_NBR
from (
        select 
            dateadd(day,c.num,d.min_dt) dt, c.num
        from mdates d
        inner join cteTally as c on dateadd(day,c.num,d.min_dt)  <= d.max_dt
     ) r
left join Blue_Raw_MCaid_Member_Enrollment e on r.dt between e.ENR_START and e.ENR_END
left join Blue_Raw_MCaid_Member_Enrollment_PCP p on r.dt between p.PCP_EFCTV_DT and p.PCP_TRMNTN_DT
                                            and e.MBR_NBR = p.MBR_NBR

Open in new window

That query is NOT tested I'm afraid, nor regrettably do I have time for much more at the moment
By the way if you did retain a table of dates use a clustered index on the date column. That should aid efficiency for your query.
Avatar of knamc99

ASKER

Yes I did use your solution without the recursion,  but the date range table is creating 16 billion rows!
ASKER CERTIFIED SOLUTION
Avatar of PortletPaul
PortletPaul
Flag of Australia image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
Avatar of knamc99

ASKER

Use this table of dates instead of the mdates table?
Avatar of knamc99

ASKER

I ran the Estimated Query plan on this SQL above and here are the estimated rows.  See attached file
Estimated-Row-Size.docx
Avatar of knamc99

ASKER

The mdates query is putting min_dt at '01/01/1994,  the max_dt is '12/31/8888'.  that is what is wanting to create billions of rows.  How do I put the current year date to create the dates,  but this query wants to create billions of rows:

select
       e.MBR_NBR
from (
        select
            dateadd(day,c.num,d.min_dt) dt, c.num
        from mdates d
        inner join cteTally as c on dateadd(day,c.num,d.min_dt)  <= d.max_dt
     ) r
left join Blue_Raw_MCaid_Member_Enrollment e on r.dt between e.ENR_START and e.ENR_END
left join Blue_Raw_MCaid_Member_Enrollment_PCP p on r.dt between p.PCP_EFCTV_DT and p.PCP_TRMNTN_DT
                                            and e.MBR_NBR = p.MBR_NBR
sigh...

it was a mistiake for me to add the last 2 rows of code which were meant as EXAMPLES ONLY

to make your existng query faster, remove the recursion and replace it with a real table that is indexed properly. That table does not need to extend to 12/31/8888 or the year 9999, it just needs to extend to a maximum of dates after you exclude 12/31/8888

I suggest you probably only need a a decade or so into the future and you could add an extra year after Jan 1 2017 etc.

MY suggestion is ONLY about creating the table of dates.

Sorry if I have not been absolutely clear. Hopefully I am now.
Avatar of knamc99

ASKER

Created the table of dates and works great!  Thank you
Excellent, that is good news. Thank you.