Solved

Date Ranges/Case Statement

Posted on 2016-07-20
14
52 Views
Last Modified: 2016-07-28
I had posted this before and the solution I got back worked great (CTE's and Recursion) if I only ran it for a member at a time. With dates going back to 1994 and out as far as 2199 (which I adjusted using a case statement) it still was creating 16 billion rows for the date ranges which is not feasible for our databases.  I need to find a solution using a Case statement or another process that doesn't chew up all the memory or the tempdb.

Again here is my Current case statement and what I am getting as results, along with the sample data:

SELECT ENROLL.MBR_NBR,
            CASE
        WHEN PCP.PCP_EFCTV_DT < ENROLL.ENR_START THEN ENROLL.ENR_START
        WHEN PCP.PCP_EFCTV_DT IS NULL THEN ENROLL.ENR_START
        WHEN PCP.PCP_EFCTV_DT > ENROLL.ENR_END THEN ENROLL.ENR_START
        ELSE PCP.PCP_EFCTV_DT
    END AS 'EffectiveDate',                                                                                
    CASE
        WHEN PCP.PCP_TRMNTN_DT > ENROLL.ENR_END THEN ENROLL.ENR_END
        WHEN PCP.PCP_TRMNTN_DT IS NULL THEN ENROLL.ENR_END
        WHEN PCP.PCP_TRMNTN_DT < ENROLL.ENR_START THEN ENROLL.ENR_END
        ELSE PCP.PCP_TRMNTN_DT
    END AS 'TerminationDate',
      CASE
         WHEN PCP.[PCP_PROV_NBR] = 'NOPCP' then ''
       ELSE PCP.[PCP_PROV_NBR]
       END as PCP_PROV_NBR
                              
                                                                                                                                                  
FROM dbo.Blue_Raw_MCaid_Member_Enrollment  ENROLL left join dbo.Blue_Raw_MCaid_Member_Enrollment_PCP  PCP ON ENROLL.MBR_NBR=PCP.MBR_NBR
                                          AND (PCP.PCP_EFCTV_DT  BETWEEN ENROLL.ENR_START
                                          AND (CASE WHEN  ENROLL.ENR_END = '8888-12-31'
                                                        THEN '9999-12-31'  
                                                        ELSE ENROLL.ENR_END  
                                                        END)
                                          OR PCP.PCP_TRMNTN_DT  BETWEEN ENROLL.ENR_START
                                          AND (CASE WHEN ENROLL.ENR_END  = '8888-12-31'
                                                        THEN '9999-12-31'
                                                        ELSE ENROLL.ENR_END  
                                                        END)
                                          OR (PCP.PCP_EFCTV_DT  BETWEEN ENROLL.ENR_END
                                          AND (CASE WHEN ENROLL.ENR_END  = '8888-12-31'
                                                        THEN '9999-12-31'
                                                        ELSE ENROLL.ENR_END  
                                                        END)
                                          OR (PCP.PCP_EFCTV_DT  < ENROLL.ENR_START  and PCP.PCP_TRMNTN_DT  > ENROLL.ENR_END)                                          
                                                                        OR PCP.PCP_EFCTV_DT  IS NULL))



Result Set:  Unique Members is 2.7 million .  I need to be able to load this data to a table in the database which I can do using SSIS.

RESULT SET FROM QUERY                        
MBR_NBR      EffectiveDate         TerminationDate      PCP_PROV_NBR         Outcome
123456              1/1/2012               9/30/2013                       NULL                     Good Row
123456              11/1/2013             1/31/2014                       NULL                     This row is missing
123456               2/1/2014               9/30/2015                       ABCD                     Good Row
123456              10/1/2015             10/31/2015                     ABCD                     Good Row
123456              11/1/2015              12/31/8888                    WXYZ                     Good Row

Here is the test Data


Blue_Raw_MCaid_Member_Enrollment  (This table has 7.5 million rows)

MEM_NBR      ENR_START      ENR_END      PCP_PROV_NBR
123456               1/1/2012      9/30/2013      UNKNOWN
123456               11/1/2013      9/30/2015      UNKNOWN
123456               10/1/2015      12/31/8888      UNKNOWN

Blue_Raw_MCaid_Member_Enrollment_PCP   (This table has 5.8 million rows)


MBR_NBR      PCP_PROV_NBR      PCP_EFCTV_DT      PCP_TRMNTN_DT
123456                      ABCD               2/1/2014              10/31/2015
123456                      WXYZ             11/1/2015              12/31/8888
0
Comment
Question by:knamc99
  • 7
  • 7
14 Comments
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Do dates actually NEED to reach  12/31/8888  or beyond?

9999-12-31 is a few thousand years away and my guess any memberships now current won't be exercised at that date
0
 

Author Comment

by:knamc99
Comment Utility
no,  I am changing the case to 2016, and when I do the final load I will case them back
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Ok. Thanks.

So at this point I'm unclear what you need. Are you looking for greater efficiency but the same result? If not that, what?

Is this is a one off activity or something you need regularly?

Btw
Is there an expected result from your sample data?
0
 

Author Comment

by:knamc99
Comment Utility
This something I will need to run monthly.  using case runs effeciently,  using the CTE and Recusion is not.  

Here is the expected result:

RESULT SET FROM QUERY                        
MBR_NBR      EffectiveDate         TerminationDate      PCP_PROV_NBR         Outcome
123456              1/1/2012               9/30/2013                       NULL                     Good Row
123456              11/1/2013             1/31/2014                       NULL                     This row is missing
123456               2/1/2014               9/30/2015                       ABCD                     Good Row
123456              10/1/2015             10/31/2015                     ABCD                     Good Row
123456              11/1/2015              12/31/8888                    WXYZ                     Good Row

The second line is missing from my query,  this is what I need help to get.
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
OK. I remember contributing to the earlier questions where I could not get the recursive approach to work, and instead proposed a different approach to building the daily range that would would with less overhead. I suggest you look back at that question as that reply was more complete than this one.

Basically you simply need a table of dates (one row per date).
You could build it recursively and store it
You could build it non-recursively and store it
You could build it recursively and not retain it
You could build it non-recursively not retain it

The point is you could if you have the ability to store it build that table of dates independently and just re-use it when you need the query.

Here is A way to build the date range
;WITH
  mdates as (
        select min(dt1) min_dt, max(dt2) max_dt 
        from (
              select
                      min(ENR_START) dt1
                    , max(case when ENR_END < '8888-12-31' then ENR_END end)   dt2
              from Blue_Raw_MCaid_Member_Enrollment
              union all
              select
                      min(PCP_EFCTV_DT) 
                    , max(case when PCP_TRMNTN_DT < '8888-12-31' then PCP_TRMNTN_DT end)   
              from Blue_Raw_MCaid_Member_Enrollment_PCP
              ) x
    )
, cteDigits AS (
      SELECT 0 AS digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL
      SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
      )
, cteTally AS (
      SELECT 
              n1s.digit 
            + n10s.digit * 10
            + n100s.digit * 100  /* add more like this as needed */
            + n1000s.digit * 1000  /* add more like this as needed */
            + n10000s.digit * 10000  /* add more like this as needed */
            AS num
      FROM cteDigits n1s
      CROSS JOIN cteDigits n10s
      CROSS JOIN cteDigits n100s /* add more like this as needed */
      CROSS JOIN cteDigits n1000s /* add more like this as needed */
      CROSS JOIN cteDigits n10000s /* add more like this as needed */
      )
select
       e.MBR_NBR
from (
        select 
            dateadd(day,c.num,d.min_dt) dt, c.num
        from mdates d
        inner join cteTally as c on dateadd(day,c.num,d.min_dt)  <= d.max_dt
     ) r
left join Blue_Raw_MCaid_Member_Enrollment e on r.dt between e.ENR_START and e.ENR_END
left join Blue_Raw_MCaid_Member_Enrollment_PCP p on r.dt between p.PCP_EFCTV_DT and p.PCP_TRMNTN_DT
                                            and e.MBR_NBR = p.MBR_NBR

Open in new window

That query is NOT tested I'm afraid, nor regrettably do I have time for much more at the moment
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
By the way if you did retain a table of dates use a clustered index on the date column. That should aid efficiency for your query.
0
 

Author Comment

by:knamc99
Comment Utility
Yes I did use your solution without the recursion,  but the date range table is creating 16 billion rows!
0
Enabling OSINT in Activity Based Intelligence

Activity based intelligence (ABI) requires access to all available sources of data. Recorded Future allows analysts to observe structured data on the open, deep, and dark web.

 
LVL 48

Accepted Solution

by:
PortletPaul earned 500 total points
Comment Utility
I did not make myself clear it seems.

Create a table of dates.  One row per date. Use a clusteted index on the date column. The dates could span say 100 years in total and that would be only approx 36,525 rows.

Now, substitute that table into your working query and that query should be faster.
0
 

Author Comment

by:knamc99
Comment Utility
Use this table of dates instead of the mdates table?
0
 

Author Comment

by:knamc99
Comment Utility
I ran the Estimated Query plan on this SQL above and here are the estimated rows.  See attached file
Estimated-Row-Size.docx
0
 

Author Comment

by:knamc99
Comment Utility
The mdates query is putting min_dt at '01/01/1994,  the max_dt is '12/31/8888'.  that is what is wanting to create billions of rows.  How do I put the current year date to create the dates,  but this query wants to create billions of rows:

select
       e.MBR_NBR
from (
        select
            dateadd(day,c.num,d.min_dt) dt, c.num
        from mdates d
        inner join cteTally as c on dateadd(day,c.num,d.min_dt)  <= d.max_dt
     ) r
left join Blue_Raw_MCaid_Member_Enrollment e on r.dt between e.ENR_START and e.ENR_END
left join Blue_Raw_MCaid_Member_Enrollment_PCP p on r.dt between p.PCP_EFCTV_DT and p.PCP_TRMNTN_DT
                                            and e.MBR_NBR = p.MBR_NBR
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
sigh...

it was a mistiake for me to add the last 2 rows of code which were meant as EXAMPLES ONLY

to make your existng query faster, remove the recursion and replace it with a real table that is indexed properly. That table does not need to extend to 12/31/8888 or the year 9999, it just needs to extend to a maximum of dates after you exclude 12/31/8888

I suggest you probably only need a a decade or so into the future and you could add an extra year after Jan 1 2017 etc.

MY suggestion is ONLY about creating the table of dates.

Sorry if I have not been absolutely clear. Hopefully I am now.
0
 

Author Closing Comment

by:knamc99
Comment Utility
Created the table of dates and works great!  Thank you
0
 
LVL 48

Expert Comment

by:PortletPaul
Comment Utility
Excellent, that is good news. Thank you.
0

Featured Post

Comprehensive Backup Solutions for Microsoft

Acronis protects the complete Microsoft technology stack: Windows Server, Windows PC, laptop and Surface data; Microsoft business applications; Microsoft Hyper-V; Azure VMs; Microsoft Windows Server 2016; Microsoft Exchange 2016 and SQL Server 2016.

Join & Write a Comment

'Between' is such a common word we rarely think about it but in SQL it has a very specific definition we should be aware of. While most database vendors will have their own unique phrases to describe it (see references at end) the concept in common …
This article explains how to reset the password of the sa account on a Microsoft SQL Server.  The steps in this article work in SQL 2005, 2008, 2008 R2, 2012, 2014 and 2016.
In this seventh video of the Xpdf series, we discuss and demonstrate the PDFfonts utility, which lists all the fonts used in a PDF file. It does this via a command line interface, making it suitable for use in programs, scripts, batch files — any pl…
Access reports are powerful and flexible. Learn how to create a query and then a grouped report using the wizard. Modify the report design after the wizard is done to make it look better. There will be another video to explain how to put the final p…

763 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

7 Experts available now in Live!

Get 1:1 Help Now