moe57
asked on
Finding records based on their previous dates and grouping them by location using SQL
Hi,
I have new requirement for this logic and it uses 2 main locations and the locations are categorized into these 2 groups: ER group and Clinic group
• ER Group Locations = ER, OBS and IP
• Clinic Group Locations = Clinic and OP
So the logic is following:
1) ER Group
a) Output this record, if patient makes his/her first visit within 48 hours to any of the ER group locations
b) Output any subsequent visits to any of the ER locations after 48 from their last visit (last visits = last output record)
c) output if same patient makes a visit to any of the Clinic group locations after 90 days from their last visit (last visit = last output record for this patient)
2) Clinic Group
a) Output if patient makes his/her first visit to any of Clinic group locations
b) Do not output any subsequent visits to any of the Clinic group locations for the same patient
c) Output if same patient makes a visit to any of the ER group locations after 48 hours from their last clinic visit (last visit = last output record)
here is some examples for different visits
here is the final output should look like
here is sample data to test it
thank you all for your help.
I have new requirement for this logic and it uses 2 main locations and the locations are categorized into these 2 groups: ER group and Clinic group
• ER Group Locations = ER, OBS and IP
• Clinic Group Locations = Clinic and OP
So the logic is following:
1) ER Group
a) Output this record, if patient makes his/her first visit within 48 hours to any of the ER group locations
b) Output any subsequent visits to any of the ER locations after 48 from their last visit (last visits = last output record)
c) output if same patient makes a visit to any of the Clinic group locations after 90 days from their last visit (last visit = last output record for this patient)
2) Clinic Group
a) Output if patient makes his/her first visit to any of Clinic group locations
b) Do not output any subsequent visits to any of the Clinic group locations for the same patient
c) Output if same patient makes a visit to any of the ER group locations after 48 hours from their last clinic visit (last visit = last output record)
here is some examples for different visits
here is the final output should look like
here is sample data to test it
thank you all for your help.
ASKER
thanks but it is not as easy as that because there are some rules that you have to take into consideration as i have described in my initial post.
ASKER
okay, this is what i have but it is not quite right yet. The problem with this query is that is not applying the 48 and the 90 days rule correctly because we always need to look at the last output record and from that date that is where we start counting if patient visited ER group within 48 hours.
here is the query that just need to be adjusted
here is the query that just need to be adjusted
;with grp as (
SELECT CASE WHEN LOCATION IN ('ER', 'OBS', 'IP') THEN 'ER'
WHEN LOCATION IN('CLINIC', 'OP') THEN 'CLINIC' END GRP, *
FROM TABLE1 )
,grprn AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY [Name] ORDER BY [DATE]) rn, * FROM grp
)
,ERs AS
(
SELECT * FROM grprn WHERE GRP = 'ER'
)
,CLs AS
(
SELECT * FROM grprn WHERE GRP = 'CLINIC'
)
SELECT [Name], [Location], [DATE]
FROM ERs WHERE RN = 1
UNION ALL
SELECT [Name], [Location], [DATE]
FROM ERs a
WHERE RN > 1 AND EXISTS(SELECT 1 FROM ERs WHERE RN = 1 AND [Name] = a.[Name])
AND NOT EXISTS(SELECT 1 FROM ERs WHERE RN = 1 AND [Name] = a.[Name] AND a.[Date] < DateAdd(hh, 48, [Date]))
UNION ALL
SELECT [Name], [Location], [DATE]
FROM CLs b
WHERE RN > 1 AND EXISTS(SELECT 1 FROM ERs WHERE RN = 1 AND [Name] = b.[Name])
AND NOT EXISTS(SELECT 1 FROM ERs WHERE RN = 1 AND [Name] = b.[Name] AND b.[Date] < DateAdd(d, 90, [Date]))
UNION ALL
SELECT [Name], [Location], [DATE]
FROM CLs WHERE RN = 1
UNION ALL
SELECT [Name], [Location], [DATE]
FROM ERs c
WHERE RN > 1 AND EXISTS(SELECT 1 FROM CLs WHERE RN = 1 AND [Name] = c.[Name])
AND NOT EXISTS(SELECT 1 FROM CLs WHERE RN = 1 AND [Name] = c.[Name] AND c.[Date] < DateAdd(hh, 48, [Date]))
ORDER BY 1, 3
>here is some examples for different visits
>here is the final output should look like
>here is sample data to test it
Either my eyesight is going bad, or I don't see any of these samples/attachments.
Also, there is a wompload of business logic here, but without a data diagram that shows your tables and relationships it is very difficult to conceptualize what you are asking.
>here is the final output should look like
>here is sample data to test it
Either my eyesight is going bad, or I don't see any of these samples/attachments.
Also, there is a wompload of business logic here, but without a data diagram that shows your tables and relationships it is very difficult to conceptualize what you are asking.
ASKER
hmmm! you are right but I had details post out there for the last couple of days but it seems that some of my post was deleted this morning and i am not able to even edit it. I am not sure what is going on..
@Jim - Weird, but the sample and final output were there earlier - I copied them into SQL fiddle. Is it possible for the OP to edit the question?
This is the result of what I copied into SQLFiddle to create the schema...
@Moe, I'm thought I was close to solving it. The one part of the scenario that is causing problems is this one:
('MIKE', 'ER', '2014-01-01 00:00:00', 'output it is first visit'),
('MIKE', 'ER', '2014-01-02 00:00:00', 'do not output, it is <48'),
('MIKE', 'OBS', '2014-01-03 10:00:00', 'output this because it is >48 from his last visit on 1/1/2014'),
Note that the third line for MIKE is not actually >48 after his last ER visit (it is the next day), but it IS 48 hours after the most recent ER visit that matched the output rules. I'm struggling to build suitable CTE that can identify these properly. Potentially a patient could go to ER every day, and only every other visit would match the output rule.
@Experts: I'm out of time on this, so don't hesitate to jump in if you're inclined.
This is the result of what I copied into SQLFiddle to create the schema...
CREATE TABLE Table1
([NAME] varchar(5), [LOCATION] varchar(6), [DATE] date, [NOTES] varchar(72))
;
INSERT INTO Table1
([NAME], [LOCATION], [DATE], [NOTES])
VALUES
('MIKE', 'ER', '2014-01-01 00:00:00', 'output it is first visit'),
('MIKE', 'ER', '2014-01-02 00:00:00', 'do not output, it is <48'),
('MIKE', 'OBS', '2014-01-03 10:00:00', 'output this because it is >48 from his last visit on 1/1/2014'),
('MIKE', 'IP', '2014-01-04 00:00:00', 'do not output because it is less than 48 from his last visit on 1/3/2014'),
('MIKE', 'IP', '2014-01-29 00:00:00', 'output this, it is >48 from his last visit on 1/3/2014'),
('MIKE', 'OBS', '2014-04-15 01:00:00', 'output this, it is >48 from his last visit on 1/29/2014'),
('MIKE', 'CLINIC', '2014-05-20 01:00:00', 'do not output, it is <90 from his last visit on 4/15/2014'),
('MIKE', 'OP', '2014-06-20 01:00:00', 'do not output, it is <90 from his last visit on 4/15/2014'),
('MIKE', 'CLINIC', '2014-09-15 01:00:00', 'output, it is >90 from his ER visit on 4/15/2014'),
('DAVID', 'CLINIC', '2014-03-03 00:00:00', 'output it is first visit'),
('DAVID', 'CLINIC', '2014-03-04 00:00:00', 'do not output'),
('DAVID', 'CLINIC', '2014-07-05 01:00:00', 'do not output'),
('DAVID', 'ER', '2014-03-15 00:00:00', 'output this because it is >48 from his last clinic visit on 3/3/2014'),
('DAVID', 'CLINIC', '2014-03-18 00:00:00', 'do not output because it is < 90 from his ER visit on 3/15/2014'),
('DAVID', 'OBS', '2014-03-19 00:00:00', 'output, it is >48 from his last ER visit on 3/15/2014')
;
@Moe, I'm thought I was close to solving it. The one part of the scenario that is causing problems is this one:
('MIKE', 'ER', '2014-01-01 00:00:00', 'output it is first visit'),
('MIKE', 'ER', '2014-01-02 00:00:00', 'do not output, it is <48'),
('MIKE', 'OBS', '2014-01-03 10:00:00', 'output this because it is >48 from his last visit on 1/1/2014'),
Note that the third line for MIKE is not actually >48 after his last ER visit (it is the next day), but it IS 48 hours after the most recent ER visit that matched the output rules. I'm struggling to build suitable CTE that can identify these properly. Potentially a patient could go to ER every day, and only every other visit would match the output rule.
@Experts: I'm out of time on this, so don't hesitate to jump in if you're inclined.
I couldn't give it up so close to the end:
Working example here on SQLFiddle
Returns 8 rows as required. I'm not claiming it's the most elegant way to do it...
It uses 3 CTEs.
1. order by person and identify location groups
2. Compare to date of previous row
3. Generate final result set by removing non-eligible rows
It has some superfluous columns in the CTEs but I'm out of time. Can tune on Monday if necessary.
Excellent link on simple talk for solving complex T-SQL problems step by step.
Working example here on SQLFiddle
Returns 8 rows as required. I'm not claiming it's the most elegant way to do it...
It uses 3 CTEs.
1. order by person and identify location groups
2. Compare to date of previous row
3. Generate final result set by removing non-eligible rows
It has some superfluous columns in the CTEs but I'm out of time. Can tune on Monday if necessary.
;with cte (personID,ItemRank,ItemDate,ItemLocation,LocGroup,Notes)
as
(select
name
,ROW_NUMBER() OVER (partition by name ORDER BY name,date)
,date
,location
,case when location in ('ER', 'OBS' ,'IP') then 'ER' ELSE 'CLINIC' END
,notes
from table1
),
cteFilter (personID,ItemRank,LocRank,ItemDate,PrevDate,UseDate,ItemLocation,LocGroup,Notes) as
(select
a.personID,
ROW_NUMBER() OVER (partition by a.personid ORDER BY a.personid,a.itemdate)
,row_number() over (order by a.locgroup,a.personid)
,a.ItemDate,coalesce(p.itemdate,null)
,case when a.locgroup = 'ER' and a.itemdate <dateadd(hh,48,coalesce(p.itemdate,null)) then p.itemdate
when a.locgroup='CLINIC' and a.itemdate <dateadd(d,90,coalesce(p.itemdate,null)) then p.itemdate
else a.itemdate end as UseDate
,a.ItemLocation
,a.LocGroup,a.Notes
from cte A
left join cte P on a.personID=p.personID and a.Itemrank=p.itemrank+1
),
cteFinal (personID,ItemRank,LocRank,ItemDate,PrevDate,UseDate,ItemLocation,LocGroup,Notes) as
(
select
a.personID,a.ItemRank,a.LocRank,a.ItemDate,a.PrevDate,a.UseDate,a.Itemlocation,a.LocGroup,a.Notes
from ctefilter A left join ctefilter P on a.personID=p.personID and a.Itemrank=p.itemrank+1
where a.itemrank=1
or (a.locgroup='ER' and a.itemdate > dateadd(hh,48,p.usedate))
or (a.locgroup ='CLINIC' AND dateadd(d,90,coalesce(p.usedate,p.itemdate))<=a.itemdate
AND
(select locGroup from cte where itemrank=1 and personID=a.personid) ='ER')
)
select personID,ItemDate,Itemlocation,Notes from ctefinal
order by personid,itemdate;
--
select * from table1 where notes like 'Output%'
order by name,date;
Excellent link on simple talk for solving complex T-SQL problems step by step.
ASKER
SimonAdept,
Thank you so much, the solution works great with the sample data, however when i try to test against my real data i get this error: " transaction log for database tempdb is full error". I was doing some testing all day and it seems it does not like when we use the OR operation:
Thank you so much, the solution works great with the sample data, however when i try to test against my real data i get this error: " transaction log for database tempdb is full error". I was doing some testing all day and it seems it does not like when we use the OR operation:
OR ( A.locgroup = 'ER'
AND A.itemdate > Dateadd(hh, 48, P.usedate) )
OR ( A.locgroup = 'CLINIC'
if i remove that then the query will run fine. Thanks
Hi Moe,
How many rows in your live data?
How big is your tempdb?
How much free space is there on the drive?
Is tempdb a fixed size or is it allowed to autogrow? - allow autogrow if currently fixed (and there is drive space available)
Is it simple or full recovery model? - if full, I'd suggest setting it to simple.
If you're doing this sort of query on millions of rows of data you need appropriate server resources (disk space and RAM assigned to SQL Server).
As a short term fix for today:
1. You could try breaking your live data down into chunks (do an initial select into a staging table to get names beginning A-N, then M-Z).
2. You could also move some of the load out of tempdb by replacing the first cte with a query that creates a staging table in another database.
How many rows in your live data?
How big is your tempdb?
How much free space is there on the drive?
Is tempdb a fixed size or is it allowed to autogrow? - allow autogrow if currently fixed (and there is drive space available)
Is it simple or full recovery model? - if full, I'd suggest setting it to simple.
If you're doing this sort of query on millions of rows of data you need appropriate server resources (disk space and RAM assigned to SQL Server).
As a short term fix for today:
1. You could try breaking your live data down into chunks (do an initial select into a staging table to get names beginning A-N, then M-Z).
2. You could also move some of the load out of tempdb by replacing the first cte with a query that creates a staging table in another database.
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
thanks you so much, the database is set to unlimited auto-grow and it is simple recover model. It has 80gb free space. I guess the only option is to break it down the data and see if that helps. Thanks again, i really appreciate all your time and efforts on this.
Hi Moe, have you tried breaking down your source data and discovered a batch size that will work? If so, is that a workable method or would it take more than 10 batches to process all your data?
Please let me know if you would like assistance to modify the query to create and then use tables in another database to lighten the load on the tempdb.
Hope you're well down the road to completing your task.
Please let me know if you would like assistance to modify the query to create and then use tables in another database to lighten the load on the tempdb.
Hope you're well down the road to completing your task.
ASKER
Thanks Simon again for all your help, i could not do this without your precious time. I was finally able to do this by following your suggestion by replacing the cte with a table and that did it. thanks again for your support.
ASKER
I've requested that this question be closed as follows:
Accepted answer: 0 points for moe57's comment #a40485588
for the following reason:
Simon was great helping and providing all kind of support and guidance, i really appreciate your support.
Accepted answer: 0 points for moe57's comment #a40485588
for the following reason:
Simon was great helping and providing all kind of support and guidance, i really appreciate your support.
That's excellent news. Now I can go to sleep happy :)
Just for the record, how may rows of live data were you processing?
Did you only have to replace the top CTE with a table?
Just for the record, how may rows of live data were you processing?
Did you only have to replace the top CTE with a table?
Close Request Pending
moe57 requested that this question be closed by accepting moe57's comment #a40485588 (0 points) as the solution for the following reason:
Simon was great helping and providing all kind of support and guidance, i really appreciate your support.
Moe, thanks for your kind words. Please accept one of my comments as the solution, or I'll get zero credit for my efforts on this.
ASKER
great solution. thanks
ASKER
The clinic group is showing up multiple times and the rule was only to show one record if patient visits again another clinic group more than one time. I remember you helped me similar issue about a week ago where clinic records where showing up multiple times for the same patient and you applied an extra parenthesis and that seemed to fix the issue. I have tried that but did not work for me and i am not sure if i am doing it wrong. It is weird that it works in the sample data but not production data. I am using sql server 2008 R2. Thanks
Hi Moe,
Can you be a bit more specific about which clinic visits are showing. Is it all clinic visits, just those 90+ days after an ER visit or subsequent clinic visits for patients whose first visit was to a clinic?
Can you post the actual SQL you are now using, which I understand to now be an initial query to write data to a staging table, and then the remaining part of the solution I posted yesterday.
Can you be a bit more specific about which clinic visits are showing. Is it all clinic visits, just those 90+ days after an ER visit or subsequent clinic visits for patients whose first visit was to a clinic?
Can you post the actual SQL you are now using, which I understand to now be an initial query to write data to a staging table, and then the remaining part of the solution I posted yesterday.
ASKER
Hi SimonAdept,
it seems the subsequent visits for the same patients. Here is my actual query after i split the CTE:
it seems the subsequent visits for the same patients. Here is my actual query after i split the CTE:
-- this is the temp table for the first CTE
INSERT INTO TEMP_TABLE (INIT_UQ_NBR, RANK_UQ_NBR, UMPI_ADMT_TS, UMPI_LOC_TYPE, UMPI_LOC_GRP)
SELECT DISTINCT UQ_NBR AS INIT_UQ_NBR,
ROW_NUMBER()
OVER ( PARTITION BY UQ_NBR ORDER BY UQ_NBR, ADMT_TS)AS RANK_UQ_NBR,
Cast(ADMT_TS AS SMALLDATETIME)AS UMPI_ADMT_TS,
--ADMT_TS AS UMPI_ADMT_TS,
LOC_TYPE AS UMPI_LOC_TYPE,
CASE
WHEN LOC_TYPE IN ( 'ER', 'OBS', 'ELECTIVE', 'EMERGENCY', 'INPATIENT', 'OBSERVATION', 'URGENT' ) THEN 'ER'
WHEN LOC_TYPE IN ( 'CLINIC', 'OP', 'OUTPATIENT' ) THEN 'CLINIC' END AS UMPI_LOC_GRP
FROM METRIC_TABLE WHERE UQ_NBR IS NOT NULL
AND UQ_NBR <> '0000000000'
-- ),
--***************************
;WITH CTEFILTER (INIT_UQ_NBR, RANK_UQ_NBR, LOCRANK, UMPI_ADMT_TS, PREVADMT_TS, USEADMT_TS,
UMPI_LOC_TYPE, UMPI_LOC_GRP)
AS (
SELECT A.INIT_UQ_NBR,
ROW_NUMBER() OVER ( PARTITION BY A.INIT_UQ_NBR ORDER BY A.INIT_UQ_NBR, A.UMPI_ADMT_TS),
ROW_NUMBER() OVER ( ORDER BY A.UMPI_LOC_GRP, A.INIT_UQ_NBR),
A.UMPI_ADMT_TS,
COALESCE(P.UMPI_ADMT_TS, NULL) ,
CASE
WHEN A.UMPI_LOC_GRP = 'ER'
AND A.UMPI_ADMT_TS < Dateadd(hh, 48, COALESCE(P.UMPI_ADMT_TS, NULL))THEN P.UMPI_ADMT_TS
WHEN A.UMPI_LOC_GRP = 'CLINIC' AND A.UMPI_ADMT_TS < Dateadd(d, 90, COALESCE(P.UMPI_ADMT_TS, NULL)) THEN P.UMPI_ADMT_TS ELSE A.UMPI_ADMT_TS END AS USEADMT_TS,
A.UMPI_LOC_TYPE,
A.UMPI_LOC_GRP
FROM TEMP_TABLE A
LEFT JOIN TEMP_TABLE P
ON A.INIT_UQ_NBR = P.INIT_UQ_NBR
AND A.RANK_UQ_NBR = P.RANK_UQ_NBR + 1
) ,
CTEFINAL (INIT_UQ_NBR, RANK_UQ_NBR, LOCRANK, UMPI_ADMT_TS, PREVADMT_TS, USEADMT_TS,
UMPI_LOC_TYPE, UMPI_LOC_GRP)
AS (SELECT A.INIT_UQ_NBR,
A.RANK_UQ_NBR,
A.LOCRANK,
A.UMPI_ADMT_TS,
A.PREVADMT_TS,
A.USEADMT_TS,
A.UMPI_LOC_TYPE,
A.UMPI_LOC_GRP
FROM CTEFILTER A
LEFT JOIN CTEFILTER P
ON A.INIT_UQ_NBR = P.INIT_UQ_NBR
AND A.RANK_UQ_NBR = P.RANK_UQ_NBR + 1
WHERE A.RANK_UQ_NBR = 1
OR ( A.UMPI_LOC_GRP = 'ER'
AND A.UMPI_ADMT_TS > Dateadd(hh, 48, P.USEADMT_TS) )
OR ( A.UMPI_LOC_GRP = 'CLINIC'
AND
Dateadd(d, 90, COALESCE(P.USEADMT_TS, P.UMPI_ADMT_TS)) <= A.UMPI_ADMT_TS
AND ( SELECT UMPI_LOC_GRP
FROM TEMP_TABLE
WHERE RANK_UQ_NBR = 1
AND INIT_UQ_NBR = A.INIT_UQ_NBR ) = 'ER' ))
SELECT INIT_UQ_NBR AS UQ_NBR,
UMPI_ADMT_TS AS ADMT_DT,
UMPI_LOC_TYPE AS LOC_TYPE
FROM CTEFINAL
Thanks for looking into it
ASKER
Simon,
I take that back, it is not the subsequent visit for the same patient but it is only just those 90+ days after an ER visit, those are showing up multiple times. thanks
I take that back, it is not the subsequent visit for the same patient but it is only just those 90+ days after an ER visit, those are showing up multiple times. thanks
Hi Moe, Sorry, I didn't get a notification of your most recent post for some reason, I just happened to visit this thread again.
Thanks for posting your final query. I see you had a lot of column name editing to do.
Are you saying that there is still a problem?
Is it a business rule that for patients whose first visit is to ER, you do want to output the first visit to a clinic location that occurs >90 days from the ER visit, but no further clinic visits, no matter how many days after the ER visit or the previous clinic visit?
I've added a few further records to the test table I'm using and only get output for CLINIC visits for patients whose first visit was ER where they are more than 90 days apart.
Thanks for posting your final query. I see you had a lot of column name editing to do.
Are you saying that there is still a problem?
Is it a business rule that for patients whose first visit is to ER, you do want to output the first visit to a clinic location that occurs >90 days from the ER visit, but no further clinic visits, no matter how many days after the ER visit or the previous clinic visit?
I've added a few further records to the test table I'm using and only get output for CLINIC visits for patients whose first visit was ER where they are more than 90 days apart.
ASKER
thanks for getting back to me. Yes i still have the duplicate issue for the clinic group. I do want to output the first visit to a clinic location that occurs >90 days from the ER visit, but no further clinic visits from last ER visit. We always look at the last output record and that tells us if we output or not. For instance
Mike visits ER for the first time on 1/3, output this
Mike visits Clinic after 90 days from his last visit on 3/15, output this
Mike visits Clinic again 4/20, do not output this
any subsequent visits to Clinic, do not output
Mike visits ER on 3/18, output this because it is greater than 48 from his last clinic visit which was 3/15
Mike visits again Clinic after 90 days on 7/20 , we output this because his last visit was at the ER on 3/18. I hope i clarified little bit more. thanks again
Mike visits ER for the first time on 1/3, output this
Mike visits Clinic after 90 days from his last visit on 3/15, output this
Mike visits Clinic again 4/20, do not output this
any subsequent visits to Clinic, do not output
Mike visits ER on 3/18, output this because it is greater than 48 from his last clinic visit which was 3/15
Mike visits again Clinic after 90 days on 7/20 , we output this because his last visit was at the ER on 3/18. I hope i clarified little bit more. thanks again
ASKER
please try this data, you will see Mike went to clinic twice, one time on sept and the other time on October. Please see more detail the above post in which i provided example.
CREATE TABLE Table1
([NAME] varchar(5), [LOCATION] varchar(6), [DATE] datetime, [NOTES] varchar(72))
;
INSERT INTO Table1
([NAME], [LOCATION], [DATE], [NOTES])
VALUES
('MIKE', 'ER', '2014-01-01 00:00:00', 'output it is first visit'),
('MIKE', 'ER', '2014-01-02 00:00:00', 'do not output, it is <48'),
('MIKE', 'OBS', '2014-01-03 10:00:00', 'output this because it is >48 from his last visit on 1/1/2014'),
('MIKE', 'IP', '2014-01-04 00:00:00', 'do not output because it is less than 48 from his last visit on 1/3/2014'),
('MIKE', 'IP', '2014-01-29 00:00:00', 'output this, it is >48 from his last visit on 1/3/2014'),
('MIKE', 'OBS', '2014-04-15 01:00:00', 'output this, it is >48 from his last visit on 1/29/2014'),
('MIKE', 'CLINIC', '2014-05-20 01:00:00', 'do not output, it is <90 from his last visit on 4/15/2014'),
('MIKE', 'OP', '2014-06-20 01:00:00', 'do not output, it is <90 from his last visit on 4/15/2014'),
('MIKE', 'CLINIC', '2014-09-15 01:00:00', 'output, it is >90 from his ER visit on 4/15/2014'),
('MIKE', 'CLINIC', '2014-10-15 01:00:00', 'do not output, it is subsequent visit'),
('DAVID', 'CLINIC', '2014-03-03 00:00:00', 'output it is first visit'),
('DAVID', 'CLINIC', '2014-03-04 00:00:00', 'do not output'),
('DAVID', 'CLINIC', '2014-07-05 01:00:00', 'do not output'),
('DAVID', 'ER', '2014-03-15 00:00:00', 'output this because it is >48 from his last clinic visit on 3/3/2014'),
('DAVID', 'CLINIC', '2014-03-18 00:00:00', 'do not output because it is < 90 from his ER visit on 3/15/2014'),
('DAVID', 'OBS', '2014-03-19 00:00:00', 'output, it is >48 from his last ER visit on 3/15/2014')
;
Moe, these are a strange set of rules for the query. They seem inconsistent ,because if the patient first visit was to a clinic, the rules say to omit ALL future clinic visits, no matter if they happen to be >90 days after an ER visit that WAS output because it was >48 hours after the clinic visit.
2) Clinic Group
a) Output if patient makes his/her first visit to any of Clinic group locations
b) Do not output any subsequent visits to any of the Clinic group locations for the same patient
c) Output if same patient makes a visit to any of the ER group locations after 48 hours from their last clinic visit (last visit = last output record)
What I would suggest is to use the principles already demonstrated to remove the duplicate / consecutive clinic visits.
i.e. save your current final result as a table, then in new query
re-rank the now reduced set of records for each person by date.
self-join that on the same method (personID=personID AND rank=Rank+1)
exclude records where currentlocationType=prevlo cationType and currentlocationType = CLINIC
I'm sure you can do this yourself, but if you want further help, please restate as a new question, ensuring that your set of business rules covers all possible cases and describes them all in a clear, non-ambiguous way, and that your sample data table covers all the test cases. The rules as stated at the top of this question DID NOT disallow multiple visits to a clinic type location that were >90 days after an ER visit and 90 days apart.
I'm concerned that each time the rules get refined, the solution built for the previous iterations becomes less and less appropriate. I know it can be difficult to be both precise and concise when defining the question, but it is key do obtaining a good and scalable solution. I don't want you to keep hitting the "tempdb full" situation when you've just spent hours translating the sample code to your live environment.
i.e. save your current final result as a table, then in new query
re-rank the now reduced set of records for each person by date.
self-join that on the same method (personID=personID AND rank=Rank+1)
exclude records where currentlocationType=prevlo
I'm sure you can do this yourself, but if you want further help, please restate as a new question, ensuring that your set of business rules covers all possible cases and describes them all in a clear, non-ambiguous way, and that your sample data table covers all the test cases. The rules as stated at the top of this question DID NOT disallow multiple visits to a clinic type location that were >90 days after an ER visit and 90 days apart.
1) ER Group
a) Output this record, if patient makes his/her first visit within 48 hours to any of the ER group locations
b) Output any subsequent visits to any of the ER locations after 48 from their last visit (last visits = last output record)
c) output if same patient makes a visit to any of the Clinic group locations after 90 days from their last visit (last visit = last output record for this patient)
I'm concerned that each time the rules get refined, the solution built for the previous iterations becomes less and less appropriate. I know it can be difficult to be both precise and concise when defining the question, but it is key do obtaining a good and scalable solution. I don't want you to keep hitting the "tempdb full" situation when you've just spent hours translating the sample code to your live environment.
ASKER
okay, will try to do that. thanks for your help.
Open in new window