how to find first record from dataset and group them by location using SQL

The trick for my challenge is to find where the patient was FIRST seen, meaning was the patient seen at the clinic first or at the emergency room, once we find out that then I need to group them and output the data. Here are the criteria and the logic that has to be used when outputting the final data:
here is the logic that needs to be followed.
1) ER first visit
   a) if patient's first visit is ER, then display it and all other subsequent visits to ER like 2nd, 3rd etc
   b) if same patient goes to clinic after 90 days then display it
2) Clinic first visit
  a) if patient first visit is clinic then display it
  b) if same patient goes to clinic again do not display it
  c) if same patient goes to ER after 48 hours then display it.

Example:
- Mike first visit goes to ER on 1/1/2014 :- output this data
 - Mike goes again to ER on 1/3/2014 :- output this data
-  Mike goes to Clinic on 1/20/2014 :- do not output this data
- Mike goes to Clinic on 5/2/2014 : output this data
- David first visit goes to Clinic on 3/3/2014 : output this data
-  David goes back again to Clinic on 3/4/2014 : do not output this data
-  David goes to ER on 3/15/2014 : output this data

Open in new window


here is how my data looks like :
NAME    LOCATION           DATE
MIKE             ER                  1/1/2014
MIKE             ER                  1/3/2014
MIKE         CLINIC              1/20/2014
MIKE         CLINIC              5/2/2014
DAVID       CLINIC             3/3/2014
DAVID        CLINIC            3/4/2014
DAVID           ER                 3/15/2014

Open in new window


here is the desired results after the query is run:
NAME      LOCATION         DATE
    MIKE           ER               1/1/2014
    MIKE           ER               1/3/2014       
    MIKE      CLINIC             5/2/2014
    DAVID     CLINIC            3/3/2014
    DAVID      ER                   3/15/2014

Open in new window


here is the query i have started but not finished it
SET NOCOUNT ON
GO

    DECLARE @Init_DataSource TABLE
    (
         [NAME] VARCHAR(20)
        ,[DATE] DATE
        ,[LOCATION] VARCHAR(15)
    )

    INSERT INTO @Init_DataSource ([NAME], [DATE], [LOCATION])
    VALUES ('MIKE', '1-1-2014', 'ER')
          ,('MIKE', '1-3-2014', 'ER')
          ,('MIKE', '1-20-2014', 'CLINIC')
          ,('MIKE', '5-2-2014', 'CLINIC')
          ,('DAVID', '3-3-2014', 'CLINIC')
          ,('DAVID', '3-4-2014', 'CLINIC')
          ,('DAVID', '3-15-2014', 'ER')

SELECT * 
FROM @Init_DataSource
ORDER BY 1 DESC, 2 ASC

Open in new window

moe57Asked:
Who is Participating?
I wear a lot of hats...

"The solutions and answers provided on Experts Exchange have been extremely helpful to me over the last few years. I wear a lot of hats - Developer, Database Administrator, Help Desk, etc., so I know a lot of things but not a lot about one thing. Experts Exchange gives me answers from people who do know a lot about one thing, in a easy to use platform." -Todd S.

SimonCommented:
Hi, I've done this on SQL Fiddle:
http://sqlfiddle.com/#!3/200bb/32

Schema:
   create TABLE DataSource 
    (
         [NAME] VARCHAR(20)
        ,[DATE] DATE
        ,[LOCATION] VARCHAR(15)
    )

    INSERT INTO DataSource ([NAME], [DATE], [LOCATION])
    VALUES ('MIKE', '1-1-2014', 'ER')
          ,('MIKE', '1-3-2014', 'ER')
          ,('MIKE', '1-20-2014', 'CLINIC')
          ,('MIKE', '5-2-2014', 'CLINIC')
          ,('DAVID', '3-3-2014', 'CLINIC')
          ,('DAVID', '3-4-2014', 'CLINIC')
          ,('DAVID', '3-15-2014', 'ER')

Open in new window


Query:
with cte1 (personID,ItemRank,ItemDate,ItemLocation) 
as
(select 
name
 ,ROW_NUMBER()  OVER (partition by name ORDER BY name,date,location) AS "Row Number"
,date,location
from datasource
)
select itemrank,datasource.* 
from cte1 inner join datasource 
on cte1.personID=datasource.name AND cte1.itemdate=datasource.date

where ItemRank=1
--first visit was ER
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='ER'
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='CLINIC'
  AND dateadd(d, -90,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
--first visit was CLINIC
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'CLINIC' 
  AND dateadd(d, -2,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
  AND location = 'ER')

Open in new window


At present, this will only work if properly if the patient has only one event per day, as the date forms part of the join condition.
The data schema could be improved by addition of a unique ID to each patient event.
0
moe57Author Commented:
thank you so much for your help, so the only thing i need to change now this line
--first visit was ER
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='ER'
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='CLINIC'
  AND dateadd(d, -90,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
--first visit was CLINIC
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'CLINIC' 
  AND dateadd(d, -2,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
  AND location = 'ER')

Open in new window

because i was told to use Netezza database so i need to convert these codes to some  kind of postgresql or some syntax that supports netezza.  It seems to work find in SQL but not in Netezza.  thanks
0
SimonCommented:
Hi I don't know netezza. Could you first try taking the comment lines out of the where clause. I would never normally put comments anywhere inside a select statement, and was mildly surprised that they were accepted.
Your question was tagged as SQL Server 2008. Which version of postgreSQL does this need to work on? Are you using "PureData System for Analytics 7.0.3" or "InfoSphere BigInsights 2.1.2" or "PureData System for Hadoop 1.0.0" or something else?
0
Ultimate Tool Kit for Technology Solution Provider

Broken down into practical pointers and step-by-step instructions, the IT Service Excellence Tool Kit delivers expert advice for technology solution providers. Get your free copy now.

moe57Author Commented:
okay i have decided to stick with SQL server and foret about the Postgresql or Netezza.  There is one problem with this solution, for the clinic visit, each patient should only show up one time and this is only true for first time clinic visits.  For example, patient A went to clinic twice first time on 5/1/2014 and second time on 5/3/2014 so in this case we only need to show the first visit for this patient (patient A).  Any subsequent visits to clinic for the same patient should be deleted.
i have added one more record for David just to test the result and David showed up twice for clinic visits.  here is the data i have used:
  create  TABLE dbo.DataSource 
    (
         [NAME] VARCHAR(20)
        ,[DATE] DATE
        ,[LOCATION] VARCHAR(15)
    )

    INSERT INTO DataSource ([NAME], [DATE], [LOCATION])
    VALUES ('MIKE', '1-1-2014', 'ER')
          ,('MIKE', '1-3-2014', 'ER')
          ,('MIKE', '1-20-2014', 'CLINIC')
          ,('MIKE', '5-2-2014', 'CLINIC')
          ,('DAVID', '3-3-2014', 'CLINIC')
          ,('DAVID', '3-4-2014', 'CLINIC')
          ,('DAVID', '3-7-2014', 'CLINIC')
          ,('DAVID', '3-15-2014', 'ER')

Open in new window

thanks
0
SimonCommented:
Hi, I am only seeing one clinic visit for each patient, even with your revised dataset:
ITEMRANK 	NAME 	DATE 	LOCATION
1 	DAVID 	2014-03-03 	CLINIC
4 	DAVID 	2014-03-15 	ER
1 	MIKE 	2014-01-01 	ER
2 	MIKE 	2014-01-03 	ER
4 	MIKE 	2014-05-02 	CLINIC

Open in new window


See http://sqlfiddle.com/#!3/0485f/2

I can't replicate your result on sqlfiddle, and don't have access to a physical SQL Server to test on today. Which version of SQL Server have you tested on and obtained the incorrect result?

To be clear on the logic, your initial description does not necessarily exclude more than one clinic visit being shown for a patient IF they were first seen in ER, and then visit the clinic several times after 90 days have elapsed.

So, if I add another record for Mike ('MIKE', '5-4-2014', 'CLINIC'), that will also show in the results.
0
moe57Author Commented:
it is weird, i am seeing David went to Clinic twice, one is on 3/3/2014 and the other one is 3/7/2014.  I am using SQL Server 2008 R2.  If the patient first visit was at the clinic then any subsequent visits should be excluded but it is okay if patient first visit was at the ER then goes to the clinic several times later
0
SimonCommented:
Weird. I'll double-check on my 2008R2 server tomorrow. In the meantime, would you mind running this version (with extra parentheses)
with cte1 (personID,ItemRank,ItemDate,ItemLocation) 
as
(select 
name
 ,ROW_NUMBER()  OVER (partition by name ORDER BY name,date,location) AS "Row Number"
,date,location
from datasource
)
select itemrank,datasource.* 
from cte1 inner join datasource 
on cte1.personID=datasource.name AND cte1.itemdate=datasource.date

where ItemRank=1
--first visit was ER
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='ER')
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='CLINIC'
  AND dateadd(d, -90,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name))
--first visit was CLINIC
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'CLINIC' 
  AND (dateadd(d, -2,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name))
  AND location = 'ER')
ORDER By name,itemrank

Open in new window


If you can post your results showing the extra CLINIC line that would be helpful.
0

Experts Exchange Solution brought to you by

Your issues matter to us.

Facing a tech roadblock? Get the help and guidance you need from experienced professionals who care. Ask your question anytime, anywhere, with no hassle.

Start your 7-day free trial
moe57Author Commented:
actually this has fixed the problem.  Thank you so much for your help.
0
SimonCommented:
Thanks for confirming. Are you happy that the above is the solution?
0
moe57Author Commented:
thanks for your follow up, i did not close the question because i was having an issue when i run the query in production and the data is huge so it was failing on me and getting this error:
in my real data, i have other locations other than ER, something like this:
OR ((SELECT ITEMLOC_TYPE FROM CTE1 WHERE ITEMRANK = 1 AND PERSONID=PI_NBR) IN ('ER', 'OBS', 'ELEC','EXY','INP','OBS', 'URG') AND LOC_TYPE IN ('ER', 'OBS', 'ELEC','EXY','INP','OBS', 'URG'))
OR ((SELECT ITEMLOC_TYPE FROM CTE1 WHERE ITEMRANK = 1 AND PERSONID=PI_NBR) IN ('ER', 'OBS', 'ELEC','EXY','INP','OBS', 'URG') AND LOC_TYPE IN ('CLINIC', 'OP', 'OUTP')

Open in new window

so is there a way to overcome this issue or adjust the query like using subquery so i can overcome this issue?  thanks
0
moe57Author Commented:
the query runs fine if only highlight until here:
WHERE ITEMRANK=1
but as soon as i add the OR operation then it will run forever and eventually fails.  Thanks
0
moe57Author Commented:
Simon, can you help please?  thanks
0
SimonCommented:
Hi Moe, please post your complete modified select statement that includes the other location types. If server resources do not permit it may be necessary to approach this completely differently, using temporary tables. Your greatly expanded list of location types makes the task considerably more complex. I'd really prefer that you close off this question and restate your scenario more fully in a new one, using this one as a starting point.
0
moe57Author Commented:
okay will do that now.  I will close this one now and i already created another post in which i have incorporated some littel changes in the logic.  Here is the link for my new post.  thank you so much for your help.
http://www.experts-exchange.com/Database/MS-SQL-Server/Q_28573340.html
0
moe57Author Commented:
I've requested that this question be closed as follows:

Accepted answer: 0 points for moe57's comment #a40476067

for the following reason:

thanks for your help,
0
SimonCommented:
Hi Moe,

When I said 'close off this question' I meant that the answer I'd given was a working solution for the question as stated. I'll look at your new question, but I think it would be fair to give credit for the effort already made.
0
SimonCommented:
This comment is the one that I consider to be the solution.
0
moe57Author Commented:
this was the correct answer but new requirements for different logic.  Thanks
0
It's more than this solution.Get answers and train to solve all your tech problems - anytime, anywhere.Try it for free Edge Out The Competitionfor your dream job with proven skills and certifications.Get started today Stand Outas the employee with proven skills.Start learning today for free Move Your Career Forwardwith certification training in the latest technologies.Start your trial today
Microsoft SQL Server

From novice to tech pro — start learning today.

Question has a verified solution.

Are you are experiencing a similar issue? Get a personalized answer when you ask a related question.

Have a better answer? Share it in a comment.