Solved

how to find first record from dataset and group them by location using SQL

Posted on 2014-11-29
19
131 Views
Last Modified: 2014-12-04
The trick for my challenge is to find where the patient was FIRST seen, meaning was the patient seen at the clinic first or at the emergency room, once we find out that then I need to group them and output the data. Here are the criteria and the logic that has to be used when outputting the final data:
here is the logic that needs to be followed.
1) ER first visit
   a) if patient's first visit is ER, then display it and all other subsequent visits to ER like 2nd, 3rd etc
   b) if same patient goes to clinic after 90 days then display it
2) Clinic first visit
  a) if patient first visit is clinic then display it
  b) if same patient goes to clinic again do not display it
  c) if same patient goes to ER after 48 hours then display it.

Example:
- Mike first visit goes to ER on 1/1/2014 :- output this data
 - Mike goes again to ER on 1/3/2014 :- output this data
-  Mike goes to Clinic on 1/20/2014 :- do not output this data
- Mike goes to Clinic on 5/2/2014 : output this data
- David first visit goes to Clinic on 3/3/2014 : output this data
-  David goes back again to Clinic on 3/4/2014 : do not output this data
-  David goes to ER on 3/15/2014 : output this data

Open in new window


here is how my data looks like :
NAME    LOCATION           DATE
MIKE             ER                  1/1/2014
MIKE             ER                  1/3/2014
MIKE         CLINIC              1/20/2014
MIKE         CLINIC              5/2/2014
DAVID       CLINIC             3/3/2014
DAVID        CLINIC            3/4/2014
DAVID           ER                 3/15/2014

Open in new window


here is the desired results after the query is run:
NAME      LOCATION         DATE
    MIKE           ER               1/1/2014
    MIKE           ER               1/3/2014       
    MIKE      CLINIC             5/2/2014
    DAVID     CLINIC            3/3/2014
    DAVID      ER                   3/15/2014

Open in new window


here is the query i have started but not finished it
SET NOCOUNT ON
GO

    DECLARE @Init_DataSource TABLE
    (
         [NAME] VARCHAR(20)
        ,[DATE] DATE
        ,[LOCATION] VARCHAR(15)
    )

    INSERT INTO @Init_DataSource ([NAME], [DATE], [LOCATION])
    VALUES ('MIKE', '1-1-2014', 'ER')
          ,('MIKE', '1-3-2014', 'ER')
          ,('MIKE', '1-20-2014', 'CLINIC')
          ,('MIKE', '5-2-2014', 'CLINIC')
          ,('DAVID', '3-3-2014', 'CLINIC')
          ,('DAVID', '3-4-2014', 'CLINIC')
          ,('DAVID', '3-15-2014', 'ER')

SELECT * 
FROM @Init_DataSource
ORDER BY 1 DESC, 2 ASC

Open in new window

0
Comment
Question by:moe57
  • 10
  • 8
19 Comments
 
LVL 18

Expert Comment

by:SimonAdept
Comment Utility
Hi, I've done this on SQL Fiddle:
http://sqlfiddle.com/#!3/200bb/32

Schema:
   create TABLE DataSource 
    (
         [NAME] VARCHAR(20)
        ,[DATE] DATE
        ,[LOCATION] VARCHAR(15)
    )

    INSERT INTO DataSource ([NAME], [DATE], [LOCATION])
    VALUES ('MIKE', '1-1-2014', 'ER')
          ,('MIKE', '1-3-2014', 'ER')
          ,('MIKE', '1-20-2014', 'CLINIC')
          ,('MIKE', '5-2-2014', 'CLINIC')
          ,('DAVID', '3-3-2014', 'CLINIC')
          ,('DAVID', '3-4-2014', 'CLINIC')
          ,('DAVID', '3-15-2014', 'ER')

Open in new window


Query:
with cte1 (personID,ItemRank,ItemDate,ItemLocation) 
as
(select 
name
 ,ROW_NUMBER()  OVER (partition by name ORDER BY name,date,location) AS "Row Number"
,date,location
from datasource
)
select itemrank,datasource.* 
from cte1 inner join datasource 
on cte1.personID=datasource.name AND cte1.itemdate=datasource.date

where ItemRank=1
--first visit was ER
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='ER'
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='CLINIC'
  AND dateadd(d, -90,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
--first visit was CLINIC
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'CLINIC' 
  AND dateadd(d, -2,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
  AND location = 'ER')

Open in new window


At present, this will only work if properly if the patient has only one event per day, as the date forms part of the join condition.
The data schema could be improved by addition of a unique ID to each patient event.
0
 

Author Comment

by:moe57
Comment Utility
thank you so much for your help, so the only thing i need to change now this line
--first visit was ER
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='ER'
or (select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='CLINIC'
  AND dateadd(d, -90,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
--first visit was CLINIC
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'CLINIC' 
  AND dateadd(d, -2,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name)
  AND location = 'ER')

Open in new window

because i was told to use Netezza database so i need to convert these codes to some  kind of postgresql or some syntax that supports netezza.  It seems to work find in SQL but not in Netezza.  thanks
0
 
LVL 18

Expert Comment

by:SimonAdept
Comment Utility
Hi I don't know netezza. Could you first try taking the comment lines out of the where clause. I would never normally put comments anywhere inside a select statement, and was mildly surprised that they were accepted.
Your question was tagged as SQL Server 2008. Which version of postgreSQL does this need to work on? Are you using "PureData System for Analytics 7.0.3" or "InfoSphere BigInsights 2.1.2" or "PureData System for Hadoop 1.0.0" or something else?
0
 

Author Comment

by:moe57
Comment Utility
okay i have decided to stick with SQL server and foret about the Postgresql or Netezza.  There is one problem with this solution, for the clinic visit, each patient should only show up one time and this is only true for first time clinic visits.  For example, patient A went to clinic twice first time on 5/1/2014 and second time on 5/3/2014 so in this case we only need to show the first visit for this patient (patient A).  Any subsequent visits to clinic for the same patient should be deleted.
i have added one more record for David just to test the result and David showed up twice for clinic visits.  here is the data i have used:
  create  TABLE dbo.DataSource 
    (
         [NAME] VARCHAR(20)
        ,[DATE] DATE
        ,[LOCATION] VARCHAR(15)
    )

    INSERT INTO DataSource ([NAME], [DATE], [LOCATION])
    VALUES ('MIKE', '1-1-2014', 'ER')
          ,('MIKE', '1-3-2014', 'ER')
          ,('MIKE', '1-20-2014', 'CLINIC')
          ,('MIKE', '5-2-2014', 'CLINIC')
          ,('DAVID', '3-3-2014', 'CLINIC')
          ,('DAVID', '3-4-2014', 'CLINIC')
          ,('DAVID', '3-7-2014', 'CLINIC')
          ,('DAVID', '3-15-2014', 'ER')

Open in new window

thanks
0
 
LVL 18

Expert Comment

by:SimonAdept
Comment Utility
Hi, I am only seeing one clinic visit for each patient, even with your revised dataset:
ITEMRANK 	NAME 	DATE 	LOCATION
1 	DAVID 	2014-03-03 	CLINIC
4 	DAVID 	2014-03-15 	ER
1 	MIKE 	2014-01-01 	ER
2 	MIKE 	2014-01-03 	ER
4 	MIKE 	2014-05-02 	CLINIC

Open in new window


See http://sqlfiddle.com/#!3/0485f/2

I can't replicate your result on sqlfiddle, and don't have access to a physical SQL Server to test on today. Which version of SQL Server have you tested on and obtained the incorrect result?

To be clear on the logic, your initial description does not necessarily exclude more than one clinic visit being shown for a patient IF they were first seen in ER, and then visit the clinic several times after 90 days have elapsed.

So, if I add another record for Mike ('MIKE', '5-4-2014', 'CLINIC'), that will also show in the results.
0
 

Author Comment

by:moe57
Comment Utility
it is weird, i am seeing David went to Clinic twice, one is on 3/3/2014 and the other one is 3/7/2014.  I am using SQL Server 2008 R2.  If the patient first visit was at the clinic then any subsequent visits should be excluded but it is okay if patient first visit was at the ER then goes to the clinic several times later
0
 
LVL 18

Accepted Solution

by:
SimonAdept earned 500 total points
Comment Utility
Weird. I'll double-check on my 2008R2 server tomorrow. In the meantime, would you mind running this version (with extra parentheses)
with cte1 (personID,ItemRank,ItemDate,ItemLocation) 
as
(select 
name
 ,ROW_NUMBER()  OVER (partition by name ORDER BY name,date,location) AS "Row Number"
,date,location
from datasource
)
select itemrank,datasource.* 
from cte1 inner join datasource 
on cte1.personID=datasource.name AND cte1.itemdate=datasource.date

where ItemRank=1
--first visit was ER
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='ER')
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'ER' AND location ='CLINIC'
  AND dateadd(d, -90,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name))
--first visit was CLINIC
or ((select itemlocation from cte1 where itemrank=1 and personID=name)= 'CLINIC' 
  AND (dateadd(d, -2,itemdate) >(select itemdate from cte1 where itemrank=1 and personID=name))
  AND location = 'ER')
ORDER By name,itemrank

Open in new window


If you can post your results showing the extra CLINIC line that would be helpful.
0
 

Author Comment

by:moe57
Comment Utility
actually this has fixed the problem.  Thank you so much for your help.
0
 
LVL 18

Expert Comment

by:SimonAdept
Comment Utility
Thanks for confirming. Are you happy that the above is the solution?
0
Highfive Gives IT Their Time Back

Highfive is so simple that setting up every meeting room takes just minutes and every employee will be able to start or join a call from any room with ease. Never be called into a meeting just to get it started again. This is how video conferencing should work!

 

Author Comment

by:moe57
Comment Utility
thanks for your follow up, i did not close the question because i was having an issue when i run the query in production and the data is huge so it was failing on me and getting this error:
in my real data, i have other locations other than ER, something like this:
OR ((SELECT ITEMLOC_TYPE FROM CTE1 WHERE ITEMRANK = 1 AND PERSONID=PI_NBR) IN ('ER', 'OBS', 'ELEC','EXY','INP','OBS', 'URG') AND LOC_TYPE IN ('ER', 'OBS', 'ELEC','EXY','INP','OBS', 'URG'))
OR ((SELECT ITEMLOC_TYPE FROM CTE1 WHERE ITEMRANK = 1 AND PERSONID=PI_NBR) IN ('ER', 'OBS', 'ELEC','EXY','INP','OBS', 'URG') AND LOC_TYPE IN ('CLINIC', 'OP', 'OUTP')

Open in new window

so is there a way to overcome this issue or adjust the query like using subquery so i can overcome this issue?  thanks
0
 

Author Comment

by:moe57
Comment Utility
the query runs fine if only highlight until here:
WHERE ITEMRANK=1
but as soon as i add the OR operation then it will run forever and eventually fails.  Thanks
0
 

Author Comment

by:moe57
Comment Utility
Simon, can you help please?  thanks
0
 
LVL 18

Expert Comment

by:SimonAdept
Comment Utility
Hi Moe, please post your complete modified select statement that includes the other location types. If server resources do not permit it may be necessary to approach this completely differently, using temporary tables. Your greatly expanded list of location types makes the task considerably more complex. I'd really prefer that you close off this question and restate your scenario more fully in a new one, using this one as a starting point.
0
 

Author Comment

by:moe57
Comment Utility
okay will do that now.  I will close this one now and i already created another post in which i have incorporated some littel changes in the logic.  Here is the link for my new post.  thank you so much for your help.
http://www.experts-exchange.com/Database/MS-SQL-Server/Q_28573340.html
0
 

Author Comment

by:moe57
Comment Utility
I've requested that this question be closed as follows:

Accepted answer: 0 points for moe57's comment #a40476067

for the following reason:

thanks for your help,
0
 
LVL 18

Expert Comment

by:SimonAdept
Comment Utility
Hi Moe,

When I said 'close off this question' I meant that the answer I'd given was a working solution for the question as stated. I'll look at your new question, but I think it would be fair to give credit for the effort already made.
0
 
LVL 18

Expert Comment

by:SimonAdept
Comment Utility
This comment is the one that I consider to be the solution.
0
 

Author Closing Comment

by:moe57
Comment Utility
this was the correct answer but new requirements for different logic.  Thanks
0

Featured Post

PRTG Network Monitor: Intuitive Network Monitoring

Network Monitoring is essential to ensure that computer systems and network devices are running. Use PRTG to monitor LANs, servers, websites, applications and devices, bandwidth, virtual environments, remote systems, IoT, and many more. PRTG is easy to set up & use.

Join & Write a Comment

Occasionally there is a need to clean table columns, especially if you have inherited legacy data. There are obviously many ways to accomplish that, including elaborate UPDATE queries with anywhere from one to numerous REPLACE functions (even within…
Ever wondered why sometimes your SQL Server is slow or unresponsive with connections spiking up but by the time you go in, all is well? The following article will show you how to install and configure a SQL job that will send you email alerts includ…
Using examples as well as descriptions, and references to Books Online, show the different Recovery Models available in SQL Server and explain, as well as show how full, differential and transaction log backups are performed
Via a live example, show how to backup a database, simulate a failure backup the tail of the database transaction log and perform the restore.

744 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question

Need Help in Real-Time?

Connect with top rated Experts

16 Experts available now in Live!

Get 1:1 Help Now