Solved

Query syntax

Posted on 2013-06-03
17
193 Views
Last Modified: 2013-06-07
Hello,

I am looking for a few pointers rather than a complete solution.

I have a table (well indexed) that contains 100000+ rows that lists PatientId, TreatmentDate, and a number of columns that list the patients treatment.


CREATE TABLE [dbo].[PatientTreatments](
[id] [int] IDENTITY(1,1) NOT NULL,
[PatientId] [nvarchar](12) NOT NULL,
[TreatmentDate] [date] NULL,
[Treatment_None] [int] NULL,
[Treatment_PDT] [int] NULL,
[Treatment_Laser] [int] NULL,
[Treatment_Lucentis] [int] NULL,
[Treatment_Macugen] [int] NULL,
[Treatment_Avastin] [int] NULL,
[Treatment_Triamcinolon] [int] NULL,
[Treatment_AnecortAveac] [int] NULL,
[Treatment_Annan] [int] NULL,
[TreatmentType] [int] NULL
) ON [PRIMARY]

Open in new window


I need to count the number of treatments that patients have had over a certain period.

I guess that it is a simple group by TreatmentType however there is one other piece of logic that needs to be addressed.

There are 8 different treatment types each with its own column in the database.  I am only interested in patients that have had the same treatment type in all thier visits.

So if a patient has changed treatment type his/her treatments should be excluded from the totals.

How would the experts recommend doing this?

Maybe there is a simple statement but i was wondering if i should create a temporary table with a "total value" for treament.  For example the combination of the treatment types could be seen as a binary value.  If i create a table with that value then it would be easy to compare records against each other.  

Then maybe a not exists in sub query could look for patients who have changed treatment.
0
Comment
Question by:soozh
  • 6
  • 5
  • 3
  • +2
17 Comments
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39215962
I think counts would satisfy this requirement. A count(distinct <<whatever>>) = 1 would mean no change has occurred within the period and think it should be ok in case any of those fields are null e.g.
select
  patientid
, count(treatmentdate) count_treatments
, count(distinct treatment_pdt) count_pdt
, count(distinct treatment_laser) count_laser
, count(distinct treatment_lucentis) count_lucentis
, count(distinct treatment_macugen) count_macugen
, count(distinct treatment_avastin) count_avastin
, count(distinct treatment_triamcinolon) count_triamcinolon
, count(distinct treatment_anecortaveac) count_anecortaveac
, count(distinct treatment_annan) count_annan
, count(distinct treatmenttype) count_type
from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
group by
  patientid
having
   count(distinct treatment_pdt) = 1
or count(distinct treatment_laser) = 1
or count(distinct treatment_lucentis) = 1
or count(distinct treatment_macugen) = 1
or count(distinct treatment_avastin) = 1
or count(distinct treatment_triamcinolon) = 1
or count(distinct treatment_anecortaveac) = 1
or count(distinct treatment_annan) = 1
;

select
*
from PatientTreatments as pt
inner join (
            select
              patientid
            , count(treatmentdate) tcount
            , count(distinct treatment_pdt) count_pdt
            , count(distinct treatment_laser) count_laser
            , count(distinct treatment_lucentis) count_lucentis
            , count(distinct treatment_macugen) count_macugen
            , count(distinct treatment_avastin) count_avastin
            , count(distinct treatment_triamcinolon) count_triamcinolon
            , count(distinct treatment_anecortaveac) count_anecortaveac
            , count(distinct treatment_annan) count_annan
            , count(distinct treatmenttype) count_type
            from PatientTreatments
            where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
            group by
              patientid
            ) as tcount
      on pt.patientid = tcount.patientid
where count_pdt=1

Open in new window

see: http://sqlfiddle.com/#!3/6da38/1

perhaps try this on some real data?
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39215987
can a patient receive multiple types of treatment during a single visit?

I'm not sure what the [TreatmentType] field is for, it appears that you have separate fields for each of the treatment types, so why do you need this field?

I see that those other treatment types columns are integers, what are the values associated with the fields?  If those values are 0/1, then I was thinking something that looks more like:

select
  patientid
, count(treatmentdate) count_treatments
, SUM(treatment_pdt) count_pdt
, SUM(treatment_laser) count_laser
, SUM(treatment_lucentis) count_lucentis
, SUM(treatment_macugen) count_macugen
, SUM(treatment_avastin) count_avastin
, SUM(treatment_triamcinolon) count_triamcinolon
, SUM(treatment_anecortaveac) count_anecortaveac
, SUM(treatment_annan) count_annan

from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
group by
  patientid
having
SUM(treatment_pdt) count_pdt = count(treatmentdate)
OR SUM(treatment_laser) = count(treatmentdate)
OR SUM(treatment_lucentis) = count(treatmentdate)
OR SUM(treatment_macugen) = count(treatmentdate)
OR SUM(treatment_avastin) = count(treatmentdate)
OR SUM(treatment_triamcinolon) = count(treatmentdate)
OR SUM(treatment_anecortaveac) = count(treatmentdate)
OR SUM(treatment_annan) = count(treatmentdate)
0
 

Author Comment

by:soozh
ID: 39216023
sorry "TreatmentType" should be "treatment type".  Its not a database field.

And yes each treatment has its own column.

There can be muliple treatments.

They columsn contain 1 or 0 to indicated treatment or no treatment.
0
How Do You Stack Up Against Your Peers?

With today’s modern enterprise so dependent on digital infrastructures, the impact of major incidents has increased dramatically. Grab the report now to gain insight into how your organization ranks against your peers and learn best-in-class strategies to resolve incidents.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 39216049
>>They columns contain 1 or 0 to indicated treatment or no treatment.
ah
I had assumed they were foreign keys...
hence distinct has no great significance, and as fyed indicates, changing those counts to sums would work instead
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39216090
How would the experts recommend doing this?
An expert would probably not design the table where a treatment is hard coded into separate columns.  That is a recipe for disaster, as you have discovered.
0
 

Author Comment

by:soozh
ID: 39216243
a real expert would realise that "hindsight is an exact science..." and leave it at that....
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39216257
SELECT
    PatientId,
    SUM(Treatment_PDT) AS Treatment_PDT,
    SUM(Treatment_Laser) AS Treatment_Laser,
    SUM(Treatment_Lucentis) AS Treatment_Lucentis,
    SUM(Treatment_Macugen) AS Treatment_Macugen,
    SUM(Treatment_Avastin) AS Treatment_Avastin,
    SUM(Treatment_Triamcinolon) AS Treatment_Triamcinolon,
    SUM(Treatment_AnecortAveac) AS Treatment_AnecortAveac,
    SUM(Treatment_Annan) AS Treatment_Annan
FROM dbo.PatientTreatments
WHERE
    --for the past ~12 months only
    TreatmentDate >= '20120604' AND
    TreatmentDate < '20130604'    
GROUP BY
    PatientId
HAVING
    SUM(Treatment_PDT) = COUNT(*) OR
    SUM(Treatment_Laser) = COUNT(*) OR
    SUM(Treatment_Lucentis) = COUNT(*) OR
    SUM(Treatment_Macugen) = COUNT(*) OR
    SUM(Treatment_Avastin) = COUNT(*) OR
    SUM(Treatment_Triamcinolon) = COUNT(*) OR
    SUM(Treatment_AnecortAveac) = COUNT(*) OR
    SUM(Treatment_Annan) = COUNT(*)
ORDER BY
    PatientId
0
 

Author Comment

by:soozh
ID: 39217074
Thanks for the solutions.  However we are missing something.

The example above from ScottPletcher returns rows when a patient has multiple treatments at the same visit.

I have mentioned that a patient can have multiple treatments at any one visit, but also they can also have a visit without any treatment.  In this case they are still included in the result because they have not changed treatment.

What i am looking for is a count of the number of treatments a patient has had when they have not changed treatment.  The next step will be to look for patients that have only had a set of treatments.

Maybe this can not be done in a single statement?  Or maybe some dynamic sql?
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39217125
So, If I understand you correctly, if a patient has multiple treatments in the same visit, and one of those treatment types was given in each of the other treatments during a specified period, then that person should be returned in the result set.

But if they have a visit with no treatments, what is the point of the visit?  Try the following, I've added a criteria to the WHERE clause to exclude those records where no treatment was performed.  This would, however eliminate those patients who only had one visit during the desired period and where no treatment occurred.

select
  patientid
, count(treatmentdate) count_treatments
, SUM(treatment_pdt) count_pdt
, SUM(treatment_laser) count_laser
, SUM(treatment_lucentis) count_lucentis
, SUM(treatment_macugen) count_macugen
, SUM(treatment_avastin) count_avastin
, SUM(treatment_triamcinolon) count_triamcinolon
, SUM(treatment_anecortaveac) count_anecortaveac
, SUM(treatment_annan) count_annan

from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
AND ([Treatment_Laser] + [Treatment_lucentis] + [Treatment_macugen] + [Treatment_avastin] + [Treatment_triamcinolon] + [treatment_anecortaveac] + [treatment_annan]) = 0
group by
  patientid
having
SUM(treatment_pdt) count_pdt = count(treatmentdate)
OR SUM(treatment_laser) = count(treatmentdate)
OR SUM(treatment_lucentis) = count(treatmentdate)
OR SUM(treatment_macugen) = count(treatmentdate)
OR SUM(treatment_avastin) = count(treatmentdate)
OR SUM(treatment_triamcinolon) = count(treatmentdate)
OR SUM(treatment_anecortaveac) = count(treatmentdate)
OR SUM(treatment_annan) = count(treatmentdate)
0
 

Author Comment

by:soozh
ID: 39217214
Thanks for the the response.

So, If I understand you correctly, if a patient has multiple treatments in the same visit, and one of those treatment types was given in each of the other treatments during a specified period, then that person should be returned in the result set.


No this is not correct.  I am looking for patients that have only had the same type of treatment in all the visits where they have been treated.  So if they have had PDT and only PDT then we should include them...but if the had some other treatment at any time (either on a visit with PDT, or on a visit with just the new treatment they should be excluded.

But if they have a visit with no treatments, what is the point of the visit?  


We dont know if we are going to give the patient any treatment until they have been examined at the visit.  If they are lucky we can tell them to go home.  However the doctor may decide to treat them - hence the visits without treatment.

We follow the patient even when they are not being treated - in this way we know the treatment has been and still is successfull.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39217361
I thought you might have meant that, but wasn't 100% sure from the q.

I think we can just change the ORs in the HAVING to ANDs (one reason I wrote the query specifically that way):


HAVING
    SUM(Treatment_PDT) = COUNT(*) AND
    SUM(Treatment_Laser) = COUNT(*) AND
    SUM(Treatment_Lucentis) = COUNT(*) AND
    SUM(Treatment_Macugen) = COUNT(*) AND
    SUM(Treatment_Avastin) = COUNT(*) AND
    SUM(Treatment_Triamcinolon) = COUNT(*) AND
    SUM(Treatment_AnecortAveac) = COUNT(*) AND
    SUM(Treatment_Annan) = COUNT(*)
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217824
all those AND conditions will only return a match if all treatments are provided in each visit

i.e. if the SUM(of any treatment) = 0  then that patient gets excluded
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217910
in the following: use of a CTE is purely to simplify the required filtering (i.e. the same could be achieved inside a having clause but it's just ugly that way).

please see it in action here: http://sqlfiddle.com/#!3/a961f/1 
;WITH CTE1 AS (
                SELECT
                  patientid
                             /* 1.0 is to provide decimal results later */
                , 1.0 * count(CASE WHEN (treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) > 0 THEN 1 ELSE NULL END) count_treatments
                , sum(treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) sum_treatments
                , sum(treatment_pdt) sum_pdt
                , sum(treatment_laser) sum_laser
                , sum(treatment_lucentis) sum_lucentis
                , sum(treatment_macugen) sum_macugen
                , sum(treatment_avastin) sum_avastin
                , sum(treatment_triamcinolon) sum_triamcinolon
                , sum(treatment_anecortaveac) sum_anecortaveac
                , sum(treatment_annan) sum_annan
                FROM PatientTreatments

                   /* modify where clause to suit, e.g. data range */
                WHERE (treatmentdate >= '2013-01-01' AND treatmentdate < '2013-06-01')

                   /* nb excluding "no treatment" records here to avoid divide by zero later */
                AND (treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) > 0
                GROUP BY
                  patientid
             )
SELECT
*
FROM CTE1
WHERE sum_pdt       /count_treatments IN (0.0,1.0)
AND sum_laser       /count_treatments IN (0.0,1.0)
AND sum_lucentis    /count_treatments IN (0.0,1.0)
AND sum_macugen     /count_treatments IN (0.0,1.0)
AND sum_avastin     /count_treatments IN (0.0,1.0)
AND sum_triamcinolon/count_treatments IN (0.0,1.0)
AND sum_anecortaveac/count_treatments IN (0.0,1.0)
AND sum_annan       /count_treatments IN (0.0,1.0)
;

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217916
mmm,
line 6 is optional
line 5 could probably just be

, 1.0 * count(*) as count_treatment

(it survived from an earlier iteration and I overlooked changing it)
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 250 total points
ID: 39218119
>>The next step will be to look for patients that have only had a set of treatments.
this may seem like overkill, but perhaps the following may set you up for easier analysis of treatment sets (by "unpivoting" the data). Note if you do pursue this route it may pay to move the unpivoted data into a (temp?) table with appropriate indexes - here I've just used CTEs to illustrate
;WITH
CTE1 AS (
            SELECT
                PatientId
              , TreatmentDate
              , TreatedWith
              , isTreated
            FROM PatientTreatments
            CROSS apply (
                          VALUES
                           ('pdt'         ,Treatment_PDT)
                          ,('laser'       ,Treatment_Laser)
                          ,('lucentis'    ,Treatment_Lucentis)
                          ,('macugen'     ,Treatment_Macugen)
                          ,('avastin'     ,Treatment_Avastin)
                          ,('triamcinolon',Treatment_Triamcinolon)
                          ,('anecortaveac',Treatment_AnecortAveac)
                          ,('annan'       ,Treatment_Annan)
                        ) AS CA1(TreatedWith, isTreated)

                     /* modify where clause to suit, e.g. data range */
             WHERE (treatmentdate >= '2013-01-01' AND treatmentdate < '2013-06-01')
             )
, CTE2 AS (
            SELECT
                PatientId
              , TreatedWith
              , count(DISTINCT TreatmentDate) AS count_dates
              , sum(isTreated)                AS sum_treatments
             FROM CTE1
             WHERE istreated = 1
             GROUP BY
                PatientId
              , TreatedWith
          )
SELECT
  PatientId
, min(count_dates)    as count_dates
, sum(sum_treatments) as sum_treatments
FROM CTE2
GROUP BY
  PatientId
HAVING sum(count_dates) = sum(sum_treatments)

Open in new window

see this at: http://sqlfiddle.com/#!3/2bd98/1
0
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 250 total points
ID: 39220069
CORRECTION:

Sorry, quite right, I rushed my previous HAVING code too much.  Please try this:


SELECT
    PatientId,
    SUM(Treatment_PDT) AS Treatment_PDT,
    SUM(Treatment_Laser) AS Treatment_Laser,
    SUM(Treatment_Lucentis) AS Treatment_Lucentis,
    SUM(Treatment_Macugen) AS Treatment_Macugen,
    SUM(Treatment_Avastin) AS Treatment_Avastin,
    SUM(Treatment_Triamcinolon) AS Treatment_Triamcinolon,
    SUM(Treatment_AnecortAveac) AS Treatment_AnecortAveac,
    SUM(Treatment_Annan) AS Treatment_Annan
FROM dbo.PatientTreatments
WHERE
    --for the past ~12 months only
    TreatmentDate >= '20120604' AND
    TreatmentDate < '20130604'    
GROUP BY
    PatientId
HAVING
    (SUM(Treatment_PDT) = COUNT(*) OR SUM(Treatment_PDT) = 0) AND
    (SUM(Treatment_Laser) = COUNT(*) OR SUM(Treatment_Laser) = 0) AND
    (SUM(Treatment_Lucentis) = COUNT(*) OR SUM(Treatment_Lucentis) = 0) AND
    (SUM(Treatment_Macugen) = COUNT(*) OR SUM(Treatment_Macugen) = 0) AND
    (SUM(Treatment_Avastin) = COUNT(*) OR SUM(Treatment_Avastin) = 0) AND
    (SUM(Treatment_Triamcinolon) = COUNT(*) OR SUM(Treatment_Triamcinolon) = 0) AND
    (SUM(Treatment_AnecortAveac) = COUNT(*) OR SUM(Treatment_AnecortAveac) = 0) AND
    (SUM(Treatment_Annan) = COUNT(*) OR SUM(Treatment_Annan) = 0)
0
 

Author Closing Comment

by:soozh
ID: 39228917
I decided to go down the route of unpivoting the data.  

I have a followup question which i will ask in a new thread.  

Thanks for the fantastic help.
0

Featured Post

Guide to Performance: Optimization & Monitoring

Nowadays, monitoring is a mixture of tools, systems, and codes—making it a very complex process. And with this complexity, comes variables for failure. Get DZone’s new Guide to Performance to learn how to proactively find these variables and solve them before a disruption occurs.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

'Between' is such a common word we rarely think about it but in SQL it has a very specific definition we should be aware of. While most database vendors will have their own unique phrases to describe it (see references at end) the concept in common …
SQL Server engine let you use a Windows account or a SQL Server account to connect to a SQL Server instance. This can be configured immediatly during the SQL Server installation or after in the Server Authentication section in the Server properties …
Finds all prime numbers in a range requested and places them in a public primes() array. I've demostrated a template size of 30 (2 * 3 * 5) but larger templates can be built such 210  (2 * 3 * 5 * 7) or 2310  (2 * 3 * 5 * 7 * 11). The larger templa…
How to Install VMware Tools in Red Hat Enterprise Linux 6.4 (RHEL 6.4) Step-by-Step Tutorial

679 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question