Solved

Query syntax

Posted on 2013-06-03
17
194 Views
Last Modified: 2013-06-07
Hello,

I am looking for a few pointers rather than a complete solution.

I have a table (well indexed) that contains 100000+ rows that lists PatientId, TreatmentDate, and a number of columns that list the patients treatment.


CREATE TABLE [dbo].[PatientTreatments](
[id] [int] IDENTITY(1,1) NOT NULL,
[PatientId] [nvarchar](12) NOT NULL,
[TreatmentDate] [date] NULL,
[Treatment_None] [int] NULL,
[Treatment_PDT] [int] NULL,
[Treatment_Laser] [int] NULL,
[Treatment_Lucentis] [int] NULL,
[Treatment_Macugen] [int] NULL,
[Treatment_Avastin] [int] NULL,
[Treatment_Triamcinolon] [int] NULL,
[Treatment_AnecortAveac] [int] NULL,
[Treatment_Annan] [int] NULL,
[TreatmentType] [int] NULL
) ON [PRIMARY]

Open in new window


I need to count the number of treatments that patients have had over a certain period.

I guess that it is a simple group by TreatmentType however there is one other piece of logic that needs to be addressed.

There are 8 different treatment types each with its own column in the database.  I am only interested in patients that have had the same treatment type in all thier visits.

So if a patient has changed treatment type his/her treatments should be excluded from the totals.

How would the experts recommend doing this?

Maybe there is a simple statement but i was wondering if i should create a temporary table with a "total value" for treament.  For example the combination of the treatment types could be seen as a binary value.  If i create a table with that value then it would be easy to compare records against each other.  

Then maybe a not exists in sub query could look for patients who have changed treatment.
0
Comment
Question by:soozh
[X]
Welcome to Experts Exchange

Add your voice to the tech community where 5M+ people just like you are talking about what matters.

  • Help others & share knowledge
  • Earn cash & points
  • Learn & ask questions
  • 6
  • 5
  • 3
  • +2
17 Comments
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39215962
I think counts would satisfy this requirement. A count(distinct <<whatever>>) = 1 would mean no change has occurred within the period and think it should be ok in case any of those fields are null e.g.
select
  patientid
, count(treatmentdate) count_treatments
, count(distinct treatment_pdt) count_pdt
, count(distinct treatment_laser) count_laser
, count(distinct treatment_lucentis) count_lucentis
, count(distinct treatment_macugen) count_macugen
, count(distinct treatment_avastin) count_avastin
, count(distinct treatment_triamcinolon) count_triamcinolon
, count(distinct treatment_anecortaveac) count_anecortaveac
, count(distinct treatment_annan) count_annan
, count(distinct treatmenttype) count_type
from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
group by
  patientid
having
   count(distinct treatment_pdt) = 1
or count(distinct treatment_laser) = 1
or count(distinct treatment_lucentis) = 1
or count(distinct treatment_macugen) = 1
or count(distinct treatment_avastin) = 1
or count(distinct treatment_triamcinolon) = 1
or count(distinct treatment_anecortaveac) = 1
or count(distinct treatment_annan) = 1
;

select
*
from PatientTreatments as pt
inner join (
            select
              patientid
            , count(treatmentdate) tcount
            , count(distinct treatment_pdt) count_pdt
            , count(distinct treatment_laser) count_laser
            , count(distinct treatment_lucentis) count_lucentis
            , count(distinct treatment_macugen) count_macugen
            , count(distinct treatment_avastin) count_avastin
            , count(distinct treatment_triamcinolon) count_triamcinolon
            , count(distinct treatment_anecortaveac) count_anecortaveac
            , count(distinct treatment_annan) count_annan
            , count(distinct treatmenttype) count_type
            from PatientTreatments
            where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
            group by
              patientid
            ) as tcount
      on pt.patientid = tcount.patientid
where count_pdt=1

Open in new window

see: http://sqlfiddle.com/#!3/6da38/1

perhaps try this on some real data?
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39215987
can a patient receive multiple types of treatment during a single visit?

I'm not sure what the [TreatmentType] field is for, it appears that you have separate fields for each of the treatment types, so why do you need this field?

I see that those other treatment types columns are integers, what are the values associated with the fields?  If those values are 0/1, then I was thinking something that looks more like:

select
  patientid
, count(treatmentdate) count_treatments
, SUM(treatment_pdt) count_pdt
, SUM(treatment_laser) count_laser
, SUM(treatment_lucentis) count_lucentis
, SUM(treatment_macugen) count_macugen
, SUM(treatment_avastin) count_avastin
, SUM(treatment_triamcinolon) count_triamcinolon
, SUM(treatment_anecortaveac) count_anecortaveac
, SUM(treatment_annan) count_annan

from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
group by
  patientid
having
SUM(treatment_pdt) count_pdt = count(treatmentdate)
OR SUM(treatment_laser) = count(treatmentdate)
OR SUM(treatment_lucentis) = count(treatmentdate)
OR SUM(treatment_macugen) = count(treatmentdate)
OR SUM(treatment_avastin) = count(treatmentdate)
OR SUM(treatment_triamcinolon) = count(treatmentdate)
OR SUM(treatment_anecortaveac) = count(treatmentdate)
OR SUM(treatment_annan) = count(treatmentdate)
0
 

Author Comment

by:soozh
ID: 39216023
sorry "TreatmentType" should be "treatment type".  Its not a database field.

And yes each treatment has its own column.

There can be muliple treatments.

They columsn contain 1 or 0 to indicated treatment or no treatment.
0
Webinar: Aligning, Automating, Winning

Join Dan Russo, Senior Manager of Operations Intelligence, for an in-depth discussion on how Dealertrack, leading provider of integrated digital solutions for the automotive industry, transformed their DevOps processes to increase collaboration and move with greater velocity.

 
LVL 48

Expert Comment

by:PortletPaul
ID: 39216049
>>They columns contain 1 or 0 to indicated treatment or no treatment.
ah
I had assumed they were foreign keys...
hence distinct has no great significance, and as fyed indicates, changing those counts to sums would work instead
0
 
LVL 75

Expert Comment

by:Anthony Perkins
ID: 39216090
How would the experts recommend doing this?
An expert would probably not design the table where a treatment is hard coded into separate columns.  That is a recipe for disaster, as you have discovered.
0
 

Author Comment

by:soozh
ID: 39216243
a real expert would realise that "hindsight is an exact science..." and leave it at that....
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39216257
SELECT
    PatientId,
    SUM(Treatment_PDT) AS Treatment_PDT,
    SUM(Treatment_Laser) AS Treatment_Laser,
    SUM(Treatment_Lucentis) AS Treatment_Lucentis,
    SUM(Treatment_Macugen) AS Treatment_Macugen,
    SUM(Treatment_Avastin) AS Treatment_Avastin,
    SUM(Treatment_Triamcinolon) AS Treatment_Triamcinolon,
    SUM(Treatment_AnecortAveac) AS Treatment_AnecortAveac,
    SUM(Treatment_Annan) AS Treatment_Annan
FROM dbo.PatientTreatments
WHERE
    --for the past ~12 months only
    TreatmentDate >= '20120604' AND
    TreatmentDate < '20130604'    
GROUP BY
    PatientId
HAVING
    SUM(Treatment_PDT) = COUNT(*) OR
    SUM(Treatment_Laser) = COUNT(*) OR
    SUM(Treatment_Lucentis) = COUNT(*) OR
    SUM(Treatment_Macugen) = COUNT(*) OR
    SUM(Treatment_Avastin) = COUNT(*) OR
    SUM(Treatment_Triamcinolon) = COUNT(*) OR
    SUM(Treatment_AnecortAveac) = COUNT(*) OR
    SUM(Treatment_Annan) = COUNT(*)
ORDER BY
    PatientId
0
 

Author Comment

by:soozh
ID: 39217074
Thanks for the solutions.  However we are missing something.

The example above from ScottPletcher returns rows when a patient has multiple treatments at the same visit.

I have mentioned that a patient can have multiple treatments at any one visit, but also they can also have a visit without any treatment.  In this case they are still included in the result because they have not changed treatment.

What i am looking for is a count of the number of treatments a patient has had when they have not changed treatment.  The next step will be to look for patients that have only had a set of treatments.

Maybe this can not be done in a single statement?  Or maybe some dynamic sql?
0
 
LVL 47

Expert Comment

by:Dale Fye (Access MVP)
ID: 39217125
So, If I understand you correctly, if a patient has multiple treatments in the same visit, and one of those treatment types was given in each of the other treatments during a specified period, then that person should be returned in the result set.

But if they have a visit with no treatments, what is the point of the visit?  Try the following, I've added a criteria to the WHERE clause to exclude those records where no treatment was performed.  This would, however eliminate those patients who only had one visit during the desired period and where no treatment occurred.

select
  patientid
, count(treatmentdate) count_treatments
, SUM(treatment_pdt) count_pdt
, SUM(treatment_laser) count_laser
, SUM(treatment_lucentis) count_lucentis
, SUM(treatment_macugen) count_macugen
, SUM(treatment_avastin) count_avastin
, SUM(treatment_triamcinolon) count_triamcinolon
, SUM(treatment_anecortaveac) count_anecortaveac
, SUM(treatment_annan) count_annan

from PatientTreatments
where (treatmentdate >= '2013-01-01' and treatmentdate < '2013-06-01')
AND ([Treatment_Laser] + [Treatment_lucentis] + [Treatment_macugen] + [Treatment_avastin] + [Treatment_triamcinolon] + [treatment_anecortaveac] + [treatment_annan]) = 0
group by
  patientid
having
SUM(treatment_pdt) count_pdt = count(treatmentdate)
OR SUM(treatment_laser) = count(treatmentdate)
OR SUM(treatment_lucentis) = count(treatmentdate)
OR SUM(treatment_macugen) = count(treatmentdate)
OR SUM(treatment_avastin) = count(treatmentdate)
OR SUM(treatment_triamcinolon) = count(treatmentdate)
OR SUM(treatment_anecortaveac) = count(treatmentdate)
OR SUM(treatment_annan) = count(treatmentdate)
0
 

Author Comment

by:soozh
ID: 39217214
Thanks for the the response.

So, If I understand you correctly, if a patient has multiple treatments in the same visit, and one of those treatment types was given in each of the other treatments during a specified period, then that person should be returned in the result set.


No this is not correct.  I am looking for patients that have only had the same type of treatment in all the visits where they have been treated.  So if they have had PDT and only PDT then we should include them...but if the had some other treatment at any time (either on a visit with PDT, or on a visit with just the new treatment they should be excluded.

But if they have a visit with no treatments, what is the point of the visit?  


We dont know if we are going to give the patient any treatment until they have been examined at the visit.  If they are lucky we can tell them to go home.  However the doctor may decide to treat them - hence the visits without treatment.

We follow the patient even when they are not being treated - in this way we know the treatment has been and still is successfull.
0
 
LVL 69

Expert Comment

by:Scott Pletcher
ID: 39217361
I thought you might have meant that, but wasn't 100% sure from the q.

I think we can just change the ORs in the HAVING to ANDs (one reason I wrote the query specifically that way):


HAVING
    SUM(Treatment_PDT) = COUNT(*) AND
    SUM(Treatment_Laser) = COUNT(*) AND
    SUM(Treatment_Lucentis) = COUNT(*) AND
    SUM(Treatment_Macugen) = COUNT(*) AND
    SUM(Treatment_Avastin) = COUNT(*) AND
    SUM(Treatment_Triamcinolon) = COUNT(*) AND
    SUM(Treatment_AnecortAveac) = COUNT(*) AND
    SUM(Treatment_Annan) = COUNT(*)
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217824
all those AND conditions will only return a match if all treatments are provided in each visit

i.e. if the SUM(of any treatment) = 0  then that patient gets excluded
0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217910
in the following: use of a CTE is purely to simplify the required filtering (i.e. the same could be achieved inside a having clause but it's just ugly that way).

please see it in action here: http://sqlfiddle.com/#!3/a961f/1 
;WITH CTE1 AS (
                SELECT
                  patientid
                             /* 1.0 is to provide decimal results later */
                , 1.0 * count(CASE WHEN (treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) > 0 THEN 1 ELSE NULL END) count_treatments
                , sum(treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) sum_treatments
                , sum(treatment_pdt) sum_pdt
                , sum(treatment_laser) sum_laser
                , sum(treatment_lucentis) sum_lucentis
                , sum(treatment_macugen) sum_macugen
                , sum(treatment_avastin) sum_avastin
                , sum(treatment_triamcinolon) sum_triamcinolon
                , sum(treatment_anecortaveac) sum_anecortaveac
                , sum(treatment_annan) sum_annan
                FROM PatientTreatments

                   /* modify where clause to suit, e.g. data range */
                WHERE (treatmentdate >= '2013-01-01' AND treatmentdate < '2013-06-01')

                   /* nb excluding "no treatment" records here to avoid divide by zero later */
                AND (treatment_pdt + treatment_laser + treatment_lucentis + treatment_macugen + treatment_avastin + treatment_triamcinolon + treatment_anecortaveac + treatment_annan) > 0
                GROUP BY
                  patientid
             )
SELECT
*
FROM CTE1
WHERE sum_pdt       /count_treatments IN (0.0,1.0)
AND sum_laser       /count_treatments IN (0.0,1.0)
AND sum_lucentis    /count_treatments IN (0.0,1.0)
AND sum_macugen     /count_treatments IN (0.0,1.0)
AND sum_avastin     /count_treatments IN (0.0,1.0)
AND sum_triamcinolon/count_treatments IN (0.0,1.0)
AND sum_anecortaveac/count_treatments IN (0.0,1.0)
AND sum_annan       /count_treatments IN (0.0,1.0)
;

Open in new window

0
 
LVL 48

Expert Comment

by:PortletPaul
ID: 39217916
mmm,
line 6 is optional
line 5 could probably just be

, 1.0 * count(*) as count_treatment

(it survived from an earlier iteration and I overlooked changing it)
0
 
LVL 48

Accepted Solution

by:
PortletPaul earned 250 total points
ID: 39218119
>>The next step will be to look for patients that have only had a set of treatments.
this may seem like overkill, but perhaps the following may set you up for easier analysis of treatment sets (by "unpivoting" the data). Note if you do pursue this route it may pay to move the unpivoted data into a (temp?) table with appropriate indexes - here I've just used CTEs to illustrate
;WITH
CTE1 AS (
            SELECT
                PatientId
              , TreatmentDate
              , TreatedWith
              , isTreated
            FROM PatientTreatments
            CROSS apply (
                          VALUES
                           ('pdt'         ,Treatment_PDT)
                          ,('laser'       ,Treatment_Laser)
                          ,('lucentis'    ,Treatment_Lucentis)
                          ,('macugen'     ,Treatment_Macugen)
                          ,('avastin'     ,Treatment_Avastin)
                          ,('triamcinolon',Treatment_Triamcinolon)
                          ,('anecortaveac',Treatment_AnecortAveac)
                          ,('annan'       ,Treatment_Annan)
                        ) AS CA1(TreatedWith, isTreated)

                     /* modify where clause to suit, e.g. data range */
             WHERE (treatmentdate >= '2013-01-01' AND treatmentdate < '2013-06-01')
             )
, CTE2 AS (
            SELECT
                PatientId
              , TreatedWith
              , count(DISTINCT TreatmentDate) AS count_dates
              , sum(isTreated)                AS sum_treatments
             FROM CTE1
             WHERE istreated = 1
             GROUP BY
                PatientId
              , TreatedWith
          )
SELECT
  PatientId
, min(count_dates)    as count_dates
, sum(sum_treatments) as sum_treatments
FROM CTE2
GROUP BY
  PatientId
HAVING sum(count_dates) = sum(sum_treatments)

Open in new window

see this at: http://sqlfiddle.com/#!3/2bd98/1
0
 
LVL 69

Assisted Solution

by:Scott Pletcher
Scott Pletcher earned 250 total points
ID: 39220069
CORRECTION:

Sorry, quite right, I rushed my previous HAVING code too much.  Please try this:


SELECT
    PatientId,
    SUM(Treatment_PDT) AS Treatment_PDT,
    SUM(Treatment_Laser) AS Treatment_Laser,
    SUM(Treatment_Lucentis) AS Treatment_Lucentis,
    SUM(Treatment_Macugen) AS Treatment_Macugen,
    SUM(Treatment_Avastin) AS Treatment_Avastin,
    SUM(Treatment_Triamcinolon) AS Treatment_Triamcinolon,
    SUM(Treatment_AnecortAveac) AS Treatment_AnecortAveac,
    SUM(Treatment_Annan) AS Treatment_Annan
FROM dbo.PatientTreatments
WHERE
    --for the past ~12 months only
    TreatmentDate >= '20120604' AND
    TreatmentDate < '20130604'    
GROUP BY
    PatientId
HAVING
    (SUM(Treatment_PDT) = COUNT(*) OR SUM(Treatment_PDT) = 0) AND
    (SUM(Treatment_Laser) = COUNT(*) OR SUM(Treatment_Laser) = 0) AND
    (SUM(Treatment_Lucentis) = COUNT(*) OR SUM(Treatment_Lucentis) = 0) AND
    (SUM(Treatment_Macugen) = COUNT(*) OR SUM(Treatment_Macugen) = 0) AND
    (SUM(Treatment_Avastin) = COUNT(*) OR SUM(Treatment_Avastin) = 0) AND
    (SUM(Treatment_Triamcinolon) = COUNT(*) OR SUM(Treatment_Triamcinolon) = 0) AND
    (SUM(Treatment_AnecortAveac) = COUNT(*) OR SUM(Treatment_AnecortAveac) = 0) AND
    (SUM(Treatment_Annan) = COUNT(*) OR SUM(Treatment_Annan) = 0)
0
 

Author Closing Comment

by:soozh
ID: 39228917
I decided to go down the route of unpivoting the data.  

I have a followup question which i will ask in a new thread.  

Thanks for the fantastic help.
0

Featured Post

Visualize your virtual and backup environments

Create well-organized and polished visualizations of your virtual and backup environments when planning VMware vSphere, Microsoft Hyper-V or Veeam deployments. It helps you to gain better visibility and valuable business insights.

Question has a verified solution.

If you are experiencing a similar issue, please ask a related question

Suggested Solutions

In this article I will describe the Backup & Restore method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
In this article I will describe the Copy Database Wizard method as one possible migration process and I will add the extra tasks needed for an upgrade when and where is applied so it will cover all.
Attackers love to prey on accounts that have privileges. Reducing privileged accounts and protecting privileged accounts therefore is paramount. Users, groups, and service accounts need to be protected to help protect the entire Active Directory …

740 members asked questions and received personalized solutions in the past 7 days.

Join the community of 500,000 technology professionals and ask your questions.

Join & Ask a Question